From James.Anderson at mecom.mixx.de Wed Apr 1 01:55:26 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:20 2004
Subject: Namespaces in XML: 3.1 the example [2]
References:
Message-ID: <352182A3.3CBE5701@mecom.mixx.de>

greetings;
re 3.1 (the o/l bookstore example)

the discussion raises a number of questions

1. when a namespace-pi binds a namespace, is it intended that, should a schema
have been specified, a processor verify (immediately?, later?, when?) the
existence (the content?) of the specified schema?
is this a well-formedness or a validity issue?

2. if the schema is present, should the processor permit local additions to the
namespace, that is the introduction of names which are not present in the
external definition?
should the processor permit redefinition of existing names from the namespace?

if the answer to first is "no", then cross-references are no problem.
if the answer to the second is yes, then it would be possible to place hooks in
a dtd by selective entity placement, which entities the using document/dtd would
be free to (re)define.

(or rather, it's almost possible: there's a small problem, that the wd-standard
precludes qualified entity names. why?)

3. the element definition examples below shouldn't, in any event, appear in the
original schema(s). while it is ok (and necessary) to constrain the namespace in
definition tags in the internal subset, to do so in the original schema itself
would prevent subsequent users of the dtd from remapping the tags to suit their
needs. in general this is too restrictive.

Chris Smith wrote:
> On Mon, 30 Mar 1998, David Megginson wrote:
>
> > Chris Smith writes:
> >
> > > 80183589575795589189518915
> > >
> > > My question is simply: what is the definition for "Order" ?
> >
> > You would have to do something like this:
> >
>
> I must admit I had considered this, but had rejected it since it
> seemed to require that each DTD exist before the other DTD.
> In addition, it hardly seemed reasonable for something as general as
> dsig: to know - in the DTD - about all its uses. (This is why I
> thought that the use of ANY or #PCDATA might be a way to facilitate
> the experiments.)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From rtennant at library.berkeley.edu Wed Apr 1 02:12:38 1998
From: rtennant at library.berkeley.edu (Roy Tennant)
Date: Mon Jun 7 17:00:20 2004
Subject: Namespaces in XML: 3.1 the example [2]
In-Reply-To: <352182A3.3CBE5701@mecom.mixx.de>
Message-ID:

Please forgive my ignorance here, but I've been trying to figure out what
*exactly* is on the other end of this (for example):

That is, what is to be found at http://books.org/schema/? A data schema
marked up in XML-Data? A Web document? A cup of coffee and a cheese
sandwich? Please, inquiring minds want to know...thanks,
Roy Tennant

From amarshal at usc.edu Wed Apr 1 02:30:30 1998
From: amarshal at usc.edu (Andrew n marshall)
Date: Mon Jun 7 17:00:20 2004
Subject: Comments on Section 2.6 of XML-Namespaces
Message-ID: <01BD5CC3.3EC7EDF0.amarshal@usc.edu>

On Monday, March 30, 1998 10:06 PM, Rick Jelliffe [SMTP:ricko@allette.com.au] wrote:
> I think it is important to know that the namespace mechanism does not
> attempt to solve all problems with sticking names into schemas.
> All it does is bundle names with a common prefix into a bag with a name
> (the "ns" attribute) and then point to some other resource which
> may contain some interesting information about the schema.

I'm not expecting namespaces to solve all problems. The namespace
specification is an excellent way to prevent problems when I use >>the
elements<< from someone else's DTD. This promotes the reuse of existing
DTDs, which not only saves me time, but also means that I can take advantage
of the processors written for use on the external DTDs. That is great!!

However, I believe the namespace specification is creating more problems
than it is worth when it attempts to extend itself to individual attributes.

> Most elements and attributes belong to multiple schemas. This is
> both because no one schema language is good enough to define
> all the requirements that any single type has, and because of the
> symphonic, interrelated nature of the use of elements in documents.

I agree completely. That is why XML is a meta-language instead of a
specific language.

>
> So the namespace mechanism does not attempt to provide a general
> solution to all these problems.
> If you are interested in such a thing, then
> the HyTime architectural forms mechanism may be of interest to you.
> (See http://www.ornl.gov/sgml/wg8/docs/n1920/html/toc.html )
> It is a far more general solution to a fairly similar problem. The namespace
> mechanism as proposed is a minimal and modest thing, just enough to
> allow RDF and some other applications to progress, and to allow
> debate and exploration of the particular issues.

I'm sorry. I don't see any relevance between my statements and HyTime. Is
there a more specific pointer you could give me, or perhaps an example?

> As far as your particular example goes, there is "no guarantee from the DTD
> that they mean the same thing" because there is no mechanism built
> into raw XML DTDs to provide such a guarantee: in fact this is why
> namespaces are needed--to make it clear that an attribute in one
> element type is kin to another.

The element declaration is that guarantee. XML, with the inclusion of the
namespace specification at the element level, describes a way to trace each
element back to an element declaration from which I can compare whether or
not any two elements are related. By this, I am guaranteed that the
attributes of each element of the same element declaration have the same
possible attributes and are complete enough to be useful with its specific
application.

However, as soon as you allow elements to be broken up into their
individual attributes, this guarantee goes away. Attribute "hijacking"
makes it impossible to maintain the relationship between attributes of a
single element, and impossible to maintain the relationship between the
attributes and the child elements/content.

Therefore, to enable attributes to be reused, you need to group all the
attributes and the possible child elements together. This can be achieved
through a form of element inheritance.

> And in any case, in your particular case of hrefs, the XLink draft provides
> an attribute remapping feature.
> So an href element
>
> * is attached to its element type in the Instance (& DTD)
>
> * is bound to a namespace by its prefix and the namespace PI
>
> * may be remapped to a different name by the Xlink xml:attribute attribute

Actually, if you look at the details of the XLL spec (Part 7, second
paragraph), attribute remapping is limited to the XLL specific attributes.
This is a severe limitation and I believe it should be extended to apply
to any attribute.

> * may have additional schemas and semantics added using the
> Architectural Forms Definition Requirements AFDR mechanism
> (See http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.3.html )
> which uses fixed attributes on the DTD in particular. (Architecture =
> schema)

Thanks for the AFDR reference. I'll try to read up on it.

Andrew n marshall
student - artist - programmer
http://www.media-electronica.com/anm-bin/anm
"Everyone a mentor, Everyone a pupil"

From srn at techno.com Wed Apr 1 02:46:28 1998
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun 7 17:00:20 2004
Subject: Experimenting with Namespaces - DTDs?
In-Reply-To: <199803312127.QAA00299@unready.microstar.com> (message from David Megginson on Tue, 31 Mar 1998 16:27:16 -0500)
References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com>
Message-ID: <199803312343.SAA01404@bruno.techno.com>

David Megginson (ak117@freenet.carleton.ca) writes:

> Personally, I'd recommend architectural forms over namespaces if
> you're concerned with DTDs, since architectural forms have several
> major advantages:

David is right.
But I would go farther: XML Namespaces are a snare and a delusion.
With their use of colon syntax, they lull one into thinking that they
are about class inheritance. They are not. Instead, what the
namespace thing does is to collapse all the structure of the classes
of the inherited-from DTD into a salad of element types which is very
correctly termed a namespace rather than an architecture. All that
RDF was looking for was a way to guarantee global uniqueness of
element type names, and if we ever try to get anything more than that
from namespaces, we are on very thin ice indeed.

If the inherited-from DTD is already a tag salad, in which all the
element types are a big OR group in the content model of the document
element, namespaces can work quite well. If, however, an element type
has different meanings depending on its context (and most
architectures necessarily have this characteristic), then collapsing
such an architecture into a namespace can actively interfere with
information interchange.

I think RDF would benefit substantially, in terms of its
understandability, its implementability, and its flexibility, if it
were described in terms of inherited architectures. In fact, I think
it cries out for an architectural perspective, in which the
knowability and significance of element context is preserved. I
suspect that RDF's formal rigor would benefit, too, even though its
formal rigor is already formidable. (I'm basically impressed by RDF;
it's the product of much excellent thinking, I think. I just want
MORE!)

To be entirely fair and truthful, I must personally accept a share of
the blame for this namespace mess; I was present at the first Dublin
Core meeting, and, awed by the momentousness of the occasion, I
evidently failed to make the case for using architectures for
metadata. My later contributions to the W3C XML discussions about
namespaces were evidently not persuasive, either.
In my own defense, I would argue that this is entirely understandable;
it's a subtle issue; nobody has much experience with metadata
architectures; what experience there is is dominated by methodologies
like MARC that rely on lists of uniquely named fields; and, most of
all, the need for even a partial solution to the metadata problem is
phenomenally intense.

Anyway, all is not lost. This namespace thing is a mistake that will
necessarily be corrected, simply in order to support the needs of
civilization in an XML-dominated world. The way toward a solution is
already paved by an ISO standard (ISO/IEC 10744:1997 Annex A.3) that
is being adjusted to accommodate the syntactic limitations of XML
(i.e., its lack of #NOTATION attributes). It is implemented in the SP
parser and in other software systems, and it is already being used in
many industrial contexts. It's the right sort of answer, it's not
going away, and its usage is accelerating rapidly; there was a
manyfold increase in the number of papers reporting its use at
SGML/XML 97.

And, anyway, the need for metadata interchange far outstrips RDF's
present scope. I hope and believe that many powerful metadata
architectures -- including elegant ones that can't be squashed flat
and remain useful -- will be multiply inheritable. That way, there
can be a marketplace of architectural ideas for metadata in which the
full power of context can be exploited. I'd like to see RDF evolve in
that direction.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com http://www.techno.com ftp.techno.com
voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152)
3615 Tanner Lane
Richardson, Texas 75082-2618 USA

From amarshal at usc.edu Wed Apr 1 03:04:40 1998
From: amarshal at usc.edu (Andrew n marshall)
Date: Mon Jun 7 17:00:20 2004
Subject: Experimenting with Namespaces - DTDs?
Message-ID: <01BD5CC8.0CAAD9E0.amarshal@usc.edu>

Can someone please recommend a good introduction to SGML Architectures?

Andrew n marshall
student - artist - programmer
http://www.media-electronica.com/anm-bin/anm
"Everyone a mentor, Everyone a pupil"

From andrewl at microsoft.com Wed Apr 1 04:53:59 1998
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:00:20 2004
Subject: Namespaces in XML: 3.1 the example [2]
Message-ID: <5BF896CAFE8DD111812400805F1991F701C90F01@red-msg-08.dns.microsoft.com>

There is not necessarily anything at the "other end" of
"http://books.org/schema/". This is simply a unique name, a URI, which
serves to distinguish any names associated with it from all other names on
the web. For example, to distinguish, in effect "title, as defined by
books.org" from "title, as defined by landuse.org."

If there is a document defining the names, that would be referenced by the
'src' attribute, as in . This document might be human-readable text, it
might be a DTD, it might be an XML-Data schema, or it might be something
else besides.

A cheese sandwich, though? Probably not.
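
[Illustration of the point above, not part of the original message. It is a
minimal sketch, assuming the xmlns attribute syntax of the later Namespaces in
XML recommendation rather than the namespace PI of the working draft under
discussion. The landuse.org URI, the sample titles, and the helper name
qualified_titles are hypothetical. The parser never fetches either URI; it
only uses each one as a unique string that keeps the two "title" names apart.]

```python
import xml.etree.ElementTree as ET

BOOKS = "http://books.org/schema/"      # URI quoted in the thread
LAND = "http://landuse.org/schema/"     # hypothetical URI, for illustration only

# Two elements share the local name "title" but live in different namespaces.
DOC = f"""<record xmlns:b="{BOOKS}" xmlns:l="{LAND}">
  <b:title>Moby Dick</b:title>
  <l:title>Deed of sale, lot 12</l:title>
</record>"""


def qualified_titles(xml_text):
    """Map namespace URI -> text of each 'title' child of the root element."""
    root = ET.fromstring(xml_text)  # no network access: URIs are just names
    found = {}
    for child in root:
        # ElementTree spells a qualified name as "{uri}local".
        uri, local = child.tag[1:].split("}")
        if local == "title":
            found[uri] = child.text
    return found


titles = qualified_titles(DOC)
for uri, text in titles.items():
    print(uri, "->", text)
```

[Nothing needs to exist at either address for the two names to stay distinct,
which is the whole of what the URI contributes here.]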
> -----Original Message-----
> From: Roy Tennant [SMTP:rtennant@library.berkeley.edu]
> Sent: Tuesday, March 31, 1998 4:13 PM
> To: xml-dev@ic.ac.uk
> Subject: Re: Namespaces in XML: 3.1 the example [2]
>
> Please forgive my ignorance here, but I've been trying to figure out what
> *exactly* is on the other end of this (for example):
>
> That is, what is to be found at http://books.org/schema/? A data schema
> marked up in XML-Data? A Web document? A cup of coffee and a cheese
> sandwich? Please, inquiring minds want to know...thanks,
> Roy Tennant

From andrewl at microsoft.com Wed Apr 1 05:03:00 1998
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:00:20 2004
Subject: Namespaces in XML: 3.1 the example [2]
Message-ID: <5BF896CAFE8DD111812400805F1991F701C90F02@red-msg-08.dns.microsoft.com>

The namespaces design does not specify any particular notation in which a
schema would be written. So it certainly does not say that an application
must read the schema (if one exists). It takes no stand on what is in a
schema, including whether schemas can reference other schemas.
This is just the basic material needed to make names unique web-wide, but a
lot of work and thinking still needs to be done regarding defining good
schema notations, APIs for their use, etc.

However, regarding whether an application can "redefine" existing names from
a namespace, the answer to that must be "no." The owner of a namespace
defines the names in it. These can be processed any way that an application
likes, including ignoring the definition, but that is not the same as
redefinition. Certainly an application can also map from one named thing to
another, as for example architectures allow, but that is mapping, not
redefinition.

> -----Original Message-----
> From: james anderson [SMTP:James.Anderson@mecom.mixx.de]
> Sent: Tuesday, March 31, 1998 3:56 PM
> To: xml-dev@ic.ac.uk
> Cc: tbray@textuality.com; dmh@corp.hp.com; Andrew Layman
> Subject: Namespaces in XML: 3.1 the example [2]
>
> greetings;
> re 3.1 (the o/l bookstore example)
>
> the discussion raises a number of questions
>
> 1. when a namespace-pi binds a namespace, is it intended that, should a
> schema
> have been specified, a processor verify (immediately?, later?, when?) the
> existence (the content?) of the specified schema?
> is this a well-formedness or a validity issue?
>
> 2. if the schema is present, should the processor permit local additions
> to the
> namespace, that is the introduction of names which are not present in the
> external definition?
> should the processor permit redefinition of existing names from the
> namespace?
>
> if the answer to first is "no", then cross-references are no problem.
> if the answer to the second is yes, then it would be possible to place
> hooks in
> a dtd by selective entity placement, which entities the using document/dtd
> would
> be free to (re)define.
>
> (or rather, it's almost possible: there's a small problem, that the
> wd-standard
> precludes qualified entity names. why?)

From andrewl at microsoft.com Wed Apr 1 05:25:42 1998
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:00:20 2004
Subject: Comments on Section 2.6 of XML-Namespaces
Message-ID: <5BF896CAFE8DD111812400805F1991F701C90F05@red-msg-08.dns.microsoft.com>

Andrew Marshall wrote: "Even in your attempt to rectify this situation with
the syntax used in your last example:

You still provide no guarantee that there is a meaning for the attribute
'Temp' without possible sibling attributes. Take for example:

Does the use of href have any meaning without the 'target' attribute which
may implicitly be defined with the default value of '_self'? Probably
not."

Thanks for your question, and for reading the document closely.

I think you are expecting namespaces to do more than they in fact do.
Namespaces simply allow you to distinguish the T.Heat:Temp attribute from
the T.Color.Temp attribute. They do not take on the job of expressing
grammatical rules such as that the T.Heat.Temp attribute must only be used
in conjunction with another attribute, e.g. T.Heat.Units. Namespaces are
just a named set of names. Keeping the names distinct is the goal of
namespaces. Defining the grammar, semantics, etc. of the named things is
beyond the pale.

Best wishes,
Andrew Layman
Microsoft

From James.Anderson at mecom.mixx.de Wed Apr 1 06:01:04 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:20 2004
Subject: Comments on Section 2.6 of XML-Namespaces
References: <01BD5CC3.3EC7EDF0.amarshal@usc.edu>
Message-ID: <3521BC3C.38ABFCDF@mecom.mixx.de>

the issue of 'hijacking' is orthogonal to that of naming.

in order for the example to have any meaning, one would need to provide an
attribute declaration for the Item element, in which definition the desired
relation is (re)established. although i am at a loss as to why one would wish
to perform this kind of 'deconstructive' element definition

(i.e. i don't know why the wd example matters? doesn't there have to be an
attribute list definition for Item, which resolves the ambiguity? remember,
we're not dealing with architecture here, just names. )

if one is concerned about element / attribute integrity, then one doesn't
'hijack' attributes. if, on the other hand, one needs to refer to attributes
across namespaces, then either qualification or mapping is necessary. the
namespace wd proposes qualification:

a) for eg., to augment attribute definitions (through definition in the
internal subset) for their use in their already defined element context one
will need unambiguous names.
one could say that the namespace explicit in the element name of an attlist
has an implicit scope over the entire attlist declaration, but that's not
necessary

b) to specify matching criteria for a language such as xsl, one needs to
specify the original namespace for attributes

Andrew n marshall wrote:

> > As far as your particular example goes, there is "no guarantee from the
> DTD
> > that they mean the same thing" because there is no mechanism built
> > into raw XML DTDs to provide such a guarantee: in fact this is why
> > namespaces are needed--to make it clear that an attribute in one
> > element type is kin to another.
>
> The element declaration is that guarantee. XML, with the inclusion of the
> namespace specification at the element level, describes a way to trace each
> element back to an element declaration from which I can compare whether or
> not any two elements are related. By this, I am guaranteed that the
> attributes of each element of the same element declaration have the same
> possible attributes and are complete enough to be useful with its specific
> application.
>
> However, as soon as you allow elements to be broken up into their
> individual attributes, this guarantee goes away. Attribute "hijacking"
> makes it impossible to maintain the relationship between attributes of a
> single element, and impossible to maintain the relationship between the
> attributes and the child elements/content.
>
> Therefore, to enable attributes to be reused, you need to group all the
> attributes and the possible child elements together. This can be achieved
> through a form of element inheritance.
>
> > And in any case, in your particular case of hrefs, the XLink draft
> provides
> > an attribute remapping feature.
> > So an href element
> >
> > * is attached to its element type in the Instance (& DTD)
> >
> > * is bound to a namespace by its prefix and the namespace PI
> >
> > * may be remapped to a different name by the Xlink xml:attribute
> attribute
>
> Actually, if you look at the details of the XLL spec (Part 7, second
> paragraph), attribute remapping is limited to the XLL specific attributes.
> This is a severe limitation and I believe it should be extended to apply
> to any attribute.
>
> > * may have additional schemas and semantics added using the
> > Architectural Forms Definition Requirements AFDR mechanism
> > (See http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.3.html )
> > which uses fixed attributes on the DTD in particular. (Architecture =
> > schema)
>
> Thanks for the AFDR reference. I'll try to read up on it.
>
> Andrew n marshall
> student - artist - programmer
> http://www.media-electronica.com/anm-bin/anm
> "Everyone a mentor, Everyone a pupil"

From James.Anderson at mecom.mixx.de Wed Apr 1 06:02:52 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:20 2004
Subject: why do namespaces have such a bad rep (was Re: Experimenting with Namespaces - DTDs?
References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com> <199803312343.SAA01404@bruno.techno.com>
Message-ID: <3521BC55.6014F2E7@mecom.mixx.de>

why do namespaces have such a bad reputation? in particular, why are they
discredited for having been conflated with something which they are not, and
which they do not claim to be?

Steven R. Newcomb wrote:

> David Megginson (ak117@freenet.carleton.ca) writes:
>
> > Personally, I'd recommend architectural forms over namespaces if
> > you're concerned with DTDs, since architectural forms have several
> > major advantages:
>
> David is right.
>
> But I would go farther: XML Namespaces are a snare and a delusion.
> With their use of colon syntax, they lull one into thinking that they
> are about class inheritance.

why? what does colon syntax have to do with class inheritance?

> They are not.

i agree.

> Instead, what the
> namespace thing does is to collapse all the structure of the classes
> of the inherited-from DTD into a salad of element types which is very
> correctly termed a namespace rather than an architecture. All that

the namespace 'thing' maps the names from the "inherited from (sic) DTD" into a
unique region of a two dimensional namespace. it says nothing about the
structure.

> RDF was looking for was a way to guarantee global uniqueness of
> element type names, and if we ever try to get anything more than that
> from namespaces, we are on very thin ice indeed.
>

agreed, but it doesn't claim to.

> If the inherited-from DTD is already a tag salad, in which all the
> element types are a big OR group in the content model of the document
> element, namespaces can work quite well. If, however, an element type
> has different meanings depending on its context (and most
> architectures necessarily have this characteristic), then collapsing
> such an architecture into a namespace can actively interfere with
> information interchange.
the wd doesn't propose to collapse the architecture, it proposes to map the
names.

>
> I think RDF would benefit substantially, in terms of its
> understandability, its implementability, and its flexibility, if it
> were described in terms of inherited architectures. In fact, I think
> it cries out for an architectural perspective, in which the
> knowability and significance of element context is preserved.

which all may be true, but doesn't say anything about what namespaces do.

> ...
> Anyway, all is not lost. This namespace thing is a mistake that will
> necessarily be corrected, simply in order to support the needs of
> civilization in an XML-dominated world. The way toward a solution is
> already paved by an ISO standard (ISO/IEC 10744:1997 Annex A.3) that
> is being adjusted to accommodate the syntactic limitations of XML
> (i.e., its lack of #NOTATION attributes). It is implemented in the SP
> parser and in other software systems, and it is already being used in
> many industrial contexts. It's the right sort of answer, it's not
> going away, and its usage is accelerating rapidly; there was a
> manyfold increase in the number of papers reporting its use at
> SGML/XML 97.

it also has equivalent mechanisms to manage the same problem within a
one-dimensional namespace. (i.e. the problem doesn't go away) some may find the
ability to rename an advantage, as it allows one to alter the intended
semantics. i wonder whether it as often leads to confusion. where the issue is
really name-uniqueness, namespaces are a much more compact expression. there's
no reason they couldn't be integrated into sgml architectures - but for the
deconstructivist aims, they'd accomplish the same thing as the renaming
attribute...

why wouldn't people just take them for what they are - orthogonal to the issue
of structure, and use them for what they can do?

bye for now,
james anderson

From amarshal at usc.edu Wed Apr 1 10:40:35 1998
From: amarshal at usc.edu (Andrew n marshall)
Date: Mon Jun 7 17:00:21 2004
Subject: XML Spec error?
Message-ID: <01BD5D07.B9579180.amarshal@usc.edu>

In section 3.3:
"For interoperability, writers of DTDs may choose to provide at most one
attribute-list declaration for a given element type, at most one attribute
definition for a given attribute name, and at least one attribute
definition in each attribute-list declaration."

Should this read:
"...at most one attribute definition for a given attribute name in an
attribute-list declaration,..."

As it is currently, it seems to imply that no two elements in a DTD can
have attributes of the same name, which is obviously wrong. For example:

span class NMTOKENS #IMPLIED >

should be legal.

Andrew n marshall
student - artist - programmer
http://www.media-electronica.com/anm-bin/anm
"Everyone a mentor, Everyone a pupil"

From amarshal at usc.edu Wed Apr 1 10:44:02 1998
From: amarshal at usc.edu (Andrew n marshall)
Date: Mon Jun 7 17:00:21 2004
Subject: Comments on Section 2.6 of XML-Namespaces
Message-ID: <01BD5D07.BC1E2B90.amarshal@usc.edu>

On Tuesday, March 31, 1998 7:26 PM, Andrew Layman [SMTP:andrewl@microsoft.com] wrote:
> Thanks for your question, and for reading the document closely.
>
> I think you are expecting namespaces to do more than it in fact does.
> Namespaces simply allows you to distinguish the T.Heat:Temp attribute from
> the T.Color.Temp attribute. It does not take on the job of expressing
> grammatical rules such as that the T.Heat.Temp attribute must only be used
> in conjunction with another attribute, e.g. T.Heat.Units. Namespaces are
> just a named set of names. Keeping the names distinct is the goal of
> namespaces. Defining the grammar, semantics, etc. of the named things is
> beyond the pale.

But aren't attribute names already unique? From the XML spec [3.3], a DTD
may contain "at most one attribute definition for a given attribute name
[in an attribute-list declaration]". As long as the element names are
unique, which it seems you have done successfully, then there is no need for
your syntax. I believe the syntax you propose introduces more problems
than it is worth for the previously given reasons, even if it does not
intend to describe semantics.

I will make one exception, which it seems you are already aware of:
processor specific information that may apply to every element and
therefore does not specify a particular element as the source of the
attribute.
A good example of this usage can be found in the new XLink notation 'xml:link'. Another possible use may be in XML architectures, as in 'Arc:URL' (I realize this isn't the currently recommended form, but I think it makes sense).

I guess my complaint boils down to allowing QNames of the form 'namespace:ELEMENT.attribute' as the name of an Attribute. While I don't see any reason for this notation at all, it should at least include a validation constraint to check that the STag uses the QName 'namespace.ELEMENT'.

Andrew n marshall
student - artist - programmer
http://www.media-electronica.com/anm-bin/anm
"Everyone a mentor, Everyone a pupil"

From peter at ursus.demon.co.uk Wed Apr 1 11:40:48 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:21 2004
Subject: Y2K and XML-DEV
Message-ID: <3.0.1.16.19980401092217.390fe3a0@pop3.demon.co.uk>

Residents of the UK will know that the Prime Minister last week announced the recruitment of 20,000 people to solve the 'millennium bug' crisis in the UK. As I am sure all of you know this bug involves software which represents the year by 2 digits (e.g. '97' for 1997). Unfortunately this is a permissible form in international standards and is therefore encapsulated in much software.

Because of the confusion between 'bug' and 'virus', the UK government wishes all prominent WWW sites to be Y2K-compliant (presumably they think the 'bug' can be spread over the WWW?) and this includes XML-DEV.
As XMLers will obviously know, identification of dates in pre-XML documents is almost impossible, but random checks will be made by robots from uk.gov on WWW sites. They are particularly concerned with mailing lists.

We have therefore been asked to make XML-DEV Y2K compliant. The primary concern - mail headers - will be dealt with centrally, but it *does* affect those of you who include dates in your messages. The most common problem is when quoting someone else's mail - this often gives time and date. Unfortunately a few mailers use only 2 digits, so:

PLEASE MAKE SURE THAT ALL DATES IN YOUR POSTINGS ARE EDITED TO HAVE DATES OF THE FORM 1998 (changing to 2000 when appropriate).

Since the robots are unlikely to have sophisticated heuristics (they probably use a regex of the form [^9]98) it would be helpful if you could also avoid using the digits '98' in your text. [This is unlikely, but could occur in examples.]

P.

1998-04-01

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Wed Apr 1 11:43:03 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:21 2004
Subject: Proposal for src files
In-Reply-To: <3521BC3C.38ABFCDF@mecom.mixx.de>
References: <01BD5CC3.3EC7EDF0.amarshal@usc.edu>
Message-ID: <3.0.1.16.19980401092254.47ff13ae@pop3.demon.co.uk>

I am very pleased to see the namespace draft being discussed here because I think it has important bearing on implementation.
I have been privileged to be on the XML-SIG and - without revealing confidentiality - there was a lot of closely argued discussion and this is clearly a tough problem [?tougher than some people initially expected?].

The clarifications given here have been spot-on - the provision is primarily syntactic, providing the identification of components of a name, and the namespace(s) referred to. The distinction between identification of the namespace (ns) and the details of it (src) is very welcome.

Specific comment: The following example has been used in the namespaces draft: and suggests some special role for the '.' in the attribute name. There appears to be a widespread practice among SGML authors of using '.' as a means of indicating structure in names. There is no syntactic significance for the '.' in XML and I very much hope that there is no 'implied semantics' given to it. I believe the only reason for it is human readability, and it could be seen as misleading in the current discussion. As far as namespaces are concerned, processing software should treat T.Temp: and Plugh: on equal terms.

Clearly anyone wishing to implement namespaces *now* has a problem in that there are no agreed semantics. Since namespaces are (in large part) about interoperability between different document designers (who are probably not in communication) it seems likely that uncoordinated approaches could give rise to confusion. My concern - and I'm sure it's shared by many - is that we get embroiled in 'namespace soup'.

What the proposal *does* give us is the ability to find out *who* is responsible for a given tag. In principle, therefore, we can refer to that tagger's description of its semantics. The problem is the lack of standards for semantics (e.g. we don't know what is in the src file).

It is clear that some people (e.g. RDF) will develop a de facto approach to the semantics of namespaces.
I suspect that in the first instance most of these will lead to documents which cannot be validated against a DTD under XML 1.0 (excepting the ANY content spec). This is my own position - for reasons mentioned on XML-L, CML documents are not constructed to be validatable.

Although having experimented with namespaces for some time and found them extremely useful, I don't have any simple answers to the problem. If there are any steps that we can take on XML-DEV to solve *part* of the problem, that could be very useful. Are there any operations which are common to all namespace processing? Is it useful to codify the de facto solutions that people provide so that - at least - we could identify the approach that an author uses? For those of us developing 'src' files is there a useful way of identifying the contents? possibly even standardising them?

My own contribution would be to try the following (not to the exclusion of others):

There seem to be a significant number of people who expect an src document to be in XML. I belong to them. There are a lot of people who wish the schema to represent the validatory power of an XML DTD. Many wish it *to* be a DTD.

Is it possible to combine these two so that we express a DTD in a standard XML notation? Many of us do this already, but I suspect that our tagset and syntax vary. If we could agree on this - and I don't see this as technically difficult - we could help both communities. Those who wished more power than a DTD can provide can extend it with further elementTypes (a la XML-data). Those who wish to use the DTD elementTypes alone can filter them out very easily and - if required - transform them to DTD syntax.

Clearly this is only one of many things that could reside in an src file, so some means of identifying it would be required. We should, of course, create our own namespace for the elements - mapped to xml.org :-)

Given the achievement of creating SAX on this list, we can certainly solve this one if we wish to.

P.
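The filter-and-transform step proposed above can be pictured with a minimal Python sketch. It assumes a purely hypothetical tagset (`schema`, `elementType`, `attribute` and their `name`, `content`, `type`, `default` attributes); no such vocabulary has been agreed on this list. Child elements the transform does not recognise, such as the invented `range` extension below, are simply skipped, which is the "filter them out" step.

```python
import xml.etree.ElementTree as ET

# Hypothetical src file: a DTD expressed in an invented XML tagset, with one
# extension element (<range>) that plain DTD syntax cannot express.
SRC = """
<schema>
  <elementType name="book" content="(title, author+)">
    <attribute name="isbn" type="CDATA" default="#REQUIRED"/>
    <range min="1" max="5"/>
  </elementType>
  <elementType name="title" content="(#PCDATA)"/>
  <elementType name="author" content="(#PCDATA)"/>
</schema>
"""


def to_dtd(src_xml):
    """Turn the hypothetical XML notation into DTD declarations."""
    root = ET.fromstring(src_xml.strip())
    lines = []
    for element_type in root.findall("elementType"):
        name = element_type.get("name")
        content = element_type.get("content", "ANY")
        lines.append(f"<!ELEMENT {name} {content}>")
        # Only <attribute> children are read; extensions are ignored.
        attrs = [
            f'{a.get("name")} {a.get("type", "CDATA")} {a.get("default", "#IMPLIED")}'
            for a in element_type.findall("attribute")
        ]
        if attrs:
            lines.append(f"<!ATTLIST {name} " + " ".join(attrs) + ">")
    return "\n".join(lines)


print(to_dtd(SRC))
```

Because the content model is carried as an opaque string, this is close to a direct transliteration of DTD syntax; a richer notation would need its own agreed rules.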
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From ak117 at freenet.carleton.ca Wed Apr 1 14:13:25 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:21 2004
Subject: Namespaces in XML: 3.1 the example [2]
In-Reply-To: <352182A3.3CBE5701@mecom.mixx.de>
References: <352182A3.3CBE5701@mecom.mixx.de>
Message-ID: <199804011212.HAA00271@unready.microstar.com>

james anderson writes:

> 1. when a namespace-pi binds a namespace, is it intended that,
> should a schema have been specified, a processor verify
> (immediately?, later?, when?) the existence (the content?) of the
> specified schema? is this a well-formedness or a validity issue?

This is currently undefined -- think of the current namespace WD as a hook for future enhancements rather than a self-contained spec. Namespaces can be neither a well-formedness nor a validity issue with XML 1.0, though the WG might decide to change that in a future version of XML.

Personally, I'd prefer to keep namespace processing as a completely separate layer (like TCP on top of IP or HTTP on top of TCP), so that nothing in namespaces will ever affect the basic validity or well-formedness of XML documents. Imagine if every change to HTTP required a change to IP as well!

> 2. if the schema is present, should the processor permit local
> additions to the namespace, that is the introduction of names which
> are not present in the external definition?
> should the processor permit redefinition of existing names from the namespace?

This would go against the basic principle of namespaces (globalisation and uniquification of names), since two documents could create different extensions to the same namespace. Of course, there's no standard mechanism to verify this right now.

I'm not certain that I understand the issue here -- why would someone not bring additional element types in from a different namespace, instead of adding private extensions to an existing one?

> (or rather, it's almost possible: there's a small problem, that the
> wd-standard precludes qualified entity names. why?)

The namespace spec allows element type names, attribute names, and PI targets to be associated with a URI. (External) entity and notation names are already associated with a URI.

> 3. the element definition examples below shouldn't, in any event,
> appear in the original schema(s). while it is ok (and necessary)
> to constrain the namespace in definition tags in the internal
> subset, to do so in the original schema itself would prevent
> subsequent users of the dtd from remapping the tags to suit their
> needs. in general this is too restrictive.

Absolutely correct. Currently, XML documents have only one schema language -- DTDs -- and each XML document may have only one of those. Nothing in the namespaces WD (or anything else) is allowed to override XML 1.0, which is already a W3C recommendation.

However, if DTDs are used as namespace schemas in the future, then I would assume that the prefixes will _not_ be hardcoded in those schemas.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
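Namespace processing as a separate layer, with names associated with a URI, can be pictured as a function applied to names after ordinary parsing. This is a minimal Python sketch under stated assumptions: the `bindings` dictionary stands in for whatever the namespace declarations of a document provide, `universal_name` is an illustrative name, and the landuse.org URI is invented for the example. The URIs are compared as plain strings and never fetched.

```python
def universal_name(qname, bindings):
    """Map 'prefix:local' to a (namespace URI, local name) pair."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        return (None, qname)          # unqualified name: no namespace
    return (bindings[prefix], local)  # KeyError if the prefix is unbound


# Two documents bind different prefixes to the same namespace URI.
doc_a = {"B": "http://books.org/schema/"}
doc_b = {"bk": "http://books.org/schema/", "L": "http://landuse.org/schema/"}

# Same URI, different prefixes: the names are the same name.
print(universal_name("B:title", doc_a) == universal_name("bk:title", doc_b))   # True

# Same local name, different URIs: the names stay distinct.
print(universal_name("bk:title", doc_b) == universal_name("L:title", doc_b))   # False
```

Since the prefix never appears in the resulting pair, a schema that avoided hardcoding prefixes would lose nothing under this model.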
From ak117 at freenet.carleton.ca Wed Apr 1 14:22:06 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:21 2004
Subject: Experimenting with Namespaces - DTDs?
In-Reply-To: <01BD5CC8.0CAAD9E0.amarshal@usc.edu>
References: <01BD5CC8.0CAAD9E0.amarshal@usc.edu>
Message-ID: <199804011221.HAA00276@unready.microstar.com>

Andrew n marshall writes:

> Can someone please recommend a good introduction to SGML Architectures?

Eliot Kimber has some very useful tutorial information up on his web site -- to locate this kind of thing, you should always start with Robin Cover's SGML/XML Web Page, which will point you to anything out there:

http://www.sil.org/sgml/

If you don't mind paying US$39.95, or if you prefer books in general, you can grab a copy of my new book, STRUCTURING XML DOCUMENTS (Prentice-Hall 1998), which devotes all of part 4 to architectural forms:

http://www.prenhall.com/ptrbooks/ptr_0136422993.html

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Wed Apr 1 14:38:04 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:21 2004
Subject: Experimenting with Namespaces - DTDs?
In-Reply-To: <199803312343.SAA01404@bruno.techno.com>
References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com> <199803312343.SAA01404@bruno.techno.com>
Message-ID: <199804011237.HAA00298@unready.microstar.com>

Steven R. Newcomb writes:

> But I would go farther: XML Namespaces are a snare and a delusion.
> With their use of colon syntax, they lull one into thinking that
> they are about class inheritance. They are not.

Right: what they _are_ is a way of uniquifying and globalising names, and I think that they will do an acceptable job of that. The problem comes if people try to do anything else with them (such as specifying inheritance, embedding documents in other documents, etc.).

> I think RDF would benefit substantially, in terms of its
> understandability, its implementability, and its flexibility, if it
> were described in terms of inherited architectures. In fact, I
> think it cries out for an architectural perspective, in which the
> knowability and significance of element context is preserved. I
> suspect that RDF's formal rigor would benefit, too, even though its
> formal rigor is already formidable. (I'm basically impressed by
> RDF; it's the product of much excellent thinking, I think. I just
> want MORE!)

Agreed: I would like to be able to include meta-data without being forced to use the element type names that RDF chose. XML-Link got this one right.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From ricko at allette.com.au Wed Apr 1 14:50:15 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:21 2004
Subject: Comments on Section 2.6 of XML-Namespaces
Message-ID: <005901bd5d6c$e8f651f0$7a0b4ccb@NT.JELLIFFE.COM.AU>

-----Original Message-----
From: Andrew n marshall

>> As far as your particular example goes, there is "no guarantee from the DTD
>> that they mean the same thing" because there are no mechanisms built
>> into raw XML DTDs to provide such a guarantee: in fact this is why
>> namespaces are needed--to make it clear that an attribute in one
>> element type is kin to another.

> The element declaration is that guarantee. XML, with the inclusion of the
> namespace specification at the element level, describes a way to trace each
> element back to an element declaration from which I can compare whether or
> not any two elements are related. By this, I am guaranteed that the
> attributes of each element of the same element declaration have the same
> possible attributes and are complete enough to be useful with its specific
> application.

Namespace declarations apply to multiple element types. You probably could get the same result using HyTime architectural forms definitions and sticking #FIXED attributes on each individual element type declaration in the markup declaration, it is true. But namespaces need to work in documents without XML markup declarations, and it needs to be terse enough so that it can be set once. (I certainly agree that there are much more interesting things lurking under the surface of the namespace issue, but no-one seems to dispute that.)
So element type declarations cannot be a guarantee of anything.

> However, as soon as you allow elements to be broken up into their
> individual attributes, this guarantee goes away. Attribute "hijacking"
> makes it impossible to maintain the relationship between attributes of a
> single element, and impossible to maintain the relationship between the
> attributes and the child elements/content.

I don't think I agree with your ideas of "hijacking". An attribute is whatever the designer has said it is, for better or worse. E.g., if a document type designer says that all element types will have an attribute which gives the line number of the element type declaration in the original document, then that attribute has nothing to do with the element type itself, and everything to do with the artifacts of the declaration of that type. Such an attribute has its meaning without any reference to any particular element type being defined. So some attributes are highly coupled to their type, some are highly uncoupled.

But perhaps you might agree with me that the way to prevent this problem (i.e., where an attribute in one element is named partially using another element name) is for the original designers of the schema to use explicit namespace qualifiers for all attributes which are not strongly coupled to the element type. These are the particular attributes which perhaps are most likely to be detached.

> Actually, if you look at the details of the XLL spec (Part 7, second
> paragraph), attribute remapping is limited to the XLL specific attributes.
...

That is why I prefaced my comment with "And in any case, in your particular case of hrefs, the XLink draft provides an attribute remapping feature."

Rick Jelliffe
From James.Anderson at mecom.mixx.de Wed Apr 1 15:54:57 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:21 2004
Subject: Comments on Section 2.6 of XML-Namespaces
References: <005901bd5d6c$e8f651f0$7a0b4ccb@NT.JELLIFFE.COM.AU>
Message-ID: <35224766.11CDA922@mecom.mixx.de>

hello again;

Rick Jelliffe wrote:

> > However, as soon as you allow elements to be broken up into their
> > individual attributes, this guarantee goes away. Attribute "hijacking"
> > makes it impossible to maintain the relationship between attributes of a
> > single element, and impossible to maintain the relationship between the
> > attributes and the child elements/content.
>
> I don't think I agree with your ideas of "hijacking". An attribute is
> whatever
> the designer has said it is, for better or worse. E.g., if a document type
> designer says that all elements types will have an attribute which gives
> the line number of the element type declaration in the original document,
> then that attribute has nothing to do with the element type itself, and

did you mean "element type name" here? if not, what is the "type" other than an artifact of the declaration ?

> everything to do with the artifacts of the declaration of that type. Such
> an attribute has its meaning without any reference to any particular
> element type being defined. So some attributes are highly coupled to

practically speaking, an attlist without a matching element declaration constitutes an implicit definition as soon as it permits an element instance to behave as if the attribute can be bound to it.

> their type, some are highly uncoupled.
although a form like '' is not precluded, i doubt if the effect is well defined... and '' was excluded. in which case the degree of coupling between attribute and element, though it may be multiplied by additional attlist declarations, remains unchanged in the individual instance.

From James.Anderson at mecom.mixx.de Wed Apr 1 16:08:59 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:21 2004
Subject: Namespaces in XML: 3.1 the example [2]
References: <352182A3.3CBE5701@mecom.mixx.de> <199804011212.HAA00271@unready.microstar.com>
Message-ID: <35224A74.7357C0C@mecom.mixx.de>

David Megginson wrote:

> james anderson writes:
> > ...
> >
> > 2. if the schema is present, should the processor permit local
> > additions to the namespace, that is the introduction of names which
> > are not present in the external definition? should the processor
> > permit redefinition of existing names from the namespace?
>
> This would go against the basic principle of namespaces (globalisation
> and uniquification of names), since two documents could create
> different extensions to the same namespace. ...
>

which is ok, if the issue is architectural forms, but bad if one is talking about namespaces... ?

> I'm not certain that I understand the issue here -- why would someone
> not bring additional element types in from a different namespace,
> instead of adding private extensions to an existing one?
>

to "capture" an entity definition.

> > (or rather, it's almost possible: there's a small problem, that the
> > wd-standard precludes qualified entity names. why?)
>
> The namespace spec allows element type names, attribute names, and PI
> targets to be associated with a URI. (External) entity and notation
> names are already associated with a URI.

but not identifiable.

From rtennant at library.berkeley.edu Wed Apr 1 17:10:05 1998
From: rtennant at library.berkeley.edu (Roy Tennant)
Date: Mon Jun 7 17:00:21 2004
Subject: Namespaces in XML: 3.1 the example [2]
In-Reply-To: <5BF896CAFE8DD111812400805F1991F701C90F01@red-msg-08.dns.microsoft.com>
Message-ID:

Well then I guess I don't understand why this wouldn't be perfectly acceptable (is it?):

If all "ns=" identifies is an authority that is not to be parsed, then why the URI? If it is to make sure that the authority is identified in a standard way, then http://www.loc.gov/ is more appropriate, *without* "schema/". Sorry to nitpick, but heck, what's the standardization process but a long series of group nitpicking sessions?

Roy Tennant

On Tue, 31 Mar 1998, Andrew Layman wrote:

> There is not necessarily anything at the "other end" of
> "http://books.org/schema/". This is simply a unique name, a URI, which
> serves to distinguish any names associated with it from all other names on
> the web. For example, to distinguish, in effect "title, as defined by
> books.org" from "title, as defined by landuse.org."
>
> If there is a document defining the names, that would be referenced by the
> 'src' attribute, as in prefix="B" src="http://books.org/schema/books/"?>. This document might be
> human-readable text, it might be a DTD, it might be an XML-Data schema, or
> it might be something else besides.
> A cheese sandwich, though? Probably not.
>
> > -----Original Message-----
> > From: Roy Tennant [SMTP:rtennant@library.berkeley.edu]
> > Sent: Tuesday, March 31, 1998 4:13 PM
> > To: xml-dev@ic.ac.uk
> > Subject: Re: Namespaces in XML: 3.1 the example [2]
> >
> > Please forgive my ignorance here, but I've been trying to figure out what
> > *exactly* is on the other end of this (for example):
> >
> >
> >
> > That is, what is to be found at http://books.org/schema/? A data schema
> > marked up in XML-Data? A Web document? A cup of coffee and a cheese
> > sandwich? Please, inquiring minds want to know...thanks,
> > Roy Tennant

From tbray at textuality.com Wed Apr 1 17:32:52 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:21 2004
Subject: XML Spec error?
Message-ID: <3.0.32.19980401072956.009fb930@pop.intergate.bc.ca>

At 11:15 PM 3/31/98 -0800, Andrew n marshall wrote:
>
>In section 3.3:
>
>"For interoperability, writers of DTDs may choose to provide at most one
>attribute-list declaration for a given element type, at most one attribute
>definition for a given attribute name, and at least one attribute
>definition in each attribute-list declaration."
>
...
>As it is currently, it seems to imply

Read it more carefully. It says that while the following are legal in XML, they will be rejected by pre-TC SGML systems, so if you care about such systems, you shouldn't use them:

Need another annotation here... -T.

From ak117 at freenet.carleton.ca Wed Apr 1 17:59:14 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:21 2004
Subject: Architectural Forms and Namespaces
In-Reply-To: <35224A74.7357C0C@mecom.mixx.de>
References: <352182A3.3CBE5701@mecom.mixx.de> <199804011212.HAA00271@unready.microstar.com> <35224A74.7357C0C@mecom.mixx.de>
Message-ID: <199804011558.KAA00983@unready.microstar.com>

james anderson writes:

> > > 2. if the schema is present, should the processor permit local
> > > additions to the namespace, that is the introduction of names which
> > > are not present in the external definition? should the processor
> > > permit redefinition of existing names from the namespace?
> >
> > This would go against the basic principle of namespaces (globalisation
> > and uniquification of names), since two documents could create
> > different extensions to the same namespace. ...
>
> which is ok, if the issue is architectural forms, but bad if one is talking
> about namespaces...

I'm not certain that I understand your point -- a document cannot invent new architectural forms for an existing base architecture.

It is true that since architectural forms allow multi-generational inheritance, you can invent a new base architecture that is derived from an existing base architecture (just as you can derive a new class from an existing one in Java).

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From msuzio at ford.com Wed Apr 1 19:07:32 1998
From: msuzio at ford.com (Michael J. Suzio)
Date: Mon Jun 7 17:00:21 2004
Subject: Y2K and XML-DEV
References: <3.0.1.16.19980401092217.390fe3a0@pop3.demon.co.uk>
Message-ID: <199804011707.AA02322@mailfw2.ford.com>

Quite good, Peter. You almost had me until you talked about not using "98". ;-) (Oh no, I said Jeh^H^H^H98!)

--
Michael J. Suzio
Web Technical Standards, WWW & Internet Applications
(313) 24-88120
msuzio@eccms1.dearborn.ford.com / msuzio@ford.com (Public Email)

From eliot at isogen.com Wed Apr 1 19:25:35 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:21 2004
Subject: Proposal for src files
Message-ID: <3.0.32.19980401111112.006d0948@swbell.net>

At 09:22 AM 4/1/98, Peter Murray-Rust wrote:
>
>Is it possible to combine these two so that we express a DTD in a standard
>XML notation? Many of us do this already, but I suspect that our tagset and
>syntax vary. If we could agree on this - and I don't see this as
>technically difficult - we could help both communities.

It is not technically difficult--it is, however, practically impossible except in the most trivial way (a direct transliteration of DTD syntax) unless it is *explicitly* defined as a base architecture with very clear rules for specialization. And even then, developing that architecture will be difficult at best.

The reason it's practically impossible is because getting agreement among a community of interest as wide and varied as the XML community on a subject of such importance as how to represent the definitions of document types is one of the hardest types of things there is to do. There are simply too many different ways to do it, too many different ways to represent things, too many interested parties. The degree of expressibility of schemas is open ended, meaning that any design, to be useful, must be maximally extensible. Defining extensible languages is hard.

I personally think that trying to define a common markup approach to DTD representation is a waste of time: the answer is either obvious (Wayne Wohler did it over 5 years ago) or impossible to achieve consensus on. The first is not useful compared to the cost of defining and maintaining it, the second cannot be achieved by any sort of consensus-based approach. So there's no point in bothering.

The minimum abstractions needed to define element types are already defined by the SGML property set--if your schema language can get you to these abstractions, fine.
I say let groups define their own schema approaches without bothering to find too wide of a consensus. If one particular approach gains widespread acceptance, then fine. If it doesn't, we're no worse off than we were before, *but* we haven't wasted a huge amount of a scarce resource on a doomed effort. This provides opportunities for vendors to distinguish themselves by providing different types of validation and constraint support. As long as they always support normal DTD syntax, I see that as a good thing--if someone like Microsoft produces a product that helps me create better data repositories, then I'm happy to buy and use it, as long as it accepts and generates normal DTDs with the level of fidelity I require. But why should I give Microsoft (or anyone else) free engineering support by being involved in a schema development effort? It doesn't make sense to me. If they want my help, they can pay me. I already have what I want and need and I'm capable of providing for myself if I need more (as is anyone with a copy of Lark and a Java book). As long as we have DTD syntax I can ignore or use any schema efforts as I please, because my ability to use normal DTDs is assured. The cost of writing schema-to-DTD-syntax transforms will always be lower than the cost of participating in wide-scope schema definition efforts. But maybe I'm just a crank. Cheers, Eliot --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
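[Editorial sketch, not part of the original message.] The "schema-to-DTD-syntax transform" mentioned above can be quite small when the schema is a direct transliteration of DTD declarations. The following is a minimal Python sketch under stated assumptions: the XML vocabulary (`schema`, `element`, `attribute`, and the `name`, `content`, `type`, `default` attributes) and the function name `schema_to_dtd` are hypothetical, invented here for illustration, and do not come from any proposal discussed in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML transliteration of DTD declarations. The tag and
# attribute names are assumptions made for this illustration only.
SCHEMA_XML = """
<schema>
  <element name="book" content="(title, author+)">
    <attribute name="isbn" type="CDATA" default="#REQUIRED"/>
  </element>
  <element name="title" content="(#PCDATA)"/>
  <element name="author" content="(#PCDATA)"/>
</schema>
"""


def schema_to_dtd(schema_xml: str) -> str:
    """Emit ordinary DTD declarations from the hypothetical XML form."""
    root = ET.fromstring(schema_xml)
    lines = []
    for element in root.findall("element"):
        name = element.get("name")
        # The content model is copied through unchanged, so the mapping
        # is a plain transliteration with no added semantics.
        lines.append(f"<!ELEMENT {name} {element.get('content')}>")
        attrs = element.findall("attribute")
        if attrs:
            defs = " ".join(
                f"{a.get('name')} {a.get('type')} {a.get('default')}"
                for a in attrs
            )
            lines.append(f"<!ATTLIST {name} {defs}>")
    return "\n".join(lines)


print(schema_to_dtd(SCHEMA_XML))
```

Because nothing beyond what DTD syntax already expresses is carried in the XML form, the output stays usable by any tool that reads normal DTDs, which is the property the message above relies on.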
From srn at techno.com Wed Apr 1 19:36:59 1998 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:00:21 2004 Subject: why do namespaces have such a bad rep (was Re: Experimenting with Namespaces - DTDs? In-Reply-To: <3521BC55.6014F2E7@mecom.mixx.de> (message from james anderson on Wed, 01 Apr 1998 06:02:40 +0200) References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com> <199803312343.SAA01404@bruno.techno.com> <3521BC55.6014F2E7@mecom.mixx.de> Message-ID: <199804011625.LAA00802@bruno.techno.com> James Anderson writes: > why do namespaces have such a bad reputation? in particular, why are > they discredited for having been conflated with something which they > are not, and which they do not claim to be? I'm not making this up. These are the kinds of ideas that some very smart people brought home from the XML Conference in Seattle last week. I was called from the conference and told, with breathless excitement, that XML does inheritance via an XML facility called "namespaces". I don't know how it happened, and I'm not blaming anyone, but it's pretty clear to me that RDF was oversold there, as were namespaces. 
And it's easy to see how this misunderstanding would mushroom: when you use an element type name from another document type, and say so, and if you don't know the history of RDF, or the real significance of context in document architectures, or the stated limitations on the purpose of namespaces, wouldn't it follow (as a "least hypothesis", if nothing else) that you intend this element to be treated as if it were that other element type? (This is a mistake that I, for one, could easily make!) > why? what does colon syntax have to do with class inheritance? Expectations created by vague recollections of OO syntaxes that use colons to delimit class names. No more, no less. I'm not claiming that it's a logical or appropriate presumption. I'm just observing a fact. Real people -- people I respect -- are misunderstanding what's happening here, and in a big way. Moreover, it's a difficult misunderstanding to clear up, and clearing it up inevitably casts Fear, Uncertainty, and Doubt on the whole XML thing, which is something I really don't want to do. > the namespace 'thing' maps the names from the "inherited from (sic) > DTD" into a unique region of a two dimensional namespace. it says > nothing at the structure. Yes. I said that. > > RDF was looking for was a way to guarantee global uniqueness of > > element type names, and if we ever try to get anything more than that > > from namespaces, we are on very thin ice indeed. > agreed, but it doesn't claim to. That's right. However, the fact is, people need inheritance. The closest thing XML provides today is namespaces. The unsophisticated are confusing namespaces with inheritance. This is a Bad Thing for XML and W3C, because when their eyes are opened, these people will feel betrayed and their honeymoon with XML will suddenly be over. "You mean there's no XML way for me to say that I want to treat this element as if it were this other element type in this other document type? 
What kind of pseudo-object-oriented horseshit is this XML thing, anyway?" And when you consider that namespaces are a suboptimal approach to the problem RDF is designed to address -- the very problem that was presumably the excuse for railroading namespaces through the committee process without the due consideration that has been the hallmark of excellence in the rest of the XML process (including the RDF process, except for this namespace business) -- I'm afraid namespaces look very bad indeed. > the wd doesn't propose to collapse the architecture, it proposes to > map the names. Yes. I said that. > > Anyway, all is not lost. This namespace thing is a mistake that will > > necessarily be corrected, simply in order to support the needs of > > civilization in an XML-dominated world. The way toward a solution is > > already paved by an ISO standard (ISO/IEC 10744:1997 Annex A.3) that > > is being adjusted to accommodate the syntactic limitations of XML > > (i.e., its lack of #NOTATION attributes). It is implemented in the SP > > parser and in other software systems, and it is already being used in > > many industrial contexts. It's the right sort of answer, it's not > > going away, and its usage is accelerating rapidly; there was a > > manyfold increase in the number of papers reporting its use at > > SGML/XML 97. > it also has equivalent mechanisms to manage the same problem within a > one-dimensional namespace. > (i.e. the problem doesn't go away) Huh? What problem doesn't go away? > some may find the ability to rename an advantage, as it allows one > to alter the intended semantics. i wonder whether it as often leads > to confusion. where the issue is really name-uniqueness, namespaces > are a much more compact expression. there's no reason they couldn't > be integrated into sgml architectures - but for the deconstructivist > aims, they'd accomplish the same thing as the renaming attribute... 
Do you think that providing a mechanism for renaming things is all that inheritable architectures have to offer? If so, you'd better study this topic some more. That's just a housekeeping feature. > why wouldn't people just take them for what they are - orthogonal to > the issue of structure, and use them for what they can do? Because: * people need inheritance more urgently than they need namespaces; * if they had inheritance they wouldn't need namespaces (even your "greater compactness" argument will not stand up to scrutiny because the ISO inherited architectures syntax can make much more compact documents than the proposed namespace syntax ever could); * nothing else in XML, today, even looks like inheritance; and * there is nothing in XML, today, that does inheritance. However, if the W3C would simply acknowledge that an XML-ready mechanism and syntax for true architectural inheritance is already available in an existing ISO standard, this whole problem would vanish, and the XML honeymoon would not end with disappointment. Why don't they just do that? Or, alternatively, just use the ideas, change all the names, and maybe add some web-application-specific stuff, as was done with HyTime hyperlinking in X-Link? -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecom.mixx.de Wed Apr 1 19:55:36 1998 From: James.Anderson at mecom.mixx.de (james anderson) Date: Mon Jun 7 17:00:21 2004 Subject: Comments on Section 2.2 of XML-Namespaces References: <005901bd5d6c$e8f651f0$7a0b4ccb@NT.JELLIFFE.COM.AU> <35224766.11CDA922@mecom.mixx.de> Message-ID: <35227FCA.E15EDDA3@mecom.mixx.de> hello again, is the constraint on uniqueness in 2.2 intended to apply within a namespacepi or across the scope of the pi? on one hand, the former makes little sense, but, on the other hand, the latter is too strict. where you specify (2.2...) "Namespace Constraint: Unique Prefix A namespace prefix may not be declared more than once; i.e. there may not be two PrefixDefs which contain the same NCName string." you make it impossible to declare a namespace in an external subset which would be intended to be used from documents which do not always require that the external subset be read. in such cases (perhaps likely - for documentary reasons - in all cases), one would wish to place the ns-pi both in the external subset and in the instance document. the constraint needs to be relaxed to permit multiple occurrences of a NCName where the namespace names (and if present the schema name) agree. bye for now, xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Wed Apr 1 20:11:20 1998 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:00:21 2004 Subject: Y2K and XML-DEV Message-ID: <01bd5db2$ea6b3be0$61addccf@uspppBckman> You did get me!! I was about to shoot off a note about @^#&*(#@ bureaucrats, but luckily restrained myself!! Frank -----Original Message----- From: Michael J. Suzio To: Peter Murray-Rust Cc: xml-dev@ic.ac.uk Date: Wednesday, April 01, 1998 9:10 AM Subject: Re: Y2K and XML-DEV >Quite good, Peter. You almost had me until you talked about not >using "98". ;-) > >(Oh no, I said Jeh^H^H^H98!) > >-- >Michael J. Suzio >Web Technical Standards, WWW & Internet Applications >(313) 24-88120 >msuzio@eccms1.dearborn.ford.com / msuzio@ford.com (Public Email) > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Wed Apr 1 20:23:48 1998 From: eliot at isogen.com (W. 
Eliot Kimber) Date: Mon Jun 7 17:00:21 2004 Subject: Proposal for src files Message-ID: <3.0.32.19980401112817.006d7034@swbell.net> At 09:22 AM 4/1/98, Peter Murray-Rust wrote: > >Is it possible to combine these two so that we express a DTD in a standard >XML notation? Technical note: at the recent NCITS (nee ANSI) meeting, the U.S. delegation to WG4 developed a proposed amendment to the SGML standard that codifies the idea of DOCTYPE declaration components that are not in DTD syntax. The idea is very simple: allow "parameter data entities", e.g.: %my-declarations; The only requirement is that the document allow omitted DTD declarations. If the parser understands the notation of the parameter data entity, it can use it to do SGML validation to the degree it can construct the prolog portion of the document grove from the entity. If it doesn't understand the notation then the normal implied declaration rules apply. Any validation services over and above SGML validation must be clearly labeled as such (i.e., consistent with SGML's current distinction between validation errors and application warnings). Because the external DTD subset is really a parameter entity, you can make the whole external subset a parameter data entity: ]> ... Note that the internal subset is unchanged: you must use normal DTD syntax within the DTD subset. If approved by ISO, this amendment would make SGML's official policy on schemas "use any syntax you want, we don't care". In particular, we have no plans or desire to define alternative syntaxes for DTD representation within ISO 8879. This does not, of course, preclude separate schema standardization efforts within NCITS or ISO (or anywhere else). So far this proposal seems to be non-controversial. The --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
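[Editorial sketch, not part of the original message.] The fallback rule described above (use the parameter data entity if its notation is understood, otherwise apply the normal implied declaration rules) can be modelled as a simple dispatch. This is a rough Python sketch of that control flow only, under loud assumptions: the registry of notation readers, the `name-list` notation, and both function names are hypothetical, and nothing here reproduces the text of the proposed amendment or any real SGML parser API.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical: a reader turns the text of a parameter data entity into
# the list of element type names it declares.
Reader = Callable[[str], List[str]]


def declarations_from_entity(
    notation: str, text: str, readers: Dict[str, Reader]
) -> Optional[List[str]]:
    """Return declared element names, or None if the notation is unknown."""
    reader = readers.get(notation)
    if reader is None:
        # Notation not understood: the caller falls back to implied
        # declarations, so no declaration-based validation is possible.
        return None
    return reader(text)


def undeclared(element_names: List[str], declared: Optional[List[str]]) -> List[str]:
    """List element names that the understood declarations do not cover."""
    if declared is None:
        return []  # implied declarations: nothing to check against
    return [name for name in element_names if name not in declared]


# Hypothetical notation: a whitespace-separated list of element names.
readers = {"name-list": lambda text: text.split()}

known = declarations_from_entity("name-list", "book title", readers)
print(undeclared(["book", "chapter"], known))
print(undeclared(["book", "chapter"], declarations_from_entity("other", "", readers)))
```

The point of the design, as the message describes it, is that an unknown notation degrades to unvalidated parsing and does not cause a failure.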
From ak117 at freenet.carleton.ca Wed Apr 1 21:04:36 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:21 2004 Subject: XML Schemas (was: Re: Proposal for src files) In-Reply-To: <3.0.32.19980401111112.006d0948@swbell.net> References: <3.0.32.19980401111112.006d0948@swbell.net> Message-ID: <199804011903.OAA02431@unready.microstar.com> W. Eliot Kimber writes: > As long as we have DTD syntax I can ignore or use any schema > efforts as I please, because my ability to use normal DTDs is > assured. The cost of writing schema-to-DTD-syntax transforms will > always be lower than the cost of participating in wide-scope schema > definition efforts. > > But maybe I'm just a crank. Of course Eliot's a crank, but I agree with him entirely. In fact, I'd like to mention an even greater one -- the cost of fiddling with the XML spec when there is now such an enormous amount of work going on based on the XML 1.0 recommendation. Like Eliot, I believe that DTDs should remain the basic interchange format for XML document schemas. If someone wants to give me a schema in XML-Data, or in MMLML (Megginson's Meta-Language Markup Language), or whatever, I have no objection, as long as I get the standard DTD as well. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Apr 1 21:22:36 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:21 2004 Subject: Proposal for src files Message-ID: <3.0.32.19980401112049.00a8a740@pop.intergate.bc.ca> At 11:22 AM 4/1/98 -0600, W. Eliot Kimber wrote: >The reason it's practically impossible is because getting agreement among a >community of interest as wide and varried as the XML community on a subject >of such importance as how to represent the definitions of document types is >one of the hardest types of things there is to do. There are simply too >many different ways to do it, too many different ways to represent things, >too many interested parties. The degree of expressibility of schemas is >open ended, meaning that any design, to be useful, must be maximally >extensible. Defining extensible languages is hard. For the record, I disagree with the first half of Eliot's thesis here. I think that there is an *excellent* chance of getting consortium and leading vendors to coalesce in support of a schema proposal which attains the notorious MPRDV (Minimum Progress Required to Declare Victory) level but does not rule out downstream extensibility. MPRDV components IMHO are 1. does what DTDs do in as intuitive as possible a way 2. uses XML syntax 3. is compatible with the RDF data model 4. does basic lexical data typing of character data and attribute values The other big thing you need, of course, is data types which are inheritable and abstracted, to facilitate schema engineering. It can be (and has been) argued that we have that already with architectural forms. 
I think there is a practical determinant as to whether we are doing OK: if we can build an MPRDV schema facility which rules out neither AFs nor RDF schema types (two very different approaches to related but not identical problems) then that's a step forward. On the other hand, when Eliot points out that doing this will be technically difficult, he is correct. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jeremy at omsys.com Wed Apr 1 22:23:05 1998 From: jeremy at omsys.com (Jeremy H. Griffith) Date: Mon Jun 7 17:00:21 2004 Subject: Y2K and XML-DEV In-Reply-To: <3.0.1.16.19980401092217.390fe3a0@pop3.demon.co.uk> References: <3.0.1.16.19980401092217.390fe3a0@pop3.demon.co.uk> Message-ID: <353da154.245938811@mail.together.net> On Wed, 01 Apr 1998 09:22:17, Peter Murray-Rust wrote: >Since the robots are unlikely to have sophisticated heuristics (they >probably use a regex of the form [^9]98) it would be helpful if you could >also avoid using the digits '98' in your text. [This is unlikely, but could >occur in examples.] At last! An official *reason* to ban Windows 98, Office 98, and other such MS abominations! --Jeremy H. Griffith http://www.omsys.com/jeremy/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecom.mixx.de Thu Apr 2 05:45:21 1998 From: James.Anderson at mecom.mixx.de (james anderson) Date: Mon Jun 7 17:00:21 2004 Subject: why do namespaces have such a bad rep [2] References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com> <199803312343.SAA01404@bruno.techno.com> <3521BC55.6014F2E7@mecom.mixx.de> <199804011625.LAA00802@bruno.techno.com> Message-ID: <35230A09.58739027@mecom.mixx.de> hello again; first, an aside for those who may wonder why i pursue this. my concern is that i do not wish to have to implement support for enabling architectures just in order to contend with name collisions. on principle it would be wrong. practically it would be a waste of time. perhaps i have had the rare fortune of having been spared some experience which would have led me to believe that a particular syntactic notation for a name (the ':' between two parts) necessarily imputes any o-o behaviour to an entity thereby denoted. still, i would ask that, in this forum, where the distinction has been made clear, we should strive for a state where it is well understood. where possible sources of confusion are explicated and resolved, and where they are not taken as a justification to discredit the notation for not accomplishing something to which it makes no claim... i propose, as a general principle, that the two words "inheritance" and "namespace" not be used in the same sentence, unless the sense is that of a conjunction between two unrelated things. > James Anderson writes: > > > why? what does colon syntax have to do with class inheritance? 
> > Expectations created by vague recollections of OO syntaxes that use > colons to delimit class names. No more, no less. I'm not claiming > that it's a logical or appropriate presumption. I'm just observing a > fact. Real people -- people I respect -- are misunderstanding what's > happening here, and in a big way. Moreover, it's a difficult > misunderstanding to clear up, and clearing it up inevitably casts > Fear, Uncertainty, and Doubt on the whole XML thing, which is > something I really don't want to do. > > > the namespace 'thing' maps the names from the "inherited from (sic) > > DTD" into a unique region of a two dimensional namespace. it says > > nothing at the structure. > > Yes. I said that. it's the "inherited" part that's the problem. the dtd is not thereby "inherited". > > > > > RDF was looking for was a way to guarantee global uniqueness of > > > element type names, and if we ever try to get anything more than that > > > from namespaces, we are on very thin ice indeed. > > > agreed, but it doesn't claim to. > > That's right. However, the fact is, people need inheritance. The > closest thing XML provides today is namespaces. the issues are orthogonal. that is, it can't be close: it's off in another dimension. if the discussion is about inheritance per se, one need not introduce namespaces. take the reference literature on CLOS. steele manages dozens of pages respectively on the package system and the object system - without ever needing to mention one in the context of the other. keene's "programmer's guide to clos" mentions packages only to clarify that they have nothing specifically to do with objects. when the topic is inheritance, don't even bring namespaces up... there is, in general, much to be said for implementing unrelated things with separable mechanisms. > The unsophisticated > are confusing namespaces with inheritance. 
This is a Bad Thing for > XML and W3C, because when their eyes are opened, these people will the sooner the better > feel betrayed and their honeymoon with XML will suddenly be over. > "You mean there's no XML way for me to say that I want to treat this > element as if it were this other element type in this other document > type? What kind of pseudo-object-oriented horseshit is this XML > thing, anyway?" And when you consider that namespaces are a > suboptimal approach to the problem RDF is designed to address -- the i don't know the "RDF history". i'm just reading the proposal and considering the implications for implementation and application. i use namespaces. i have implemented namespaces. i could have implemented attribute mapping. it would not have resolved all of the things i would expect to need wrt element inheritance. further, while attribute inheritance is supported, and the subtyping wrt validation of references is resolved, the mechanisms for structural inheritance avoid most of the difficult issues by specifying that the base architectures have nothing to do with each other. in the situations i'm trying to model that would lead either to repeating content definitions (which is counter-inheritance) or introducing artifactual structure. it would have been much more complex and, as i understand (ISO/IEC 10744:1997 Annex A.3), it would not have accomplished everything i would have implemented it for. (WD-xml-names-19980327 doesn't either, but that's an unrelated issue - and at least i can rightly argue that, wrt namespaces, it should...) > ... > > it also has equivalent mechanisms to manage the same problem within a > > one-dimensional namespace. > > (i.e. the problem doesn't go away) > > Huh? What problem doesn't go away? name collisions. (please see below) > > some may find the ability to rename an advantage, as it allows one > > to alter the intended semantics. i wonder whether it as often leads > > to confusion. 
where the issue is really name-uniqueness, namespaces > > are a much more compact expression. there's no reason they couldn't > > be integrated into sgml architectures - but for the deconstructivist > > aims, they'd accomplish the same thing as the renaming attribute... > > Do you think that providing a mechanism for renaming things is all > that inheritable architectures have to offer? neither do i believe that, nor did i say so. the discussion was directed solely at the facility for renaming and made no reference to the remaining scope of ISO/IEC 10744:1997. > If so, you'd better > study this topic some more. you're right there. one thing i still need to understand is how i would, for example, define things like the ArcCFC entities for multiple inherited architectures (ISO/IEC 10744 p230), or handle possible name ambiguity between multiple architectures wrt entities used to specify section inclusion (ISO/IEC 10744 p223) (yes, i lack experience there. i would presume one helps oneself to dotted names, but that's just conjecture) > That's just a housekeeping feature. > > > why wouldn't people just take them for what they are - orthogonal to > > the issue of structure, and use them for what they can do? > > Because: > > * people need inheritance more urgently than they need namespaces; that's an orthogonal issue. (see above) > * if they had inheritance they wouldn't need namespaces they are orthogonal issues. (see above) that (ISO/IEC 10744:1997 Annex A.3) contains specialized provisions to address a subset of naming issues should not lead one to claim that another aspect of the standard - inheritance, in itself, provides a solution for name ambiguity. > (even your > "greater compactness" argument will not stand up to scrutiny because > the ISO inherited architectures syntax can make much more compact > documents than the proposed namespace syntax ever could); > > * nothing else in XML, today, even looks like inheritance; and that (Prefix ':')? 
LocalPart "even looks like inheritance" is a misconception. > * there is nothing in XML, today, that does inheritance. that's an orthogonal issue. > However, if the W3C would simply acknowledge that an XML-ready > mechanism and syntax for true architectural inheritance is already > available in an existing ISO standard, this whole problem would > vanish, the problem does not "vanish", (some of) it (please see above) has just been worked out another way. ... From Per-Ake.Ling at uab.ericsson.se Thu Apr 2 09:04:49 1998 From: Per-Ake.Ling at uab.ericsson.se (Per-Ake Ling) Date: Mon Jun 7 17:00:21 2004 Subject: Y2K and XML-DEV Message-ID: <199804020703.JAA28166@uabs19c27.eua.ericsson.se> > From: Peter Murray-Rust > Subject: Y2K and XML-DEV > > Residents of the UK will know that the PrimeMinister last week announced > the recruitment of 20,000 people to solve the 'millennium bug' crisis in > the UK. As I am sure all of you know this bug involves software which ...[snip] In Sweden another solution was presented on the evening news yesterday. There is a proposition from the government to change our calendar from Gregorian to Julian (the one used in Russia in 1917), which would buy two weeks of time. An executive of the Swedish equivalent of the IRS was very positive, and a couple of party-leaders were interviewed, stating that it seems like a good idea and they support it, one of them also stating that the possibility of using alternative calendars that give even more leeway should be explored. 
Per-Åke -- Per-Åke Ling (note: Per-Åke, transliteration Per-Ake) email: Per-Ake.Ling@uab.ericsson.se phone: +46 8 727 5674 Ericsson Utvecklings AB mobile: +46 70 790 2446 AXE Research and Development fax: +46 8 727 3463 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mcgredo at stl.nps.navy.mil Thu Apr 2 09:19:08 1998 From: mcgredo at stl.nps.navy.mil (Don McGregor) Date: Mon Jun 7 17:00:22 2004 Subject: Y2K and XML-DEV In-Reply-To: <199804020703.JAA28166@uabs19c27.eua.ericsson.se> from "Per-Ake Ling" at Apr 2, 98 09:03:09 am Message-ID: <199804020719.XAA09965@pinafore.stl.nps.navy.mil> > In Sweden another solution was presented on the evening news yesterday. > There is a proposition from the goverment to change our calendar from > Gregorion to Julian (the one used in Russia in 1917), which would buy > two weeks of time. As suggested on USENET today, we could simply go to using hex numbering. Which would give us 199A, 199B, 199C, 199D, 199E, 199F before it rolls over to 2000. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From antony at n-space.com.au Thu Apr 2 09:33:44 1998 From: antony at n-space.com.au (Antony Blakey) Date: Mon Jun 7 17:00:22 2004 Subject: Y2K and XML-DEV References: <199804020719.XAA09965@pinafore.stl.nps.navy.mil> Message-ID: <352F19E8.DC02849F@n-space.com.au> Don McGregor wrote: > As suggested on USENET today, we could simply go to using hex > numbering. Which would give us 199A, 199B, 199C, 199D, 199E, > 199F before it rolls over to 2000. Of course you meant 199E, 199F, 19A0, 19A1... +----------------------------------+ | Antony Blakey | | N-Space Pty Ltd | | Java - CORBA - SGML - XML | | mailto:antony@n-space.com.au | | http://www.n-space.com.au | +----------------------------------+ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Thomas_Redelberger at exchange.de Thu Apr 2 09:54:02 1998 From: Thomas_Redelberger at exchange.de (Thomas Redelberger) Date: Mon Jun 7 17:00:22 2004 Subject: Y2K and XML-DEV Message-ID: <412565DA.00307E85.00@NTBP4K2.deutsche-boerse.de> > 199F before it rolls over to 2000. Ha, 199F gets to 19A0. Even more time to 2000 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From peter at ursus.demon.co.uk Thu Apr 2 10:51:15 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:22 2004
Subject: Proposal for src files
In-Reply-To: <3.0.32.19980401111112.006d0948@swbell.net>
Message-ID: <3.0.1.16.19980402085119.44b7a97e@pop3.demon.co.uk>

At 11:22 01/04/98 -0600, W. Eliot Kimber wrote:

>But maybe I'm just a crank.

No. But you have a level of vision and understanding that makes it difficult for others like me to follow. This is a recurring theme in the whole of current IT/CS - there are 'right' solutions that people simply are not able to comprehend or find too difficult to adopt. In those cases one ends up with a small number of people who provide a solution (often at high cost) to a large number of people who don't understand and don't own it. IMO the single most important thing about XML is that it makes things accessible to at least a hundred times more people than other technologies. We are seeing this debate frequently now - 'what does XML do that XYZ doesn't?' My answer is that it can relate to ordinary hackers - and possibly even to inspired management.

>At 09:22 AM 4/1/98, Peter Murray-Rust wrote:
>>
>>Is it possible to combine these two so that we express a DTD in a standard
>>XML notation? Many of us do this already, but I suspect that our tagset and
>>syntax vary. If we could agree on this - and I don't see this as
>>technically difficult - we could help both communities.
>
>It is not technically difficult--it is, however, practically impossible
>except in the most trivial way (a direct transliteration of DTD syntax)

I was thinking of something very trivial (you didn't expect anything else from me, surely :-) - a lossless translation of DTD to XML format (and vice versa) without any inheritance, mapping, etc. I thought this was uncontroversial - but maybe I haven't got my point over.

>unless it is *explicitly* defined as a base architecture with very clear
>rules for specialization. And even then, developing that architecture will
>be difficult at best.
>
>The reason it's practically impossible is because getting agreement among a
>community of interest as wide and varied as the XML community on a subject
>of such importance as how to represent the definitions of document types is
>one of the hardest types of things there is to do. There are simply too

I wasn't trying to tackle this. We already *have* a definition of document types - it's XML 1.0. I was simply suggesting we standardised an XML-based syntactic representation of this.

>many different ways to do it, too many different ways to represent things,
>too many interested parties. The degree of expressibility of schemas is
>open ended, meaning that any design, to be useful, must be maximally
>extensible. Defining extensible languages is hard.
>
>I personally think that trying to define a common markup approach to DTD
>representation is a waste of time: the answer is either obvious (Wayne
>Wohler did it over 5 years ago) or impossible to achieve consensus on. The

If it's obvious and already done, perhaps it should be re-used?

>first is not useful compared to the cost of defining and maintaining it,
>the second cannot be achieved by any sort of consensus-based approach. So
>there's no point in bothering.
>
>The minimum abstractions needed to define element types are already defined
>by the SGML property set--if your schema language can get you to these
>abstractions, fine.

Perhaps all we need is a representation of the property set in XML format. Would *that* be controversial?

>
>I say let groups define their own schema approaches without bothering to
>find too wide of a consensus. If one particular approach gains widespread
>acceptance, then fine. If it doesn't, we're no worse off than we were
>before, *but* we haven't wasted a huge amount of a scarce resource on a
>doomed effort. This provides opportunities for vendors to distinguish

Although I value your judgement, I don't see why this is 'doomed'. The same could have been said about SAX. I'm proposing that we take simple steps to see if there is a commonality here that we can use.

I am not enthralled by the suggestion that we let anyone do whatever they like. We can then guarantee that whenever we encounter a foreign schema it will be another significant task to understand and code it. This would - as I suggested - simply move the 'battleground' from tags to schemas. We have avoided having parser wars with SAX and DOM - couldn't we at least look at schemas?

I agree resource is scarce. I think SAX showed an excellent way of conserving such resource. If we followed the same process we might find out at an early stage whether we were doomed or not :-)

>themselves by providing different types of validation and constraint
>support. As long as they always support normal DTD syntax, I see that as a
>good thing--if someone like Microsoft produces a product that helps me
>create better data repositories, then I'm happy to buy and use it, as long
>it accepts and generates normal DTDs with the level of fidelity I require.
>But why should I give Microsoft (or anyone else) free engineering support
>by being involved in a schema development effort? It doesn't make sense to

We gave them (and lots of others) free engineering support for SAX. I put a lot of effort into that and I'm not complaining :-) Actually even just for me, the effort was less than writing APIs for every parser. Multiply that by 10000...

>me. If they want my help, they can pay me. I already have what I want and
>need and I'm capable of providing for myself if I need more (as is anyone
>with a copy of Lark and a Java book).

I'm afraid I (and I think many others) aren't :-) and that is why I raised the problem. Being a part-time academic who does XML in their spare time I don't have the resources of a commercial company - and I suspect there are many others in a 'similar' position. They will come across 'schemas', 'src' files, etc. and need to know what to do with them. I hope we have a more constructive message than simply "wait and see what the large commercial organisations do, and then buy their products" :-).

The WWW grew in large part through the efforts of large numbers of people who picked up a common philosophy and developed it. If the message now is that "unless you are a member of W3C you shouldn't be involved in XML development" it represents the passing of an era, and I'll need to rethink.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ht at cogsci.ed.ac.uk Thu Apr 2 12:01:22 1998
From: ht at cogsci.ed.ac.uk (Henry Thompson)
Date: Mon Jun 7 17:00:22 2004
Subject: Re-announcement of XED: An XML document instance editor
Message-ID: <1064.199804021001@naomi.cogsci.ed.ac.uk>

As shown at XML Developers' Day, and plugged by Tim Bray (thanks Tim, your cheque is in the post :-), a reminder that the latest alpha release (we're up to 0.2.1.4) of XED is available for evaluation:

http://www.ltg.ed.ac.uk/~ht/xed.html

From the CHANGES file:

Changes from v0.2.0 as released:

Bug fixes:
  undo of attr operations fixed
  ]]a> followed by deleting the 'a' now impossible
  Resizing works properly
  window width and height fit better to available real-estate
  rule out control characters in CDATA
  allow : in names
  move menus out of the way more
  move help menu down fully onto screen
  forward/backward word at E/BOL works properly

Features:
  Modest support for progressive (depth-sensitive) indentation
  Catch most forms of window death and give you a save changes dialog
  http://... will work on command line
  Right button opens tag or attr for editing (thanks to Mike Kay for suggesting this)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From mtbryan at sgml.u-net.com Thu Apr 2 15:45:23 1998
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun 7 17:00:22 2004
Subject: Y2K and XML-DEV
Message-ID: <01bd5e0c$26695e20$LocalHost@sgml.u-net.com>

Per-Ake wrote:

>In Sweden another solution was presented on the evening news yesterday.
>There is a proposition from the government to change our calendar from
>Gregorian to Julian (the one used in Russia in 1917), which would buy
>two weeks of time. An executive of the Swedish equivalent of the IRS was
>very positive, and a couple of party-leaders were interviewed, stating
>that it seems like a good idea and they support it, one of them also
>stating that the possibility of using alternative calendars that give
>even more leeway should be explored.

That proposal definitely gets my vote. But what about going back to the older Hebraic calendar, or better still adopting the world's most widely used calendar (until recently), the Chinese one? Then we have 500 years or so to plan! (That might, just, be enough to allay the fears of the politicians.)

Incidentally my kid was so scared by the news broadcasts about the millennium bug that he stopped using his computer in case it would blow up in his face. To cure this I simply set the clock to 23:55 on Dec 31st 1999 and stood in front of the machine until 00:01 2000 was shown on the clock. I then ran all his programs without problems, including Word, which correctly dated his first document for the new millennium. And all this on a 5 year old machine running Windows 3.1. Sorry I don't believe all the hype (but do understand where the problems will occur - how many of you will need to buy a new digital watch next year?)

Martin Bryan

From wperry at fiduciary.com Thu Apr 2 16:01:56 1998
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun 7 17:00:22 2004
Subject: True XML compliance (Re: Sean McGrath's posts to XML-L)
Message-ID: <352399A2.2776DBE0@fiduciary.com>

[Sean Mc Grath] Thu, 2 Apr 1998 12:06:24 +0100 General discussion of Extensible Markup Language

>>Binary formats and APIs *constrain* the uses to which data can be put. They make information
>>"application owned" rather than "creator owned".
>>Binary formats are hunky dory for internal application use and I am all for it as long as the
>>application provides an XML data representation that I can get at. Once I have this I *know* that
>>I am not constrained by any vendor in terms of what I can do with my data.
>>With XML I know that I can rapidly develop applications without dusting off yet another badly
>>documented, buggy API and spending weeks just getting my head around the data structures.
>>XML raises the base level of data interchange above plain text by allowing markup intermingled
>>with the text to describe complex structures.

ALLELUIA!

This is exactly what XML compliance should mean. Application vendors claiming XML compliance should be obliged to provide import from/export to functions and to provide (or even better, to generate ad hoc) an accurate DTD describing the schema. Binary file developers would provide utilities to do the same thing.
Does anyone suppose that this is what Microsoft means by its 'commitment' to XML?

From eliot at isogen.com Thu Apr 2 17:39:59 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:22 2004
Subject: Proposal for src files
Message-ID: <3.0.32.19980402093550.006950f4@swbell.net>

At 08:51 AM 4/2/98, Peter Murray-Rust wrote:
>At 11:22 01/04/98 -0600, W. Eliot Kimber wrote:
>
>>But maybe I'm just a crank.
>
>No. But you have a level of vision and understanding that makes it
>difficult for others like me to follow. This is a recurring theme in the
>whole of current IT/CS - there are 'right' solutions that people simply
>are not able to comprehend or find too difficult to adopt. In those cases
>one ends up with a small number of people who provide a solution (often at
>high cost) to a large number of people who don't understand and don't own
>it. IMO the single most important thing about XML is that it makes things
>accessible to at least a hundred times more people than other technologies.
>We are seeing this debate frequently now - 'what does XML do that XYZ
>doesn't?' My answer is that it can relate to ordinary hackers - and
>possibly even to inspired management.

I certainly appreciate Peter's comments here--he's 100% correct. The distinction I try to make is this: a relatively small number of enterprises have serious problems that can only be solved by things like full SGML, the whole of the HyTime standard, the DSSSL transformation language, and other "big" solutions to hard problems. That's the environment I like to work in. But not everyone needs this level of solution. Thus XML and its related specifications.

At the same time, I know that most enterprises' challenges are more involved than they may think or may want to admit, which means that in many cases, the simpler solutions provided by XML will not be sufficient, which means that they'll need to start using some parts of the more complete solutions--not all of them, just those parts they need.

The challenge is to bridge the conceptual gap between the entry level of XML and the full solution provided by the big standards. I certainly agree that saying "just read ISO/IEC 10744" is not a productive way to do it. But the problems are real, solutions have been thought out and, to some degree, implemented already, so there is value there once you're ready to go get it. But until you're ready, it's just noise. I appreciate that. I see my job as helping people to bridge that gap.

I think the real problem is that complex information management challenges are complex--there are no simple solutions. Complexity conserves. The best you can do is concentrate the complexity so that it can be hidden from those who are not responsible for managing it. But the complexity has to go somewhere.

Look at automobiles. Twenty years ago, anyone who could read could fix their new car--all it took was a shop manual, a few tools, and some patience. Cars were simple, but they failed to address a number of problems, like fuel efficiency, safety, emissions. They needed to be tuned every few thousand miles. Today, you cannot do more than change your oil unless you are a factory-trained mechanic with lots of expensive tools. But cars are much more fuel efficient, safe, and clean. They can go 50 or 100 thousand miles without a tune up. The complexity has been concentrated where before it was distributed.

There's a place for both. I have a car I can fix and a car I can't. I use HTML for quick fun and full SGML for more involved fun. I wouldn't drive across the country in my MGB and I wouldn't entrust my business-critical data to HTML.

>>except in the most trivial way (a direct transliteration of DTD syntax)
>
>I was thinking of something very trivial (you didn't expect anything else
>from me, surely :-) - a lossless translation of DTD to XML format (and vice
>versa) without any inheritance, mapping, etc. I thought this was
>uncontroversial - but maybe I haven't got my point over.

>>unless it is *explicitly* defined as a base architecture with very clear
>>rules for specialization. And even then, developing that architecture will
>>be difficult at best.
>>
>>The reason it's practically impossible is because getting agreement among a
>>community of interest as wide and varied as the XML community on a subject
>>of such importance as how to represent the definitions of document types is
>>one of the hardest types of things there is to do. There are simply too
>
>I wasn't trying to tackle this. We already *have* a definition of document
>types - it's XML 1.0. I was simply suggesting we standardised an XML-based
>syntactic representation of this.

Ah. My point is that, if that's all it is, why bother? If you can parse DTDs sufficiently to generate XML versions, why bother generating the XML version? You've already parsed the thing; the data's already yours to manipulate. If you do want the XML version, then go back into the c.t.s archives and find Wayne Wohler's posting (I'm sure he posted something about it). In any case, it's a pretty obvious design effort and the result should be uncontroversial except for some arbitrary design choices (use attributes? etc.).

>>The minimum abstractions needed to define element types are already defined
>>by the SGML property set--if your schema language can get you to these
>>abstractions, fine.
>
>Perhaps all we need is a representation of the property set in XML format.
>Would *that* be controversial?

I've created it. Check out "http://www.hytime.org/materials/hi2pssgm.xml" (created from the SGML property set definition document published in ISO/IEC 10744 using SX). I believe there is work to define an XML-specific version of the property set that is organized and presented in a way that is more appropriate for the XML audience (I would not, by any stretch, pretend that the property set document is an accessible document as currently formulated--it's very useful as a formal specification, not too useful as a tutorial introduction to the SGML data abstraction.). However, a tree view of the document goes a long way toward making it easier to work with. I'm developing a tool that will, among other things, provide navigable tree views of property set documents. Watch this space.

I also appreciate Tim's disagreement with my position. It is a subject on which reasonable people can disagree [I'm reminded here of the excellent children's book *The Phantom Tollbooth*, where the main character does the impossible because no-one told him he couldn't do it.]. If the effort succeeds, that's good. But my experience suggests that it's going to be a lot harder than it might look at first. I also feel that it's too early in the day to start standardizing--we need more experience. There's a lot of good thinking on schemas in general that needs to be examined and synthesized.

Cheers,

E.
--
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
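
The "direct transliteration of DTD syntax" that both sides of this exchange describe as trivial can be sketched in a few lines. What follows is only an illustration under stated assumptions: the output vocabulary (dtd, element, attlist, attribute) is hypothetical and is neither Wayne Wohler's design nor any proposed standard, and the regular expressions cope only with simple ELEMENT and ATTLIST declarations. A genuinely lossless translation would also have to carry entities, notations, comments, and parameter-entity references.

```python
import re
import xml.etree.ElementTree as ET

# Simplified patterns: they assume no '>' inside a declaration and no
# parameter-entity references. A real DTD needs a proper parser.
ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\S+)\s+(.*?)\s*>", re.S)
ATTLIST_RE = re.compile(r"<!ATTLIST\s+(\S+)\s+(.*?)\s*>", re.S)
ATTDEF_RE = re.compile(
    r"""(\S+)\s+(\([^)]*\)|\S+)\s+"""
    r"""(#REQUIRED|#IMPLIED|(?:#FIXED\s+)?(?:"[^"]*"|'[^']*'))"""
)


def dtd_to_xml(dtd_text):
    """Transliterate ELEMENT/ATTLIST declarations into a hypothetical XML form."""
    root = ET.Element("dtd")
    for name, model in ELEMENT_RE.findall(dtd_text):
        # The content model is kept as an opaque string, whitespace-normalized.
        ET.SubElement(root, "element", name=name, content=" ".join(model.split()))
    for owner, body in ATTLIST_RE.findall(dtd_text):
        attlist = ET.SubElement(root, "attlist", element=owner)
        for att_name, att_type, default in ATTDEF_RE.findall(body):
            ET.SubElement(attlist, "attribute",
                          name=att_name, type=att_type, default=default)
    return root


# Made-up sample DTD, for illustration only.
SAMPLE_DTD = """
<!ELEMENT book (title, author+)>
<!ATTLIST book isbn CDATA #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
"""

print(ET.tostring(dtd_to_xml(SAMPLE_DTD), encoding="unicode"))
```

Keeping the content model as a single string is one of the "arbitrary design choices" mentioned above; breaking it into nested elements would make it easier to query but requires a real content-model parser.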
From rtennant at library.berkeley.edu Thu Apr 2 17:46:25 1998
From: rtennant at library.berkeley.edu (Roy Tennant)
Date: Mon Jun 7 17:00:22 2004
Subject: Proposal for src files
In-Reply-To: <3.0.1.16.19980402085119.44b7a97e@pop3.demon.co.uk>
Message-ID:

On Thu, 2 Apr 1998, Peter Murray-Rust wrote:

> it. IMO the single most important thing about XML is that it makes things
> accessible to at least a hundred times more people than other technologies.

I couldn't agree more. Why do you think the Web was so widely and swiftly adopted? Because you could do such neat things with it *and the price of admission was low*. That is, HTML is definitely not rocket science. Let's try to see if we can keep XML from being rocket science. For example, why can't we use XML Data for our DTDs? Why can't we use it to describe resources instead of the "labeled directed graphs" (say what?) of RDF?

Sometimes the best solution is NOT the most thorough or general one, but the simplest.

Roy Tennant

From eliot at isogen.com Thu Apr 2 18:06:07 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:22 2004
Subject: Proposal for src files
Message-ID: <3.0.32.19980402100252.00ebe6ec@swbell.net>

At 07:45 AM 4/2/98 -0800, Roy Tennant wrote:
>Sometimes the best solution is NOT the most thorough or general one, but
>the simplest.

The problem is that the definition of "best" changes with the use scenario.

Cheers,

Eliot
--
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
From lisarein at finetuning.com Thu Apr 2 18:36:33 1998
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun 7 17:00:22 2004
Subject: the death of the black box
References: <3.0.32.19980402100252.00ebe6ec@swbell.net>
Message-ID: <3523C374.8EE1FDE3@finetuning.com>

Wow I thought I was the only one who used the Phantom Tollbooth as a bible!

My concern is that the eventual realization of this "big picture" might get crippled if we allow this fear of it to continue to be instilled into the masses. People want XML functionality, and they want it now -- THEY SAY. But when you start really talking to these businesses, they don't even realize what it has the potential to accomplish.

So let's slow down a bit so we don't have any more disasters like it appears this Namespace thing resulted in, and instead, perhaps we should question the overall viability of technologies, such as RDF, or anything else that appears to be dysfunctional until "things that will be defined in another document" miraculously appear.

If we're designing an architecture, it would be pretty silly to build the house on top of screwjacks because somebody's got the nicest rug they want to put in, yes? It seems to me that's where we're headed.

I thought the explanation Eliot gave the other day was very straightforward and enlightening -- and that says a lot about how fundamental some of these issues are -- because let's face it guys -- compared to most of the people in this list, I'm a beginner big-time! Yet recently I've been able to enable all sorts of different XML functionalities at many levels -- and I'm not even getting good at Java yet! I'll admit that I've had to rely on the advice of a comrade or two to get the job done....in fact, alright, I've always had to rely on the advice of a comrade or two -- but only at the mailing-list intensity level that we are conversing right here -- for maybe ten or fifteen minutes -- that doesn't even count as an obstruction in my book.

My point is exactly what Eliot always says -- A lot of this is *NOT* rocket science -- as many would have people believe. If it's ooooh soo complicated, then scaredy-cat developers will have to buy a black box to do everything for them. If the world were to discover just how basic some of this stuff is -- they might never buy a black box again! And would that really be so bad?

:-)

lisa

W. Eliot Kimber wrote:
>
> At 07:45 AM 4/2/98 -0800, Roy Tennant wrote:
> >Sometimes the best solution is NOT the most thorough or general one, but
> >the simplest.
>
> The problem is that the definition of "best" changes with the use scenario.
>
> Cheers,
>
> Eliot
> --
>
> W. Eliot Kimber, Senior Consulting SGML Engineer
> Highland Consulting, a division of ISOGEN International Corp.
> 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004
> www.isogen.com
>
From matthew at praxis.cz Thu Apr 2 19:31:59 1998
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:00:22 2004
Subject: Proposal for src files
Message-ID: <01bd5e5c$dcefd7f0$020b0ac0@xerius>

Eliot,

You make a lot of valid points, but I can't help feeling that your stance is a bit radical. The fact of the matter is, I asked myself the other day exactly how HyTime handles inheritance and spent at least an hour trying to find the answer. Of course, I wasn't entirely sure that I understood at the time and I certainly couldn't explain it to you now without looking it up again. And this despite a pretty solid grip on SGML and OO technologies, as well as a thick stack of SGML-related bookmarks in my browser.

HyTime is an amazing piece of work, but it is never going to attain critical mass in its current form. As you say in your message, this is just fine: it is a big tool for big problems and requires specialist intervention. On the other hand, seen in this light it clearly isn't a general answer to the problem of mixing and matching XML DTDs.

XLL is a great example of the right approach to take in regard to using DTDs as extensible schemata. First of all, it reduces the intimidation factor inherent in the 500-odd pages of dense text that make up the HyTime standard, by extracting only the relevant bits. Moreover, it cuts out some of the complexity that makes HyTime somewhat unapproachable for many people. Nevertheless, it steals many of the good ideas first advanced by HyTime (and TEI linking); I don't think that anyone is suggesting that this work go to waste.

I really like Peter's idea of trying to use the resources of XML-DEV to produce a concrete proposal for extensible XML DTDs. Let's start throwing out some ideas and see where this gets us! Having a HyTime expert (and I see several) onboard would obviously be invaluable.

Many, many years have gone into research and practical work on object-oriented design, much of which was focused directly on the problems inherent in producing extensible schemata. My feeling is that this work could be mapped almost directly onto XML. There was a question recently on the list about sticking an arbitrary element type into the content model of an element type in an existing DTD. The fact of the matter is, this simply doesn't work. On the other hand, DTDs can be designed for extensibility, and derived DTDs can then include digital signatures and the like by extending the content model. This seems to be the idea behind XML-Schema, which would be an excellent starting point for this kind of effort, IMHO. More later...

>Ah. My point is that, if that's all it is, why bother? If you can parse
>DTDs sufficiently to generate XML versions, why bother generating the XML
>version? You've already parsed the thing; the data's already yours to
>manipulate. If you do want the XML version, then go back into the c.t.s
>archives and find Wayne Wohler's posting (I'm sure he posted something
>about it). In any case, it's a pretty obvious design effort and the result
>should be uncontroversial except for some arbitrary design choices (use
>attributes? etc.).
Because there are significant advantages to having a single syntax for both data and metadata. What about the standard DTD syntax makes it _the_ syntax for schema exchange for now and all eternity? I'd love to pull my schema definition into my XML browser, print it with an XSL stylesheet, etc., etc. A standard XML-based schema language would let me do this without having to go through the effort of maintaining an alternate syntax and conversion tools. What I am sensing is that, if there is going to be a standard syntax for schema exchange, people are not convinced that the existing syntax is the right one. I'm a bit confused by your argument, to the extent that you say, on the one hand that the effort is trivial, and on the other that it should not be underestimated. I haven't seen the work by Wayne Wohler, but if I may ask a naive question: isn't this what XML-Schema is all about? Regards, Matthew xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Thu Apr 2 19:42:13 1998 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:00:22 2004 Subject: True XML compliance (Re: Sean McGrath's posts to XML-L) Message-ID: <01bd5e69$7852a0e0$5faedccf@uspppBckman> >>This is exactly what XML compliance should mean. Application vendors >>claiming XML compliance should be obliged to provide import from/export >>to functions and to provide (or even better, to generate ad hoc) an >>accurate DTD describing the schema. Binary file developers would provide >>utilities to do the same thing. It should be easy to generate a DTD from a well formed document. Does any one know of any software out there that does this? 
Or is there even a need for this? If the answer to the first is no and the answer to the second is Yes then I can easily spruce up a small app that I have developed for my own use and make it publicly available. Frank xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Thu Apr 2 20:01:57 1998 From: eliot at isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:00:22 2004 Subject: Proposal for src files Message-ID: <3.0.32.19980402115129.00774800@swbell.net> At 07:29 PM 4/2/98 +0200, Matthew Gertner wrote: >Eliot, > >You make a lot of valid points, but I can't held feeling that your stance is >a bit radical. The fact of the matter is, I asked myself the other day >exactly how HyTime handles inheritance and spent at least an hour trying to >find the answer. Of course, I wasn't entirely sure that I understood at the >time and I certainly couldn't explain it to you now without looking it up >again. And this despite a pretty solid grip on SGML and OO technologies, as >well as a thick stack of SGML-related bookmarks in my browser. That's probably because the architecture facility of ISO/IEC 10744 doesn't *do* inheritance in the way that most people seem to expect. It lets you define hierarchies of semantic derivation, that is "this element is derived from type X defined by architecture Y". It provides a way to validate elements derived from type X against the DTD rules defined by architecture Y. But that's it. It's completely passive and declarative. Architectures *enable* the combination of elements derived from various sources, but the act of combining is done by the human that defines the resulting document. 
There is no automatic process, nor, as far as I can see, can there be in any generally useful and non-trivial way.

It is fruitless to try to draw too many analogies between SGML architectures and object-oriented programming because they are different kinds of things. Architectures are purely about data definition; OO is about active programming.

I'm not sure this makes the issue any clearer, except to say that you cannot understand architectures in terms of object-oriented programming. Architectures are *much* simpler than that: they are purely a mapping from elements in a document to element types defined in a separate specification. *All* the machinery defined in the AFDR annex is about enabling SGML validation of the result of resolving the mapping and providing syntax shortcuts for defining the mapping. But at its heart, it's just a simple syntactic mechanism for mapping from instances to schemas (architecture definitions). Nothing more, nothing less.

But even this simple facility has profound implications and significant potential benefit in terms of making document type definitions manageable, scalable, and re-usable.

Any work done at the schema definition level (that is, the non-DTD syntax used to define schemas) is gravy. If these extensions include more truly object-oriented stuff, so much the better (I think--I'm actually highly skeptical of the true benefit of object-orientation).

Cheers,

E.
--
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
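The "mapping from instances to schemas" described in the message above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the architecture name `myarch`, the table `ARCHITECTURE`, and the helper names are all hypothetical, and real architectural validation works against full DTD content models rather than the simple "allowed children" sets used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical architecture: each architectural element type and the
# architectural types it may contain. Illustrative only; not taken from
# ISO/IEC 10744.
ARCHITECTURE = {
    "section": {"title", "para"},
    "title": set(),
    "para": set(),
}

# Hypothetical attribute that names the architectural form of an element.
ARCH_ATTR = "myarch"


def arch_type(element):
    """Return the architectural type an element is derived from, or None."""
    return element.get(ARCH_ATTR)


def validate(element, errors=None):
    """Check mapped children against the rules for their mapped parent.

    The mapping is purely declarative: the document says which form each
    element derives from, and this function only checks the result.
    """
    if errors is None:
        errors = []
    parent_type = arch_type(element)
    for child in element:
        child_type = arch_type(child)
        if parent_type is not None and child_type is not None:
            if child_type not in ARCHITECTURE.get(parent_type, set()):
                errors.append(
                    f"{child.tag} ({child_type}) is not allowed inside "
                    f"{element.tag} ({parent_type})"
                )
        validate(child, errors)
    return errors


# Element type names are local (French here); only the mapping is shared.
conforming = ET.fromstring(
    '<chapitre myarch="section">'
    '<titre myarch="title">Intro</titre>'
    '<alinea myarch="para">Texte</alinea>'
    '</chapitre>'
)
nonconforming = ET.fromstring(
    '<t myarch="title"><p myarch="para">x</p></t>'
)
```

Calling `validate(conforming)` returns an empty list, while `validate(nonconforming)` reports the misplaced paragraph. Nothing is combined or generated automatically; the author of the document supplies the mapping by hand.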
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wperry at fiduciary.com Thu Apr 2 20:31:20 1998 From: wperry at fiduciary.com (W. E. Perry) Date: Mon Jun 7 17:00:22 2004 Subject: True XML compliance (Re: Sean McGrath's posts to XML-L) References: <01bd5e69$7852a0e0$5faedccf@uspppBckman> Message-ID: <3523D894.62A3EAE9@fiduciary.com> Frank Boumphrey wrote: > It should be easy to generate a DTD from a well formed document. Does any > one know of any software out there that does this? Or is there even a need > for this? > > If the answer to the first is no and the answer to the second is Yes then I > can easily spruce up a small app that I have developed for my own use and > make it publicly available. > > Frank With respect, my interest (and, I believe, Mr. McGrath's) was not in generating DTD's from documents but in defining XML compliance for applications as the ability to accept an XML document plus DTD as input or to generate XML plus DTD as output. This would mean that, e.g., MS Access would have a File/Import menu option to open and parse a specified XML document plus DTD and from that input to create new records in existing tables, create new tables, create and apply new indices, construct new data views. . ., in short, all of the possibilities of SQL, and then some, without the dialect traps. This is (or is a big part of) what I would hope that Microsoft's commitment to XML in Office would mean. Or, to build on Mr. 
McGrath's example, graphics editor/publisher program XYZ, while storing data in its own proprietary binary format, would have, as part of claiming XML compliance, the option to generate as output from one of its own files an XML document and corresponding document-specific DTD. At this level of compliance, XML not only supplants IIOP and COM as 'wire protocols' in heterogeneous distributed environments, but obviates the binary vs. text dispute which we have seen in some recent threads. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bdister at macromedia.com Thu Apr 2 20:51:23 1998 From: bdister at macromedia.com (Brian Dister) Date: Mon Jun 7 17:00:22 2004 Subject: Problem with xmldso.java compiling MSXMLJavaParser 1.8 Message-ID: <3.0.3.32.19970402105302.006b176c@rwspo.macromedia.com> Newbie question. I cannot get the msxml java parser 1.8 to build in VJ++ 1.1. I upgraded to the 1.02.7315 compiler, but then get problems compiling xmldso.java as described below previously and included at the bottom. Anyone know the solution? <<<<<< Even worse, I can't re-compile MSXML in J++. I stuck in the first file -- com.ms.xml.dso.XMLDSO.java. The error messages are all in the same type: Value for argument 'parent' cannot be converted from 'int' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)' The statement in XMLDSO.java is: e = factory.createElement(Element.ELEMENT, XMLRowsetProvider.nameROWSET); It doesn't look right but I don't know how MS can make *.class from it (or I missed something?). Has anyone tried to recompile it and succeeded. 
>>>>

--------------------Configuration: XMLJavaParser - Java Virtual Machine Release--------------------
Compiling...
Microsoft (R) Visual J++ Compiler Version 1.02.7315
Copyright (C) Microsoft Corp 1996-1997. All rights reserved.
classes\com\ms\xml\parser\Parser.java(612,20) : warning J5014: 'boolean isSpace(char)' has been deprecated by the author of 'java.lang.Character'
classes\com\ms\xml\parser\Parser.java(188,41) : warning J5014: 'PrintStream(OutputStream)' has been deprecated by the author of 'java.io.PrintStream'
classes\com\ms\xml\dso\XMLDSO.java(638,30) : warning J5014: 'Dimension size()' has been deprecated by the author of 'java.awt.Component'
classes\com\ms\xml\dso\XMLDSO.java(759,85) : error J0080: Value for argument 'parent' cannot be converted from 'int' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(759,85) : error J0080: Value for argument 'type' cannot be converted from 'Name' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(759,85) : error J0077: Not enough arguments for method 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(763,85) : error J0080: Value for argument 'parent' cannot be converted from 'int' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(763,85) : error J0080: Value for argument 'type' cannot be converted from 'Name' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(763,85) : error J0077: Not enough arguments for method 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(1066,72) : error J0080: Value for argument 'parent' cannot be converted from 'int' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(1066,72) : error J0080: Value for argument 'type' cannot be converted from 'Name' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(1066,72) : error J0077: Not enough arguments for method 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(1072,82) : error J0080: Value for argument 'parent' cannot be converted from 'int' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(1072,82) : error J0080: Value for argument 'type' cannot be converted from 'Object' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(1072,82) : error J0077: Not enough arguments for method 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(917,73) : error J0080: Value for argument 'parent' cannot be converted from 'int' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(917,73) : error J0080: Value for argument 'type' cannot be converted from 'Name' in call to 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
classes\com\ms\xml\dso\XMLDSO.java(917,73) : error J0077: Not enough arguments for method 'Element ElementFactory.createElement(Element parent, int type, Name tag, String text)'
Error executing jvc.exe.

XMLJavaParser - 15 error(s), 3 warning(s)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu Apr 2 21:29:54 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:22 2004 Subject: Proposal for src files In-Reply-To: <3.0.32.19980402115129.00774800@swbell.net> References: <3.0.32.19980402115129.00774800@swbell.net> Message-ID: <199804021929.OAA00509@unready.microstar.com> W. Eliot Kimber writes: > That's probably because the architecture facility of ISO/IEC 10744 > doesn't *do* inheritance in the way that most people seem to > expect. It lets you define hierarchies of semantic derivation, that > is "this element is derived from type X defined by architecture > Y". It provides a way to validate elements derived from type X > against the DTD rules defined by architecture Y. But that's > it. It's completely passive and declarative. Or, to put it more simply, architectural forms let you say that "Y is a kind of X", while namespaces let you say nothing but "here's X". (Read on only if you're interested in the implications of this difference...) First, imagine that "X" is an English name, while "Y" is a Korean name. If the Korean author is using a Korean DTD to write Korean documents, why should we force her to use element types with English names just because a namespace happens to have been created in, say, the Silicon Valley? Secondly, what if Y is both a kind of X _and_ a kind of Z? Finally, what if both W _and_ Y are a kind of Z? 
Namespaces seem simple at first, and they are a reasonable solution for some very specific problems, but they fail entirely in the first two situations, and require an awkward work-around in the third (using two or more different namespace declarations and prefixes with the same URI). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Apr 2 22:08:33 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:22 2004 Subject: "Inheritance considered harmful" In-Reply-To: <3.0.32.19980402115129.00774800@swbell.net> Message-ID: On Thu, 2 Apr 1998, W. Eliot Kimber wrote: > > That's probably because the architecture facility of ISO/IEC 10744 doesn't > *do* inheritance in the way that most people seem to expect. That's right. That's why people get so confused about them. The word inheritance is inherently misleading when applied to architectural forms. Architectural forms do subtyping, not inheritance. Inheritance is about "getting stuff for free" (e.g. code, declarations, fields). Subtyping is about *fulfilling a particular role* (perhaps through a manual construction of an appropriate "interface" (in this case a content model)). Architectural forms allow you to specify an interface that must be fulfilled and declare conformance to that interface. It does not allow you to "get code for free" (i.e. markup declarations). When I inherit from my father, I get his money without doing any work. 
I get to live in the same house he lived in without redoing everything he did to get it (not that I've had this experience...yet). But when I subclass from the class "husband" I agree to do things like be faithful and caring and so forth. They are very different things. I have to put forth effort to fulfill that role.

It's easy to get them confused, because there are benefits that accrue from fulfilling any role, and you may think that you are "inheriting" those benefits, but really, you are just getting them because you fulfilled the terms of the "contract." In SGML terms, when you subclass from an architectural DTD you get the benefit of all of the great software that has been written to handle that DTD. But you didn't inherit it -- it's just your reward for fulfilling the contract.

The word inheritance is vaguely defined, because it depends a lot on context. Parameter entities allow us to "inherit" declarations that might have been created for some other document type. So SGML has inheritance at some very large grain, but what people usually mean when they ask for inheritance is fine-grained element/attribute level inheritance, which is more tricky (and perhaps when push comes to shove not as valuable as it might seem -- there are other ways to achieve reuse).

Subtyping, on the other hand, is quite mathematically defined, and it seems clear to me that this is what people mean when they say that architectural forms allow inheritance.

Here is subtyping in a nutshell: A type is a set of objects. A subtype is a subset of that set of objects. An architectural form describes a set of elements or documents that can be transformed into a particular language defined by the architectural DTD. It defines an "element" or "document type" in the philosophical sense. A derived DTD or archform describes a subset of those elements. It defines an "element" or "document" subtype. Using the word inheritance to describe this relationship can only obfuscate things.
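The distinction drawn above between "getting stuff for free" and "fulfilling a role" can be put in code. The following Python sketch is only an analogy, not anything from the HyTime standard: `Paragraph` stands in for an architectural form (an interface to be fulfilled), and every class and function name is hypothetical.

```python
from typing import Protocol


class Paragraph(Protocol):
    """The 'architectural form': an interface a conforming type must fulfil."""

    def text(self) -> str: ...


class BaseNote:
    """A base class whose method subclasses receive without any work."""

    def text(self) -> str:
        return "note"


class InheritedNote(BaseNote):
    """Inheritance: text() comes for free from BaseNote."""


class Absatz:
    """Subtyping: no shared base class; conformance is built by hand."""

    def __init__(self, inhalt: str) -> None:
        self.inhalt = inhalt

    def text(self) -> str:
        # The author does the work of fulfilling the role.
        return self.inhalt


def render(p: Paragraph) -> str:
    """Software written against the interface serves any conforming type."""
    return f"<p>{p.text()}</p>"
```

`render(Absatz("hallo"))` works even though `Absatz` inherits nothing: it receives the benefit of existing software only because it fulfils the contract, which is the situation of a DTD derived from an architectural DTD.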
This sort of subtyping relationship exists in hundreds of places in computer science, the physical world and logic and it is *not described as inheritance*.

Here's a simple example. Python code can be compiled to JVM bytecodes. This is really cool because Python programs can be shipped across the net like Java, but Python is much easier to program in, and much faster for prototyping. Does Python inherit the ability to be shipped across the net from the JVM bytecode specification? No. It conforms (after a transformation) to the JVM bytecode specification format and thus it gets JVM execution "for free." Does it subtype the JVM bytecode format? Yes, after a transformation it does. The set of all Python programs is a subset of the set of all programs that can be transformed into something that will run on the JVM.

Paul Prescod

From eliot at isogen.com Thu Apr 2 22:22:27 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:22 2004
Subject: "Inheritance considered harmful"
Message-ID: <3.0.32.19980402141733.00713cd8@swbell.net>

At 03:08 PM 4/2/98 -0500, Paul Prescod wrote:
>On Thu, 2 Apr 1998, W. Eliot Kimber wrote:
>>
>> That's probably because the architecture facility of ISO/IEC 10744 doesn't
>> *do* inheritance in the way that most people seem to expect.
>
>That's right. That's why people get so confused about them. The word
>inheritance is inherently misleading when applied to architectural forms.
>
>Architectural forms do subtyping, not inheritance. Inheritance is about
>"getting stuff for free" (e.g. code, declarations, fields).
Subtyping is >about *fulfilling a particular role* (perhaps through a manual >construction of an appropriate "interface" (in this case a content >model)). Architectural forms allow you to specify an interface that must >be fulfilled and declare conformance to that interface. It does not allow >you to "get code for free" (i.e. markup declarations). Paul has made clear what I was feebly trying to say: thanks Paul. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Thu Apr 2 23:00:38 1998 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:00:22 2004 Subject: why do namespaces have such a bad rep [2] In-Reply-To: <35230A09.58739027@mecom.mixx.de> (message from james anderson on Thu, 02 Apr 1998 05:46:45 +0200) References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com> <199803312343.SAA01404@bruno.techno.com> <3521BC55.6014F2E7@mecom.mixx.de> <199804011625.LAA00802@bruno.techno.com> <35230A09.58739027@mecom.mixx.de> Message-ID: <199804021955.OAA00839@bruno.techno.com> [James Anderson :] > first, an aside for those who may wonder why i pursue this. my > concern is that i do not wish to have to implement support for > enabling architectures just in order to contend with name > collisions. on principle it would be wrong. practically it would be > a waste of time. (1) You don't have to implement it. It's already implemented. It's already free, in source form, too, without restrictions of any kind, except that you can't use James Clark's name in certain ways. (2) If you're determined to argue this point on the narrowest possible grounds (even though those are not the grounds I'm arguing on), how's this? "Enabling architectures, among many other things, is a way of handling name collisions. If the sole purpose of namespaces is to avoid name collisions, and if you don't have to implement enabling architectures, then on principle it's a waste of time to implement namespaces. Q.E.D." 
(3) Enabling architectures would enable RDF to meet real requirements and desiderata that RDF can't satisfy today because it doesn't have enabling architectures. The simplifying assumption that namespaces represent ignores and renders useless the most powerful aspects of the schemas that represent architectures (i.e., DTDs). Among other things, RDF is supposed to support electronic commerce. Electronic commerce is a limitlessly hard problem in data interchange and management. It's not a good idea to discard or cripple the tools we already have for dealing with hard problems in data interchange and management. The namespace solution and the enabling architectures solution are *not* orthogonal to each other, unless you restrict your vision to extremely narrow and purely implementational concerns. If you're sensitive to the economics of mindshare, or to the very real cost of having two things, one less powerful than the other, where the one more powerful thing alone will do the job, you will see the connection. Let me put it even more starkly. It makes no sense, in a small living room (the short supply of mindshare and attention), to replace a comfortable sofa (enabling architectures) with a rude wooden bench (namespaces). It doesn't matter how many times you say that the wooden bench is perfectly good for the purpose for which it was designed, and it doesn't matter how many times you point to the bench's original specifications when people complain that, in practice, the bench is sometimes very uncomfortable and a sofa would be much more useful in a wider range of everyday living situations. > i use namespaces. i have implemented namespaces. I suspect yours is a very fine implementation, too, and, believe me, I sympathize with your situation. 
My company has done several implementations of nascent standards, only to throw them away and start over when it became clear that any tendency on our part to cling to our investment would stand in the way of progress in the larger context. Our HyQ implementation is an example. Perhaps you remember HyQ, the HyTime query/addressing language that was replaced by SDQL -- a replacement that I firmly and immediately supported, despite the emotional stress of abandoning a pile of work and a substantial investment. (Don't get me wrong; I've never regretted it.)

> you're right there. one thing i still need to understand is how i would, for
> example, define things like the ArcCFC entities for multiple inherited
> architectures (ISO/IEC 10744 p230), or handle possible name ambiguity between
> multiple architectures wrt entities used to specify section inclusion (ISO/IEC
> 10744 p223)
> (yes, i lack experience there. i would presume one helps oneself to dotted names,
> but that's just conjecture)

The parameter entity namespace collision problem can occur only when you're syntactically inserting a DTD into the architecture you're creating. The potential for parameter entity name collisions, in such a scenario, is a feature of (or, one could conceivably believe, a bug in) SGML's DTD syntax. Fortunately, such inclusion is never necessary in order to obtain the full benefit of enabling architectures; you can always just map everything you need in the usual fashion, by declaring element types in your inheriting DTD that conform to architectural forms (the element types in the inherited-from DTD), and thus refer to, rather than include wholesale, the enabling architecture's DTD.

If your enabling architecture has ArcCFC parameter entities, simply expand them in the normal SGML way, see what effect they have on the enabling architecture's effective DTD, and use the result as your enabling architecture.
You always have the option of allowing whatever was declared as context-free in the enabling architecture to be used with equal lack of constraint in your inheriting architecture. Or, you can constrain the context-free elements to appear only in certain contexts; it's up to you. Whether you choose to declare context-free elements by means of a parameter entity whose name happens to be %ArcCFC; is entirely your business. (All parameter entity names in your inheriting architecture are entirely your business. In fact, even when inheriting any number of enabling architectures, means are provided to allow you to prevent them from affecting any of the inheriting architecture's namespaces in any way.)

The marked section modularization technique used in the 1997 HyTime architecture has nothing to do with the concept of enabling architectures. The marked section stuff in the HyTime architecture is just a technique that helped us get the HyTime standard published in a way that allowed the whole HyTime architecture to be self-documentingly and self-realizingly modular. It is a pretty cute, if somewhat Byzantine, technique, but it seems obvious, in retrospect, that the modularity of the HyTime architecture would have been better and more neatly achieved by making each of its modules a distinct enabling architecture. Unfortunately, that realization didn't dawn on us in time to meet the 1997 edition's publication deadline. *Sigh* Live and learn. Sorry for any confusion it causes.

(The concept of enabling architectures has nothing to do with the HyTime architecture, either, except that (1) when used as such, the HyTime architecture is just an example of an enabling architecture, and (2) both the HyTime architecture and the enabling architectures paradigm are standardized in a single ISO book under a single ISO standard number, viz., 10744:1997.)

Anyway, please understand that I'm not trying to defend particular syntaxes.
By speaking out against XML namespaces, I'm attempting to conserve the public's mindspace for underlying concepts and intellectual tools. With the minimum set of the most protean possible concepts and intellectual tools, we may hope to grope our way toward more intuitive and more convenient schema syntaxes. On the other hand, with a wildly overlapping set of specialized, narrow-gauge tools (XML namespaces leap to mind as an example), I doubt we can ever get to a place where we have a really good, fully generalized schema syntax. In that case, I think we'd better stick to SGML's DTD syntax, warts and all, because it is known to work and, as an existing international and W3C standard, we won't have to fight over it. We can just keep on kludging it up with things like colon-ized namespaces and architectural form attributes. To all those who have read this whole harangue: you have my thanks and sympathy. -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Thu Apr 2 23:24:57 1998 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:00:22 2004 Subject: "Inheritance considered harmful" In-Reply-To: (message from Paul Prescod on Thu, 2 Apr 1998 15:08:02 -0500 (EST)) References: Message-ID: <199804022020.PAA00896@bruno.techno.com> [Paul Prescod :] > That's right. That's why people get so confused about them. 
The word > inheritance is inherently misleading when applied to architectural > forms. Paul is absolutely right, but I'm still not going to take his advice. For several months last year, I deliberately stopped using the word "inherit", as in "inheriting architecture", "inherited-from architecture", etc. Instead, I very carefully used the words "derived" for the inheriting architecture and "enabling" for the inherited architecture. This is the vocabulary used in the standard. Ultimately, however, I reluctantly gave up on precision vocabulary because nobody understood what I was talking about, except for people whom I had no need to reach because they already understood the concepts. In almost all rhetorical situations, I have to use vocabulary that may be, strictly speaking, misleading, and yet provides some glimmer of understanding to the HyTime-inexperienced. I'm back to "inherited" and "inheriting", and I never even try to use "enabling" and "derived" any more. I'm open to other suggestions, though. Got any? -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomw at action.cnchost.com Fri Apr 3 00:05:43 1998 From: tomw at action.cnchost.com (Tom Williamson) Date: Mon Jun 7 17:00:22 2004 Subject: List server configuration change proposal Message-ID: <35240BA0.A2F39544@action.cnchost.com> I propose that the list server which handles this list be configured to identify the sender of the messages it mails as "XML-DEV List" or something similar, so that messages from the list can be clearly identified and grouped by the recipients. Anyone else feel this way? Tom Williamson xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Apr 3 01:22:22 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:22 2004 Subject: Inheritance and other buzzwords Message-ID: <3.0.32.19980402152032.00bae390@pop.intergate.bc.ca> I think that altogether too much is being heaped up here on too narrow a base. What does XML do? It allows documents to be subdivided into parts, and for those parts to be given names. As an additional benefit, it allows the parts' names and contents, when they are textual, to contain characters from around the world. The trade-off is that the parts have to be arranged in a hierarchically nested fashion. 
XML does this in such a way that systems can agree on where the parts are, what they are named, and what their contents are, without having to share any software. That's all!

The current namespace proposal adds one level of indirection to the names we give document components, and includes a technique for ensuring that the names are unique across the universe of the Internet. That's all!

It seems obvious to me that there are a variety of ways that DTDs could usefully be made namespace-aware; this does *not* mean that I think we know how to write down rules for automatic schema synthesis; I still think humans should design documents. I want such a designer to be able to say "My namespace is ID'ed by URI1, and my X element has to start with an A element from the URI2 namespace, followed by one of my own Z elements, followed by..." you get the idea. Then the software can sort out the prefixes and do perfectly good validation. This could probably be done very nicely in front-end filters and allow the use of current 8879-based technology. In fact, why doesn't someone on this list write such a preprocessor? I think conventional DTDs, with the additional leverage of universal names, would be damn useful.

I think it's a base requirement that any document design language of the future deal with qualified (i.e. universal) names. Any such facility that seriously gets in the way of using this kind of name just won't fly in the Web milieu. The idea of trying to get serious mileage out of any kind of a name that can't be mapped to a URI is something that will work only in a closed inward-looking shop. Which there are a lot of out there, but we're supposed to be designing technology for the Internet.

I do not understand architectural forms well enough to have an opinion as to how well they co-exist with universal names. But Eliot doesn't seem too worried by it.

Cheers,
Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592
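The kind of check described in the message above ("my X element has to start with an A element from the URI2 namespace, followed by one of my own Z elements") might be sketched as follows. This is a rough illustration, not the preprocessor proposed there: the URIs, the `RULES` table, and the function names are hypothetical, and the sketch relies on the `xmlns` attribute syntax that Python's ElementTree understands rather than on the draft namespace syntax under discussion in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs standing in for "URI1" and "URI2".
URI1 = "http://example.org/mine"
URI2 = "http://example.org/theirs"

# Content rule keyed by universal name (URI, local name): an X from URI1
# must contain exactly an A from URI2 followed by a Z from URI1.
RULES = {
    (URI1, "X"): [(URI2, "A"), (URI1, "Z")],
}


def universal_name(tag):
    """Split ElementTree's '{uri}local' tag form into a (uri, local) pair."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return (uri, local)
    return ("", tag)


def check(element):
    """Return True if the element and all its descendants satisfy RULES."""
    expected = RULES.get(universal_name(element.tag))
    if expected is not None:
        actual = [universal_name(child.tag) for child in element]
        if actual != expected:
            return False
    return all(check(child) for child in element)
```

Because the rules are stated in terms of universal names, the prefixes a document happens to choose do not matter: `m:X` with `t:A`, or a default-namespace `X` with `q:A`, validate identically as long as the prefixes resolve to the same URIs.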
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From eliot at isogen.com Fri Apr 3 01:59:54 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:22 2004
Subject: Inheritance and other buzzwords
Message-ID: <3.0.32.19980402175702.00772a88@swbell.net>

At 03:20 PM 4/2/98 -0800, Tim Bray wrote:
>I do not understand architectural forms well enough to have an opinion
>as to how well they co-exist with universal names. But Eliot doesn't
>seem too worried by it.

They are orthogonal. Architectures define instance-to-type mappings. Name spaces ensure that instance or type names are universally unique.

For example, say I want to define a generally-usable set of elements. In SGML architecture terms, I would define them by creating an architectural DTD that declared the element types and attributes:

Note my use of name spaces to ensure that these element type names are unique. This has no value to these declarations as an architectural DTD (because the element types are inherently unique, since an architecture establishes its own element type name space). But it might be useful if people want to syntactically combine these declarations with others in individual documents or in other architectural DTDs, which is not something architectures explicitly enable (or even encourage) but that is possible because architecture DTDs use normal DTD syntax.

Within a document, I can either use the colonized names or not.
If I use the colonized names, then an architecture-aware processor will map the instance names to the architectural names automatically, making the instance indistinguishable from a name-spacified document except by the presence of an architecture use declaration:

This is a kimber-defined paragraph

Here I've enabled name-space aware processing (whatever that means) and architecture-based processing. Both appear to use the same syntactic mechanism, but in fact the architecture-based processing is just plain string matching--there's no awareness of "kimarch:" as being a prefix. In other words, architecture processing doesn't care how the element type names (or attribute names) are spelled. Of course, we could enhance architectural automapping to recognize prefixes, which is probably a useful thing to do.

To use my own element type names, I use the attribute-based remapping mechanism instead of colonized names, which is exactly the same as that defined by XLink:

This is a kimber-defined paragraph

The only change here is that I provide a non-colonized version of the architectural DTD (I could have used the colonized version but it would look really stupid to repeat "kimarch" for each mapping):

The architectural mapping is the same. Note that as shown here the syntactic result is only slightly different from the name-space version ('kimarch="name"' instead of 'kimarch:name', a difference of two characters). Of course, it doesn't, by itself, serve to make the element type name universally unique. But since I see no value in doing so, I don't care. If I did care, I'd use name spaces.

Finally, I have a document somewhere that defines what the element types in the kimarch architecture mean:

The Kimber Architecture

The architecture name "kimarch" can be used as a name-space prefix in name-spacified versions of the kimarch architectural DTD if desired.

kimdoc

A document

para

A paragraph.

The existence of an architecture definition document completes the definition of the architecture (at least in a trivial, if uninteresting, way). This definition document could be the same one used to describe the names in the name space, don't know. In any case, it serves to define a set of element types and their meanings, which is what I think a schema is. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
From john at datachannel.com Fri Apr 3 02:14:35 1998
From: john at datachannel.com (John Tigue)
Date: Mon Jun 7 17:00:23 2004
Subject: "Inheritance considered harmful"
Message-ID: <005601bd5e96$086983c0$720a1bac@gigue.datachannel.com>

A note and a question.

Note though that in Java (and seemingly in COM+) there are two types of inheritance. There are the notions of superInterface and superClass. Class inheritance is the traditional OO concept of "getting code for free."

The following is a definition of a Java interface with no superInterface inheritance going on:

    interface Somethingable {
        public abstract void aMethod( long myParam );
        public abstract int bMethod();
    }

which is equivalent to:

    interface Somethingable {
        void aMethod( long myParam );
        int bMethod();
    }

All methods in Java interfaces are public and abstract. There is never any code associated. A class which says it implements this interface must define code for all the methods in the interface, and must declare them public, e.g.:

    class Somethinger implements Somethingable {
        public void aMethod( long myParam ) {
            // do something with the myParam
        }
        public int bMethod() {
            return 7;
        }
    }

The interface inheritance happens when an interface extends another interface. Here is an interface which inherits the above interface:

    interface EvenMorable extends Somethingable {
        boolean dMethod( int someParam );
    }

A class which says it implements the interface EvenMorable must have method bodies for aMethod, bMethod and dMethod (or declare that it is abstract and so must be extended by some other class (class inheritance) which implements the unsatisfied methods of the interface).
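[Editorial sketch, not part of the original message.] The interfaces in the note above can be collected into one compilable file to show both kinds of inheritance side by side. This is only a minimal illustration: `InterfaceDemo`, `PartialMorable` and `FullMorable` are hypothetical names introduced here, and the method bodies are arbitrary placeholders.

```java
public class InterfaceDemo {

    // Interface from the note: its methods are implicitly public and abstract.
    interface Somethingable {
        void aMethod(long myParam);
        int bMethod();
    }

    // Interface inheritance: EvenMorable adds dMethod to the inherited contract.
    interface EvenMorable extends Somethingable {
        boolean dMethod(int someParam);
    }

    // Hypothetical abstract class: it fulfils part of the contract and
    // leaves dMethod unsatisfied, so it cannot be instantiated.
    static abstract class PartialMorable implements EvenMorable {
        long lastParam;

        public void aMethod(long myParam) {
            lastParam = myParam; // placeholder behaviour
        }

        public int bMethod() {
            return 7;
        }
    }

    // Class inheritance: FullMorable gets aMethod and bMethod "for free"
    // from PartialMorable and supplies the one missing method.
    static class FullMorable extends PartialMorable {
        public boolean dMethod(int someParam) {
            return someParam % 2 == 0; // placeholder behaviour
        }
    }

    public static void main(String[] args) {
        EvenMorable m = new FullMorable();
        m.aMethod(42L);
        // Subtyping: an EvenMorable can stand in wherever a Somethingable is expected.
        Somethingable s = m;
        System.out.println(s.bMethod());
        System.out.println(m.dMethod(4));
    }
}
```

The split between `PartialMorable` and `FullMorable` is just one way to show the "declare that it is abstract" case; a single concrete class implementing all three methods would satisfy `EvenMorable` equally well.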
The question: While class inheritance does not map to Architectural forms, does interface inheritance map to Architectural forms? -----Original Message----- From: W. Eliot Kimber To: xml-dev@ic.ac.uk Date: Thursday, April 02, 1998 12:42 PM Subject: Re: "Inheritance considered harmful" >At 03:08 PM 4/2/98 -0500, Paul Prescod wrote: >>On Thu, 2 Apr 1998, W. Eliot Kimber wrote: >>> >>> That's probably because the architecture facility of ISO/IEC 10744 doesn't >>> *do* inheritance in the way that most people seem to expect. >> >>That's right. That's why people get so confused about them. The word >>inheritance is inherently misleading when applied to architectural forms. >> >>Architectural forms do subtyping, not inheritance. Inheritance is about >>"getting stuff for free" (e.g. code, declarations, fields). Subtyping is >>about *fulfilling a particular role* (perhaps through a manual >>construction of an appropriate "interface" (in this case a content >>model)). Architectural forms allow you to specify an interface that must >>be fulfilled and declare conformance to that interface. It does not allow >>you to "get code for free" (i.e. markup declarations). > >Paul has made clear what I was feebly trying to say: thanks Paul. > >Cheers, > >E. >-- >
>W. Eliot Kimber, Senior Consulting SGML Engineer >Highland Consulting, a division of ISOGEN International Corp. >2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 >www.isogen.com >
From ak117 at freenet.carleton.ca Fri Apr 3 03:16:11 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:23 2004
Subject: List server configuration change proposal
In-Reply-To: <35240BA0.A2F39544@action.cnchost.com>
References: <35240BA0.A2F39544@action.cnchost.com>
Message-ID: <199804030113.UAA00428@unready.microstar.com>

Tom Williamson writes:
> I propose that the list server which handles this list be configured to
> identify the sender of the messages it mails as "XML-DEV List" or
> something similar, so that messages from the list can be clearly
> identified and grouped by the recipients.
>
> Anyone else feel this way?

When I was bothering to filter, I just used the Sender field "owner-xml-dev@ic.ac.uk", which is always the same.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Fri Apr 3 03:42:41 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:23 2004 Subject: the death of the black box References: <3.0.32.19980402100252.00ebe6ec@swbell.net> <3523C374.8EE1FDE3@finetuning.com> Message-ID: <35243E28.130D@hiwaay.net> Lisa Rein wrote: > My point is exactly what Eliot always says -- A lot of this is *NOT* > rocket science -- as many would have people believe. If it's ooooh soo > complicated, then scardie-cat developers will have to buy a black box to > do everything for them. If the world were to discover just how basic > some of this stuff is -- they might never buy a black box again! > > And would that really be so bad? :-) > > lisa No. After all these years, that would be grand. I agree. It is not rocket science, but neither is scoring music if you are a musician. In this case, because the root of the web languages is HTML, there is an entry level and that is what makes the web go. At this time, most companies who want to build an Intranet have to do it themselves. To afford to own an Intranet, it has to be organic in its growth if not its design. The design should be simple and it should be straightforward to apply by any discipline of the business. Otherwise, the businessman has to dedicate personnel directly to the care and maintenance of multiple domains. In effect, what one wants is for each business domain to add its rules to the framework in business time. As the business is practiced, the rules emerge inside the basic navigational structures the employees build to do their jobs. NOTE: As Linux proves, egoboo works. 
Still, the framework in which the structures emerge typically IS designed by specialists. It is grown by the others. As the browser is emerging as the dominant interface technology, that requires a lot of skill retooling, particularly in relational database designs. For a simple example, look at the design of commercial relational systems that while excellent for developing QBE interfaces and involvements, do not take advantage of the full screen. How should this be realized in a document interface where the relational DB is still the principal server? The complexity of this has to be subsumed in the tools, and I am reasonably convinced that this requires the black box somewhere in the toolkit. SGML/XML markup technology can't get you out of the box. It can make the box a fairer place, a more truthful place, and an easier place to do business, but it is still, for the average bear, slightly harder than they can do well without *good, low-cost* tools. A significant contribution of the XML community to the markup community is that the second condition will finally be met. Cheers, Len Bullard Intergraph Public Safety xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Fri Apr 3 03:48:18 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:23 2004 Subject: Proposal for src files References: <3.0.32.19980402115129.00774800@swbell.net> Message-ID: <35243F7E.1643@hiwaay.net> W. Eliot Kimber wrote: > > I think--I'm actually highly > skeptical of the true benefit of object-orientation. Objects are a good way to organize and reuse functions. From a declarative perspective, what is the true benefit? 
len

From James.Anderson at mecom.mixx.de Fri Apr 3 03:51:07 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:23 2004
Subject: why do namespaces have such a bad rep [3]
References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com> <199803312343.SAA01404@bruno.techno.com> <3521BC55.6014F2E7@mecom.mixx.de> <199804011625.LAA00802@bruno.techno.com> <35230A09.58739027@mecom.mixx.de> <199804021955.OAA00839@bruno.techno.com>
Message-ID: <35244059.545B230B@mecom.mixx.de>

hello again.

Steven R. Newcomb wrote:
> Let me put it even more starkly. It makes no sense, in a small living
> room (the short supply of mindshare and attention), to replace a
> comfortable sofa (enabling architectures) with a rude wooden bench
> (namespaces). It doesn't matter how many times you say that the
> wooden bench is perfectly good for the purpose for which it was
> designed, and it doesn't matter how many times you point to the
> bench's original specifications when people complain that, in
> practice, the bench is sometimes very uncomfortable and a sofa would
> be much more useful in a wider range of everyday living situations.
>

perhaps you have misunderstood me. my living room is big enough for me to have a couch and a coffee table. i could sit on the coffee table. i don't. i could balance my coffee cup on my lap when i sit on the couch. i don't do that either. and no, i don't have a couch with drive-in movie arm rests. they would make it hard to do some of the things i bought the couch for.
and when someone shows up with a birthday cake, i'm happy not to have to send them out to the kitchen to cut it up and put it in coffee cups, 'cause that's all we would have been able to balance on the armrests. and when i didn't yet have the couch, but i'd picked up the coffee table real cheap at a yard sale, it was still nice not to have to sit with my cake in my lap when i was sitting on pillows. and now that i have the couch, there's no reason to throw the coffee table out. it turns out that, when company comes they like not having to eat out of their laps just as much as i do.

i find it easier to make decisions when separable issues are handled separately. so i have to speak out for namespaces, since, in my experience, they conserve "mindshare and attention".

to make my concern concrete, let's take up again the issue behind my question about ArcCFC. you responded, that,

> entity names in your inheriting architecture are entirely your
> business. In fact, even when inheriting any number of enabling
> architectures, means are provided to allow you to prevent them from
> affecting any of the inheriting architecture's namespaces in any way.

i'm concerned about the case where i want to get at them, not be isolated from them. i would expect to have to do something like the following. suppose that i have two base architectures, each with an element "name". i would like to derive an architecture to comprise descriptions of both sorts. nothing new.

while the provisions of enabling architectures make it possible to use architectural form attributes to disambiguate the "name" element names by creating a 2-d namespace for element names, and to disambiguate the "address-form" attribute names by using the remapper attribute to, likewise, create a 2-d namespace for attribute names, and then to map them onto unique positions within a 1-d namespace, i have not found the provision to enable declaring distinct values for the entities.
how would one distinguish %name-content from %name-content in a derived architecture in order to specify the respective entity declarations? maybe i don't need to. i thought i would. if the mechanism for entities does exist, then i'll be using three distinct constructs to accomplish the same thing. (yes, i'm ignoring the translation facilities of attribute mapping here - as i always seem to say "they're orthogonal") which does not recommend the approach as a paradigm for the efficient use of "mindshare and attention", but at least i can be confident that i will be able to do what i need to in the existing e-a implementations. bye for now, xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecom.mixx.de Fri Apr 3 03:51:12 1998 From: James.Anderson at mecom.mixx.de (james anderson) Date: Mon Jun 7 17:00:23 2004 Subject: Namespaces in XML: 3.1 the example [3] References: <352182A3.3CBE5701@mecom.mixx.de> <199804011212.HAA00271@unready.microstar.com> Message-ID: <3524407F.DB1B6EA3@mecom.mixx.de> hello again, the space in which a symbol resides says nothing about the space its value is in. thus it should be possible to place the "name" in a space independent of the uri / space associated with the entity to which it is bound in its declaration. David Megginson wrote: > james anderson writes: > > > (or rather, it's almost possible: there's a small problem, that the > > wd-standard precludes qualified entity names. why?) > > The namespace spec allows element type names, attribute names, and PI > targets to be associated with a URI. (External) entity and notation > names are already associated with a URI. 
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Fri Apr 3 04:24:20 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:00:23 2004 Subject: the death of the black box References: <3.0.32.19980402100252.00ebe6ec@swbell.net> <3523C374.8EE1FDE3@finetuning.com> <35243E28.130D@hiwaay.net> Message-ID: <35244D42.678F857C@finetuning.com> What if, instead of a browser, sets of browser components were made available, that could be chosen from checkboxes on a form, and then thrown into an architecture, per the particular needs of each "surf" on the web? It's much less of a black box. And it would be harder for only one or two companies to have a monopoly on that box. lisa len bullard wrote: > > Lisa Rein wrote: > > > My point is exactly what Eliot always says -- A lot of this is *NOT* > > rocket science -- as many would have people believe. If it's ooooh soo > > complicated, then scardie-cat developers will have to buy a black box to > > do everything for them. If the world were to discover just how basic > > some of this stuff is -- they might never buy a black box again! > > > > And would that really be so bad? :-) > > > > lisa > > No. After all these years, that would be grand. > > I agree. It is not rocket science, but neither > is scoring music if you are a musician. In this > case, because the root of the web languages is > HTML, there is an entry level and that is what > makes the web go. > > At this time, most companies who want to build an > Intranet have to do it themselves. To afford to > own an Intranet, it has to be organic in its > growth if not its design. 
The design should be > simple and it should be straightforward to apply > by any discipline of the business. Otherwise, > the businessman has to dedicate personnel > directly to the care and maintenance of multiple > domains. In effect, what one wants is for > each business domain to add its rules to the > framework in business time. As > the business is practiced, the rules emerge > inside the basic navigational structures > the employees build to do their jobs. > > NOTE: As Linux proves, egoboo works. > Still, the framework in which the > structures emerge typically IS designed > by specialists. It is grown by the others. > > As the browser is emerging as the dominant interface > technology, that requires a lot of skill > retooling, particularly in relational > database designs. For a simple example, > look at the design of commercial relational > systems that while excellent for developing > QBE interfaces and involvements, do not take > advantage of the full screen. How should this > be realized in a document interface where > the relational DB is still the principal > server? > > The complexity of this has to be subsumed > in the tools, and I am reasonably convinced > that this requires the black box somewhere > in the toolkit. SGML/XML markup technology > can't get you out of the box. It can make > the box a fairer place, a more truthful place, > and an easier place to do business, but it > is still, for the average bear, slightly > harder than they can do well without > *good, low-cost* tools. A significant contribution > of the XML community to the markup community > is that the second condition will finally be met. > > Cheers, > > Len Bullard > Intergraph Public Safety xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From James.Anderson at mecom.mixx.de Fri Apr 3 05:03:25 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:23 2004
Subject: "Inheritance considered harmful"
References: <005601bd5e96$086983c0$720a1bac@gigue.datachannel.com>
Message-ID: <352451B9.46AD09EE@mecom.mixx.de>

John Tigue wrote:
> A note and a question.
>
> Note though that in Java (and seemingly in COM+) there are two types of
> inheritance. There are the notions of superInterface and superClass.

the first is subtyping. the second is inheritance. they are kept distinct because the designers considered the multiple inheritance to be an "evil", so they permit only multiple subtyping.

> ...
>
> The question: While class inheritance does not map to Architectural forms,
> does interface inheritance map to Architectural forms?
>

as Paul Prescod noted, yes, the aspect which does subtyping is analogous to interfaces. on the other hand, i had understood architectural attribute definitions to apply to the derived element and thus to be effecting inheritance. so both notions map; the first to "architectural forms" and the second to "architectural attributes".

> >>> That's probably because the architecture facility of ISO/IEC 10744 doesn't
> >>> *do* inheritance in the way that most people seem to expect.
> >>
> >>That's right. That's why people get so confused about them. The word
> >>inheritance is inherently misleading when applied to architectural forms.
> >>
> >>Architectural forms do subtyping, not inheritance. Inheritance is about
> >>"getting stuff for free" (e.g. code, declarations, fields).
Subtyping is > >>about *fulfilling a particular role* (perhaps through a manual > >>construction of an appropriate "interface" (in this case a content > >>model)). Architectural forms allow you to specify an interface that must > >>be fulfilled and declare conformance to that interface. It does not allow > >>you to "get code for free" (i.e. markup declarations). > > > >Paul has made clear what I was feebly trying to say: thanks Paul. > > > >Cheers, > > > >E. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bmhughes at ozemail.com.au Fri Apr 3 07:16:14 1998 From: bmhughes at ozemail.com.au (Baden Hughes) Date: Mon Jun 7 17:00:23 2004 Subject: List server configuration change proposal In-Reply-To: <199804030113.UAA00428@unready.microstar.com> References: <35240BA0.A2F39544@action.cnchost.com> <35240BA0.A2F39544@action.cnchost.com> Message-ID: <3.0.5.32.19980403125706.007c6c30@ozemail.com.au> >Tom Williamson writes: > > > I propose that the list server which handles this list be configured to > > identify the sender of the messages it mails as "XML-DEV List" or > > something similar, so that messages from the list can be clearly > > identified and grouped by the recipients. > > > > Anyone else feel this way? > David Megginson writes: >When I was bothering to filter, I just used the Sender field >"owner-xml-dev@ic.ac.uk", which is always the same. Or just filter for _any_ header containing xml-dev@ic.ac.uk (you might have multiple pass filters if you're using a decent email program like Eudora :-)) Baden xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ricko at allette.com.au Fri Apr 3 08:05:22 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:23 2004
Subject: Inheritance and other buzzwords
Message-ID: <005e01bd5ec6$af7c33d0$890b4ccb@NT.JELLIFFE.COM.AU>

From: Tim Bray
> The current namespace proposal adds one level of indirection to the names
> we give document components, and includes a technique for ensuring that
> the names are unique across the universe of the Internet. That's all!

I think Tim is correct in trying to limit people's perception of what the namespace proposal does. The basic requirement is, more or less, to have a declaration which is as simple as possible, as non-intrusive as possible, and which does not require explicit element or attribute declarations, which will allow the RDF people to say "this element is one of our element types".

Whether there is, underlying this, some more interesting structure of links derived from type names not instances (my belief), or an underlying honeycomb of parallel, mutually augmenting schemas (the architectures idea) should not be the deciding factor for the namespace proposal, to me. I think it is enough at this stage that the namespace 1.0 be expressed in a way that does not rule out an interpretation using either of these mechanisms (or others).

It was not Tim's point, but strictly I think the proposal adds two levels of indirection: prefix->ns; then ns->schema.

It is because of this indirection that the current proposal does not ensure unique naming: there is a possibility of an error where two fragments using the same prefix are combined under the same namespace declaration.
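[Editorial sketch, not part of the original message.] The collision described above can be made concrete with a small Java example. It assumes a deliberately simplified model in which each fragment carries a plain map from prefix to namespace URI; the class name `PrefixMerge`, the methods `merge` and `relabel`, and the example.org URIs are all hypothetical, and nothing here is taken from the namespace draft itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PrefixMerge {

    // Merge a fragment's bindings (prefix -> namespace URI) into the target's.
    // Returns a map from each clashing prefix to the fresh prefix chosen for it.
    static Map<String, String> merge(Map<String, String> target,
                                     Map<String, String> fragment) {
        Map<String, String> renames = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fragment.entrySet()) {
            String prefix = e.getKey();
            String uri = e.getValue();
            String bound = target.get(prefix);
            if (bound == null) {
                target.put(prefix, uri);      // prefix unused so far: keep it
            } else if (!bound.equals(uri)) {
                // Same prefix, different namespace: pick an unused prefix.
                String fresh = prefix;
                int n = 2;
                while (target.containsKey(fresh)) {
                    fresh = prefix + n;
                    n++;
                }
                target.put(fresh, uri);
                renames.put(prefix, fresh);
            }
            // Same prefix and same URI: nothing to do.
        }
        return renames;
    }

    // Rewrite one qualified name from the fragment according to the renames.
    static String relabel(String qname, Map<String, String> renames) {
        int colon = qname.indexOf(':');
        if (colon < 0) {
            return qname;                     // unprefixed name: unchanged
        }
        String fresh = renames.get(qname.substring(0, colon));
        return fresh == null ? qname : fresh + qname.substring(colon);
    }

    public static void main(String[] args) {
        Map<String, String> target = new LinkedHashMap<>();
        target.put("cml", "http://example.org/chemistry");

        Map<String, String> fragment = new LinkedHashMap<>();
        fragment.put("cml", "http://example.org/concrete");
        fragment.put("html", "http://example.org/html");

        Map<String, String> renames = merge(target, fragment);
        System.out.println(relabel("cml:MIX", renames));
        System.out.println(relabel("html:VAR", renames));
    }
}
```

A real tool would also have to rewrite the declarations and every qualified name in the incoming fragment, not just individual strings; the sketch only shows why the prefix, rather than the URI, is the part that has to change.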
This is particularly an issue of maintenance: where a schema is updated, perhaps to make it have a more restrictive content model. So XML tools which combine fragments with namespaces will have to be able to rewrite name prefixes. If there is and then the processing tool will have to be smart enough to, for example, relabel the second PI and all the names which use this must be relabelled too in the instance.

I believe the net effect of this is that prefixes will start off simple, and end up more and more complex, more and more like partial schema URLs.

Rick Jelliffe

From peter at ursus.demon.co.uk Fri Apr 3 11:45:01 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:23 2004
Subject: Inheritance and other buzzwords
In-Reply-To: <3.0.32.19980402152032.00bae390@pop.intergate.bc.ca>
Message-ID: <3.0.1.16.19980403085207.37573c76@pop3.demon.co.uk>

At 15:20 02/04/98 -0800, Tim Bray wrote:
[...An analysis I agree with...]

If readers feel that namespaces are all doom-and-gloom, let me say that I am very happy indeed with the present namespace proposal. It adds precisely 2 major features - the uniquification of names and the identification of those names.

Two years ago I was struggling with early versions of CML. I had called my generic variable VAR (reasonable?). I then had the blinding revelation that chemistry involved text as well as molecules. Write my own DTD for this? No - re-use the HTML DTD. Only problem, HTML also has a VAR. I asked the SGML community and was more-or-less told that SGML was broken in this respect.
In the next version of CML I therefore created variables CML.MOL and XML.VAR (sic - I created something called eXperimental Markup Language, before the current XML). So the current namespace proposal gives me exactly what I wanted, and also gives me the confidence and authority to use it. CML:MOL will work for 99% of the applications I shall be involved in. I have no problem about (say):

This is a molecule

[I *would* have had a problem with context-sensitive minimisation...]

My problem only comes when I encounter the Concrete Materials Laboratory who also use CML as their prefix. If I want to mix existing documents from CML and CML I have to edit one set. Tedious. But we do this sort of thing every day for other reasons (how many of you have run automatic edits through documents when companies' names change, etc.). No big deal.

So I am very excited about the formal opportunity to have interoperable chemistry documents. [To put this in context - before XML (i.e. 1997) we used fortran-based data files. These often have no spaces between fields (some of you may never have seen a fortran file, but they really are like that.). But your airplanes are built on fortran - no dynamic memory allocation, 6-letter variables, implicit typing - etc. Namespaces are a bit like the fortran of XML. There is a huge amount you can do with them - often tedious, but you can do it. A major advance on machine-code programming.]

[...]
>
>In fact, why doesn't someone on this list write such a preprocessor? I
>think conventional DTD's, with the additional leverage of universal names,
>would be damn useful.

Seconded. I suspect that a few simple utilities will help a great deal. For example a filter to rename prefixes in a document (though I expect sed would be adequate).

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Apr 3 11:51:46 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:23 2004 Subject: Proposal for src files In-Reply-To: <3.0.32.19980401112049.00a8a740@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19980403093734.326f9134@pop3.demon.co.uk> At 11:20 01/04/98 -0800, Tim Bray wrote: >For the record, I disagree with the first half of Eliot's thesis here. >I think that there is an *excellent* chance of getting consortium and >leading vendors to coalesce in support of a schema proposal which attains >the notorious MPRDV (Minimum Progress Required to Declare Victory) level >but does not rule out downstream extensibility. > >MPRDV components IMHO are > >1. does what DTDs do in as intuitive as possible a way >2. uses XML syntax >3. is compatible with the RDF data model >4. does basic lexical data typing of character data and attribute > values This is a good place to start from (I have also received some private support for my suggestions.) In fact I was being very conservative and my proposal was limited to 1 and 2 because I thought that it would be relatively uncontroversial. Personally I also need 4 (and am keen on 3, although I'm not sure of the implications). It's fairly important to do something quickly as otherwise there will be arbitrary conflicting syntaxes. An example: DTD: How is this represented in XML? The first version of XML-data used something like OCCURS="STAR", whilst the latest uses occurs="ZEROORMORE". 
We all agree that any DTD in XML syntax would need to represent the "*" concept; I am *merely asking that we standardise the syntax we use* :-) Similarly XML-data uses 'elementType' to represent XML's construct. Perfectly reasonable, but arbitrary. Others might choose ELEMENT , element_type or whatever. contentspecs are slightly more challenging as we could either simply hold the string, or could expand this with Choice, Seq, etc. The *simplest* way to resolve this would be to use the terminology in the spec itself. Thus we should use AttType [54], AttDef [53], etc. Although there are probably things that I've overlooked I can't see this exercise taking more than two hours in a pub. At the end of this we would have: ***A DTD (in XML-DTD syntax) for representing DTDs in XML*** nothing more. An example might be: ID ID It seems to me that the spec is so clear that the only decisions are on a few attribute names (e.g. type above) or whether some attributes should be elements. Since this is an xml document we can use XML technology to process it. *** In particular we could create a stylesheet which filtered out any elements or attributes not in the XMLDTD set. Thus if someone added dataType="integer" to an ELEMENT we could easily ignore it whilst reading the rest. The point is that whatever *additions* are made to the document above the 'true' DTD can be easily extracted.*** This means that if I encounter a 'schema' which honours the philosophy above I can *automatically* extract the DTD from it. Our motivations, are of course, for extending it in different ways. The proposal above seems to preserve complete latitude in how we do this. I make the following suggestions. (a) as the DTD is now an XML document we have precise methods for linking to any component of it. If we wished to say that bar represented an integer with given ranges, this could be expressed through an out-of-line link using XLL. 
This seems to me the purest way of extending it - essentially an annotation. [We've spent a long time creating XLL - why not start using it :-)] (b) we could put in-line links in the XMLDTD. Thus bar could have a xml:link to jumbo/xml/bar.class (Java). (c) we can add elements in the content of ELEMENT, AttDef or other fields. Thus bar might be: integer 3 65 Note that a purist processor can ignore any children of contentspec other than those in XML1.0 If we can agree on the base terminology and syntax we can then move to discussing the much more difficult questions of how to extend and whether there is likely to be any consensus. If we can't agree on the extensions, at least we have a base that everyone honours. I am often over-simplistic, but I can't see any downside to this (other than making it slightly easier to open Pandora's box - which will happen anyway). P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Fri Apr 3 16:41:08 1998 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:00:23 2004 Subject: True XML compliance (Re: Sean McGrath's posts to XML-L) Message-ID: <01bd5f23$b0d22ca0$d4addccf@uspppBckman> Mike, I would love to see it. Is it in .exe form? Your point about taking multiple documents is interesting. It should not be too difficult to take these and show where for the same element name there are conflicting attributes, etc. 
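A minimal sketch of the multi-document comparison suggested above, assuming the inputs are well-formed documents held as strings and read with the JDK's DOM parser. The names AttributeConflicts, attributesByElement and conflicts are hypothetical and are not taken from any tool in this thread. "Conflicting" is read here, as an assumption, to mean that the same element name carries different attribute sets in the two documents.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AttributeConflicts {
    /** Maps each element name in one document to the union of attribute names used on it. */
    static Map<String, Set<String>> attributesByElement(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Map<String, Set<String>> result = new TreeMap<>();
            collect(doc.getDocumentElement(), result);
            return result;
        } catch (Exception ex) {
            throw new IllegalArgumentException("input is not well-formed XML", ex);
        }
    }

    /** Walks the element tree depth-first, recording attribute names per element name. */
    private static void collect(Element element, Map<String, Set<String>> result) {
        Set<String> names = result.computeIfAbsent(element.getTagName(), k -> new TreeSet<>());
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            names.add(attrs.item(i).getNodeName());
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                collect((Element) child, result);
            }
        }
    }

    /** Element names that occur in both documents with different attribute sets. */
    static Set<String> conflicts(String xmlA, String xmlB) {
        Map<String, Set<String>> a = attributesByElement(xmlA);
        Map<String, Set<String>> b = attributesByElement(xmlB);
        Set<String> out = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : a.entrySet()) {
            Set<String> other = b.get(entry.getKey());
            if (other != null && !other.equals(entry.getValue())) {
                out.add(entry.getKey());
            }
        }
        return out;
    }
}
```

A DTD generator working from several inputs could use such a report to decide which attributes to declare as #IMPLIED rather than #REQUIRED; that design choice is a suggestion, not something stated in the thread.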
Frank

-----Original Message-----
From: Michael Kay
To: Frank Boumphrey
Date: Friday, April 03, 1998 1:24 AM
Subject: Re: True XML compliance (Re: Sean McGrath's posts to XML-L)

>>It should be easy to generate a DTD from a well formed document. Does
>>anyone know of any software out there that does this? Or is there even a need
>>for this?
>>
>I have written one, and I find it very handy. Quite prepared to share it.
>
>Of course there are many possible DTDs to which a given document conforms.
>In practice I have used two techniques: creating a synthetic document that
>exhibits examples of all the structures I want, and generating a DTD from
>this; alternatively generating a prototype DTD (literally!) from a single
>example document and then modifying it by hand. A nice extension would be
>to take multiple input documents.
>
>I am writing direct because the mailbox under which I send to XML-DEV is
>temporarily out of action.
>
>Regards, Mike Kay
>

From eliot at isogen.com Fri Apr 3 21:23:34 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:23 2004
Subject: Announce: XPointer Implementation with DSSSL
Message-ID: <3.0.32.19980403132051.0069e120@swbell.net>

I have begun implementing XPointer resolution in DSSSL. The code is still incomplete but does enough to be useful at least for testing purposes. You can find the package at "http://www.drmacro.com/hyprlink/xlink".

At this point, it supports all four absolute location terms and all the relative terms except preceding (I have a bug somewhere that I haven't chased down).
It does not yet do attribute qualification. The functions are provided as a DSSSL function package you can use by reference from other packages. It is provided as a service to the community and carries no intellectual property restrictions on its use. It is undocumented in the form provided here (the source form I develop in is documented, but I'm just using simple architectural instance extraction to generate the DSSSL specs provided in the package; if anyone wants the base source, I'm happy to provide it. Indicate whether or not you want the ADEPT*Editor setup as well). I welcome contributions, bug reports, suggestions for improvement, etc. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Fri Apr 3 21:37:03 1998 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:00:24 2004 Subject: why do namespaces have such a bad rep [3] In-Reply-To: <35244059.545B230B@mecom.mixx.de> (message from james anderson on Fri, 03 Apr 1998 03:50:36 +0200) References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com> <199803312343.SAA01404@bruno.techno.com> <3521BC55.6014F2E7@mecom.mixx.de> <199804011625.LAA00802@bruno.techno.com> <35230A09.58739027@mecom.mixx.de> <199804021955.OAA00839@bruno.techno.com> <35244059.545B230B@mecom.mixx.de> Message-ID: <199804031742.MAA01338@bruno.techno.com> [James Anderson :] > i'm concerned about the case were i want to get at [the base > architecture's parameter entities], not be isolated from them. i > would expect to have to do something like the following. suppose > that i have two base architectures, each with an element "name". i > would like to derive an architecture to comprise descriptions of > both sorts. nothing new. 
> > > address-form (PSEUDONYM | LEGAL) "LEGAL"> > > > > address-form (ASSOCIATIVE | LITERAL) "LITERAL"> > while the provisions of enabling architectures make it possible to > use architectural form attributes to disambiguate the "name" element > names by creating a 2-d namespace for element names, and to > disambiguate the "address-form" attribute names by using the > remapper attribute to, likewise, create a 2-d namespace for > attribute names, and then to map them onto unique positions within a > 1-d namespace, i have not found the provision to enable declaring > distinct values for the entities. how would one distinguish > %name-content from %name-content in a derived architecture in order > to specify the respective entity declarations? > maybe i don't need to. i thought i would. I don't think you need to. When you create an element subtype from an element type in a base architecture (a supertype), it doesn't matter whether the content model of the supertype was expressed by means of one or more parameter entities. For all purposes of subtyping, it only matters what the replacement text of those parameter entities was. I can see where it might sometimes be convenient to re-use the same parameter entities that were used in the base architecture, but there is no provision in the current enabling architectures syntax for that, and it's not necessary, anyway. I doubt it would be worth the added syntactic complexity. You would have to cause the names in the replacement texts of the supertype's architecture's parameter entities to be somehow automagically translated into the corresponding names in the subtyping architecture. [I'm trying to use the "subtype/supertype" vocabulary instead of the "inheriting/inherited" vocabulary here; does it work better?] -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. 
srn@techno.com http://www.techno.com ftp.techno.com
voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152)
3615 Tanner Lane
Richardson, Texas 75082-2618 USA

From eliot at isogen.com Sat Apr 4 00:37:18 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:24 2004
Subject: Announce: XPointer Implementation with DSSSL
Message-ID: <3.0.32.19980403163219.007487bc@swbell.net>

At 01:20 PM 4/3/98 -0600, W. Eliot Kimber wrote:
>I have begun implementing XPointer resolution in DSSSL. The code is still
>incomplete but does enough to be useful at least for testing purposes. You
>can find the package at
>"http://www.drmacro.com/hyprlink/xlink".

I have added a "Test your XPointers" page to the XLink area (http://www.drmacro.com/hyprlink/xlink/xprttest.html). From this page you can run Jade and the DSSSL XPointer implementation against any XML document addressable on the Web (I use Jade's HTTP support to get the documents to process). It now supports real XML documents. I also provide access to SP's XML validation and warning options.

I'm not sure I'm being as smart as I could be about passing the URI part of locators to Jade for resolution. If you try it and it doesn't appear to be resolving your URLs appropriately, let me know and I'll try to figure it out. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Apr 4 03:45:30 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:24 2004 Subject: SAX: Java-Specific Question Message-ID: <199804040144.UAA00822@unready.microstar.com> Here's a Java question. Let's say that I have a class with a method public void parse (String publicId, String systemId, Reader reader) throws java.lang.Exception; What will happen if the "java.io.Reader" class is not available on my system (perhaps because I'm using a 1.0.2 browser), but I never invoke this method (I'm assuming that I compiled my code under JDK 1.1 or JDK 1.2)? I'm becoming convinced that we need to support character streams in SAX, and I'm trying to figure out how to handle it in the Java version -- I think that I've worked out pretty much everything else now, except for a few minor details. Thanks for any help, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Sat Apr 4 03:58:43 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:24 2004 Subject: the death of the black box References: <3.0.32.19980402100252.00ebe6ec@swbell.net> <3523C374.8EE1FDE3@finetuning.com> <35243E28.130D@hiwaay.net> <35244D42.678F857C@finetuning.com> Message-ID: <35259375.6B9D@hiwaay.net> Lisa Rein wrote: > > What if, instead of a browser, sets of browser components were made > available, that could be chosen from checkboxes on a form, and then > thrown into an architecture, per the particular needs of each "surf" on > the web? What is the difference between this and It's much less of a black box. And it would be harder for only one or > two companies to have a monopoly on that box. Hmm. Is that true? Unless you agree a priori on the interfaces, someone still has to create the rules for those. System stability is not guaranteed by markup, but in theory, it helps. Let me go at this another way: technology must not obsolete content. So far, to get a stable browser that will enable content to remain viable for a period as short as a few months, it has come down to one major browser and one goodOutofTheGate but caught in the stretch contender. IOW, the market has eliminated the competitors. Yet, with the rumors of Chrome based on XML, non-markup notations feel a creeping isolation coming. So, did markup obsolete the content? So my question: if architectural forms were used, would the syntax of the instance be not irrelevant, but at least mappable? If this approach is taken, do we lose the economy gained by syntax unification? 
If we do, then who is going to support the architectural approach when syntax unification and a six month development lead offer such compelling market advantages? The spirit and soul of markup ride on the answers.

len

From donpark at quake.net Sat Apr 4 04:37:29 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:24 2004
Subject: Java-Specific Question
Message-ID: <001a01bd5f71$c6f107b0$2ee044c6@donpark>

David,

>Here's a Java question. Let's say that I have a class with a method
>
>  public void parse (String publicId, String systemId, Reader reader)
>    throws java.lang.Exception;
>
>What will happen if the "java.io.Reader" class is not available on my
>system (perhaps because I'm using a 1.0.2 browser), but I never invoke
>this method (I'm assuming that I compiled my code under JDK 1.1 or JDK
>1.2)?

ClassNotFoundException will be thrown when the class is loaded. What you probably want is to load the class by name (i.e. Class.forName()) rather than importing it directly. This is the trick I used to make MSXML platform-independent. At runtime, MSXML tries to load the native stream class and falls back to the pure Java version if the native stream class is not available (an exception will be thrown during Class.forName()).

A Java class file contains a constant pool where all references to external classes are located along with other constants (i.e. strings, initializer values).
Since all the references are so nicely situated, JVM implementations typically enumerate thru the constant table and resolve any class references there when the class is loaded without checking to see if the class is actually used. When Class.forName() is used, you will have the string constant but not the class reference entry. Comes in handy in tight places. Regards, Don Park http://www.docuverse.com/personal/index.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Sat Apr 4 06:27:37 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:00:24 2004 Subject: SAX: Java-Specific Question References: <199804040144.UAA00822@unready.microstar.com> Message-ID: <3525B720.62B01DD9@infinet.com> David Megginson wrote: > Here's a Java question. Let's say that I have a class with a method > > public void parse (String publicId, String systemId, Reader reader) > throws java.lang.Exception; > > What will happen if the "java.io.Reader" class is not available on my > system (perhaps because I'm using a 1.0.2 browser), but I never invoke > this method (I'm assuming that I compiled my code under JDK 1.1 or JDK > 1.2)? Well this deals with the issue of backward compatibility. Classes like java.awt.Component may elect to add new methods, but they must still support the old ones. Having an additional public void parse(String publicID, String systemID, Reader reader) would not be a problem for old 1.0.2 programs at run time since the JVM only loads classes (new ones) as it needs them. 
So if you never invoke a method with a reader class, then the reader class will never ever get a chance to try and be initialized (which is good cause otherwise you would get a ClassNotFoundException). An example of this is with the AWT. With JDK 1.1 a lot of new methods were added to java.awt.Component (a bunch more are being added in JDK 1.2 as well). If I have a compiled applet for JDK 1.1, it should work in a JDK 1.2 environment because none of the JDK 1.1 methods were removed, only new ones were added. > I'm becoming convinced that we need to support character streams in > SAX, and I'm trying to figure out how to handle it in the Java > version -- I think that I've worked out pretty much everything else > now, except for a few minor details. It is really not a problem to do what you are talking about from a compatibility standpoint and I have wondered why this was not done earlier. One major question I have is regarding character support in Java. If you look at the XML spec, there are a lot of differences between the tables of what is whitespace, legal unicode characters, etc. which make it so parser writers need to write their own isWhitespace(char c) functions instead of using Character.isWhitespace(). In this particular case, Java sees a Form Feed as white space while the XML spec does not. Does anyone have any ideas as to why there are so many inconsistencies? Tyler xml-dev: A list for W3C XML Developers. 
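The whitespace mismatch raised above can be sketched as follows. XML 1.0 defines white space (production [3], S) as exactly space, tab, carriage return and line feed, while Java's Character.isWhitespace() also accepts characters such as form feed. The class and method names (XmlChars, isXmlWhitespace) are hypothetical; this is an illustration of the kind of helper a parser writer would need, not code from any existing parser.

```java
final class XmlChars {
    /**
     * True only for the four characters XML 1.0 allows as white space:
     * #x20 (space), #x9 (tab), #xD (carriage return), #xA (line feed).
     */
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }
}
```

Form feed shows the difference: Character.isWhitespace('\f') is true in Java, whereas isXmlWhitespace('\f') is false, so a parser relying on the Java method would accept input the XML specification rejects.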
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Sat Apr 4 06:31:05 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:24 2004 Subject: Java-Specific Question -- CORRECTION Message-ID: <000a01bd5f81$a548bf30$2ee044c6@donpark> This is a correction of my answer to David's question which is: >>Here's a Java question. Let's say that I have a class with a method >> >> public void parse (String publicId, String systemId, Reader reader) >> throws java.lang.Exception; >> >>What will happen if the "java.io.Reader" class is not available on my >>system (perhaps because I'm using a 1.0.2 browser), but I never invoke >>this method (I'm assuming that I compiled my code under JDK 1.1 or JDK >>1.2)? I answered: >ClassNotFoundException will be thrown when the class is loaded. What you This is wrong. David reported test results to the contrary so I went back to the VM spec and found that my understanding is either obsolete or wrong to begin with. Since the last thing I want to do is misinform, here is the correct answer: When a Java class is loaded, only the superclass and interfaces it implements are 'resolved', meaning that they are loaded as well. The class is then 'verified'. Verification of a class does not include checking whether referenced classes, methods, fields actually exists. After verification, the class and its superclasses are 'initialized' and executed. During execution, referenced classes are 'loaded' upon 'use'. This means that uninvoked methods can reference any class they want to as long as they are available during compilation process. Same is true for classes referenced in unexecuted code path. 
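Since classes are loaded on use, as described above, one way to find out in advance whether an optional class exists is to ask for it by name. This is a minimal sketch under that assumption; ClassProbe, isAvailable and chooseInput are hypothetical names and do not come from SAX or from any parser mentioned here.

```java
final class ClassProbe {
    /** Returns true if the named class can be loaded in the running VM. */
    static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: the named class itself is absent.
            // LinkageError (e.g. NoClassDefFoundError): something it depends on is absent.
            return false;
        }
    }

    /** Picks "reader" when character streams exist, otherwise "stream". */
    static String chooseInput() {
        return isAvailable("java.io.Reader") ? "reader" : "stream";
    }
}
```

The probe names the class in a string constant only, so the probing class does not itself reference java.io.Reader and stays loadable on runtimes that lack it.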
Therefore, the following code could be used to write a Reader-aware SAX client which can run under Java 1.0:

  try {
    // If no error is thrown, the class is available, so use the
    // Reader-based method to parse.
    parser.parseReader(new java.io.InputStreamReader(in));
  } catch (NoClassDefFoundError err) {
    // InputStreamReader is missing (a class that cannot be resolved on
    // use is reported as NoClassDefFoundError), so use the stream-based
    // method.
    parser.parseStream(in);
  }

What I just described might not be true under just-in-time compilers and server-side Java compilers, depending on the compilation strategy involved.

I hope this corrects any misunderstanding due to my evil-twin's activities ;-p

Regards,
Don Park
http://www.docuverse.com/personal/index.html

From James.Anderson at mecom.mixx.de Sat Apr 4 10:32:43 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:24 2004
Subject: why do namespaces have such a bad rep [4]
References: <199803301710.MAA01121@unready.microstar.com> <199803312127.QAA00299@unready.microstar.com> <199803312343.SAA01404@bruno.techno.com> <3521BC55.6014F2E7@mecom.mixx.de> <199804011625.LAA00802@bruno.techno.com> <35230A09.58739027@mecom.mixx.de> <199804021955.OAA00839@bruno.techno.com> <35244059.545B230B@mecom.mixx.de> <199804031742.MAA01338@bruno.techno.com>
Message-ID: <3525F066.4E9544E8@mecom.mixx.de>

with consistent and complete support for namespaces, the identification of token names would not be "automagical", you would have the means to express the correspondence.
that's why i've kept my coffee table: when we're already busy with the coffee and cake, and the next guests show up with the champagne, i don't need three arms. Steven R. Newcomb wrote: > [James Anderson :] > > > i'm concerned about the case were i want to get at [the base > > architecture's parameter entities], not be isolated from them. i > > would expect to have to do something like the following. suppose > > that i have two base architectures, each with an element "name". i > > would like to derive an architecture to comprise descriptions of > > both sorts. nothing new. > > > > > > > > address-form (PSEUDONYM | LEGAL) "LEGAL"> > > > > > > > > > address-form (ASSOCIATIVE | LITERAL) "LITERAL"> > > > while the provisions of enabling architectures make it possible to > > use architectural form attributes to disambiguate the "name" element > > names by creating a 2-d namespace for element names, and to > > disambiguate the "address-form" attribute names by using the > > remapper attribute to, likewise, create a 2-d namespace for > > attribute names, and then to map them onto unique positions within a > > 1-d namespace, i have not found the provision to enable declaring > > distinct values for the entities. how would one distinguish > > %name-content from %name-content in a derived architecture in order > > to specify the respective entity declarations? > > > maybe i don't need to. i thought i would. > > I don't think you need to. When you create an element subtype from an > element type in a base architecture (a supertype), it doesn't matter > whether the content model of the supertype was expressed by means of > one or more parameter entities. For all purposes of subtyping, it > only matters what the replacement text of those parameter entities > was. 
I can see where it might sometimes be convenient to re-use the > same parameter entities that were used in the base architecture, but > there is no provision in the current enabling architectures syntax for > that, and it's not necessary, anyway. I doubt it would be worth the > added syntactic complexity. You would have to cause the names in the > replacement texts of the supertype's architecture's parameter entities > to be somehow automagically translated into the corresponding names in > the subtyping architecture. > > [I'm trying to use the "subtype/supertype" vocabulary instead of the > "inheriting/inherited" vocabulary here; does it work better?] > > -Steve > > -- > Steven R. Newcomb, President, TechnoTeacher, Inc. > srn@techno.com http://www.techno.com ftp.techno.com > > voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) > fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) > > 3615 Tanner Lane > Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From amarshal at usc.edu Sat Apr 4 11:11:19 1998 From: amarshal at usc.edu (Andrew n marshall) Date: Mon Jun 7 17:00:24 2004 Subject: Java-Specific Question Message-ID: <01BD5F67.88324CF0.amarshal@usc.edu> On Friday, April 03, 1998 5:44 PM, David Megginson [SMTP:ak117@freenet.carleton.ca] wrote: > Here's a Java question. 
Let's say that I have a class with a method
>
>   public void parse (String publicId, String systemId, Reader reader)
>     throws java.lang.Exception;
>
> What will happen if the "java.io.Reader" class is not available on my
> system (perhaps because I'm using a 1.0.2 browser), but I never invoke
> this method (I'm assuming that I compiled my code under JDK 1.1 or JDK
> 1.2)?

The JavaVM will throw a ClassNotFoundException because it will try to preload the class in case you do use the method.

> I'm becoming convinced that we need to support character streams in
> SAX, and I'm trying to figure out how to handle it in the Java
> version -- I think that I've worked out pretty much everything else
> now, except for a few minor details.

I'm dealing with a similar problem on an applet at work where I need to be able to support the Java Media Framework for sound if available. My solution is to abstract everything that references the possibly missing classes into a second class, then build an initialization routine like the following:

  Class parserClass;     // used if creating several Parser objects
  Parser parser = null;  // Parser is an abstract class or interface;
                         // used if only one parser is needed

  // Declares ClassNotFoundException because the fallback lookup in the
  // catch block can itself fail.
  void init() throws ClassNotFoundException {
    try {
      Class.forName( "java.io.Reader" );  // checks if supported
      parserClass = Class.forName( "MyParserFor11" );
    } catch( ClassNotFoundException err ) {
      parserClass = Class.forName( "MyParserFor10" );
    }
    try {
      parser = (Parser) parserClass.newInstance();
    } catch( InstantiationException err ) {
      // Constructor requires argument
    } catch( IllegalAccessException err ) {
      // Constructor isn't public
    }
  }

NOTE: Apparently my understanding reflected Don Park's >>original<< message, which is apparently wrong. But even so, the above will still work.

Andrew n marshall
student - artist - programmer
http://www.media-electronica.com/anm-bin/anm
"Everyone a mentor, Everyone a pupil"

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecom.mixx.de Sat Apr 4 15:06:33 1998 From: James.Anderson at mecom.mixx.de (james anderson) Date: Mon Jun 7 17:00:24 2004 Subject: Inheritance and other buzzwords References: <3.0.1.16.19980403085207.37573c76@pop3.demon.co.uk> Message-ID: <35263067.954C460E@mecom.mixx.de> hello again, there were some things in a remark yesterday from Rick Jelliffe and in a note from Peter Murray-Rust which lead me to doubt that wd-xml-names will work unless it is extended to more completely specify the semantics of the namespace declaration pi. in previous remarks, it has been explained that the wd expressly avoids specifying a semantics. (andrewl@microsoft.com/RE: Namespaces in XML: 3.1 the example [2]/Tue, 31 Mar 1998 19:01:05 -0800) that it says, that the pi has a scope over the document in the prologue of which it appears does not suffice. this still leaves room to create documents which have conflicting and incompatible interpretations. while there need be no requirement that the processor read an external schema (should the SrcDef have been specified), there should be some indication of what effect the ns-declaration has on unqualified names in the schema should the schema be read. there are two possible interpretations to "scope". in one aspect (1) the alternative is to either (a) affect the region of the articulated namespace into which unqualified names which appear in the external entity specified by the pi are mapped or (b) not. 
in the other aspect (2), the alternative is to either (a) implicitly specify the region of the articulated namespace into unqualified names in the same entity in which the pi appears are mapped or (b) not. the difference in effect can be understood with reference to, mr. jelliffe's attributed quote. he paraphrases, that articulated namespaces 'allow the RDF people to say "this element is one of our element types".' i understand this to assume a semantics which combines 1b and 2a. the alternative, which would be the attributed assertion that articulated namespaces 'allow the people to say "this element is one of the element types which the RDF people say is theirs" combines 1a and 2b. this second semantics makes additional remapping unnecessary. documents created for processors which implement the first semantics would be incompatible with documents created for processors which support the second semantics. as such the wd needs to be expanded to say which of these semantics apply to the interpretation of external entities in the event that there is a schema at the other end of the SrcDef. if not syntactically, then semantically. in order to avoid the problems noted below and in other places, the interpretation should best be along the lines of 2b. under that semantic one would, to take mr. jelliffe's example, promote dtd content such as and and the key is that the semantic is not (with reference to Jelliffe/Re: Inheritance and other buzzwords/Fri, 3 Apr 1998 16:00:41 +1000) prefix->ns; then ns->schema but prefix -> (localPart -> (schema -> definition)) which, in order words, means that, (within the scope of a given ns-declaration) given the prefix, one can determine a region of the universal namespace, from which given the local part one can determine an identifier which can be used within the context of a given schema to name a definition. (nb. that we're dealing with 3-d namespaces - i.e. 
the namespace articulation implicit in entity v/s p-entity v/s element v/s etc, is ignored here.) my specific suggestion (above, the semantic called 1a/2b) is, a. that unqualified names should be assumed to have the qualifier presently bound to their physical entity by a ns-declaration. b. the standard define how, in the absence of a ns declaration (eg. a dtd referenced from a document with a doctype only), a ns-enabled processor is to generate an implicit ns declaration. c. explicitly qualified names which incorporate the prefix bound by a ns declaration within an external to the external entity itself should be considered 'unqualified' for the purpose of item 'a' above. for reference purposes, here the two remarks: Rick Jelliffe wrote: > From: Tim Bray > > > The current namespace proposal adds one level of indirection to the names > > we give document components, and includes a technique for ensuring that > > the names are unique across the universe of the Internet. That's all! > > I think Tim is correct in trying to limit people's perception of what the > namespace > proposal does. The basic requirement is, more or less, to have a declaration > which is as simple as possible, as non-intrusive as possible, and which > does not require explicit element or attribute declarations, which will > allow > the RDF people to say "this element is one of our element types". > > Whether there is, underlying this, some more interesting structure of links > derived from type names not instances (my belief), or an underlying > honeycomb of parallel, mutually augmenting schemas (the architectures > idea) should not be the deciding factor for the namespace proposal, to > me. I think it is enough that the namespace 1.0 be expressed in a way that > does not rule out an interpretation using either ot these mechanisms (or > others) is enough at this stage. 
> > It was not Tim's point, but strictly I think the proposal adds two levels of > indirection: > prefix->ns; then ns->schema > It is because of this indirection that the current proposal does not ensure > unique > naming: there is a possibility of an error where two fragments using the > same > prefix are combined under the same namespace declaration. This is > particularly > an issue of maintenance: where a schema is updated, perhaps to make it have > a more restrictive content model. > > So XML tools which combine fragments with namspaces will have to be able to > rewrite name prefixes. If there is > > and > > then the processing tool will have to be smart enough to, for example, > relabel > the second PI > > and all the names which use this must be relabelled too in the instance. Peter Murray-Rust wrote: > [... discussion of the problems inherent in getting along without articulated > namespaces deleted ...] > > My problem only comes when I encounter the Concrete Materials Laboratory > who also use CML as their prefix. If I want to mix existing document from > CML and CML I have to edit one set. Tedious. But we do this sort of thing > every day for other reasons (how many of you have run automatic edits > through documents when companies' names change, etc.). No big deal. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Patrice.Bonhomme at loria.fr Sat Apr 4 19:15:04 1998 From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme) Date: Mon Jun 7 17:00:24 2004 Subject: AttValue within XPointer ? Message-ID: <199804041714.TAA13654@chimay.loria.fr> I had a look at the brand new XML XPointer specification and something is not clear to me. 
Some of the XPointer examples are not conformant to the XML Language
specification. I am not sure that an attribute value can start with a
digit. Some examples are:

    child(1,#element,N,2).(1,#element,N,1)
    ancestor(1,#element,N,1).(1,DIV)
    etc...

I think they should be written (with the attribute value as a "SkipLit"):

    child(1,#element,N,"2").(1,#element,N,"1")
    ancestor(1,#element,N,"1").(1,DIV)

Could the editors of the XPointer spec clarify the definition of the NAME
production?

Thanks a lot.

Pat.
--
==============================================================
bonhomme@loria.fr | Office : B.228
http://www.loria.fr/~bonhomme | Phone : 03 83 59 30 52
--------------------------------------------------------------
* Serveur Silfide : http://www.loria.fr/projets/Silfide
* Projet Aquarelle : http://aqua.inria.fr
==============================================================

From bckman at ix.netcom.com Sat Apr 4 19:18:27 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:24 2004
Subject: xml tutorial
Message-ID: <01bd6007$36fe1260$8aaedccf@uspppBckman>

I put up a fairly basic XML tutorial at

http://www.hypermedic.com/style/xml/xmlindex.htm

It is just in a text version at present.

Frank Boumphrey

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcl at micapeak.com Sat Apr 4 20:29:03 1998 From: marcl at micapeak.com (H. Marc Lewis) Date: Mon Jun 7 17:00:24 2004 Subject: unsubscribe Message-ID: <3.0.32.19980404102754.006a6280@alutia.micapeak.com> Unless you're the list moderator, please skip this posting. WILL THIS LIST MODERATOR PLEASE UNSUBSCRIBE ME (marcl@micapeak.com)? I resubscribed from my work account, which is more appropriate. But when I try to unsubscribe my original subscription, the list server tells me I'm not subscribed, even though it keeps sending me the list postings. I'm sure the list server software is simply confused about who I am. I tried repeatedly and unsuccessfully to unsubscribe, and well as sending a request directly to Henry Rzepa (rzepa@ic.ac.uk), but so far no joy... ----------------------------------------------------------------------------- H. Marc Lewis -- marcl@micapeak.com -- http://www.micapeak.com/~marcl/ at Wall Data, Inc. in Spokane: mlewis@walldata.com -- (509) 892-3400 ----------------------------------------------------------------------------- xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Sun Apr 5 07:28:49 1998 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:00:24 2004 Subject: Comments on Section 2.6 of XML-Namespaces Message-ID: <5BF896CAFE8DD111812400805F1991F701C90F38@red-msg-08.dns.microsoft.com> The points you raise were addressed during the discussion leading up to the current paper. Briefly, the namespaces facility does not address how "global" attributes are to be declared. It says merely that if a single attribute is used in two different element types, it may use namespace qualification to represent that it is in fact the same attribute, not two attributes with equivocal names. Nor does the namespace facility express an opinion on when such use is reasonable. As you note, the XML-Link proposal has found it useful. I hope this summary is useful. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Sun Apr 5 07:31:39 1998 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:00:25 2004 Subject: why do namespaces have such a bad rep [2] Message-ID: <5BF896CAFE8DD111812400805F1991F701C90F39@red-msg-08.dns.microsoft.com> Namespaces do not imply inheritance. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Sun Apr 5 07:50:03 1998 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:00:25 2004 Subject: namespaces Message-ID: <5BF896CAFE8DD111812400805F1991F701C90F3A@red-msg-08.dns.microsoft.com> please forgive the no-caps typing and brief reply. i have a baby in my other hand. several recent mails have read more into the namespaces proposal than is written in the paper, and have then argued against what the writer imagined the paper to say. i know it is easy to read a paper and to think that it implies all sorts of gigantic things. namespaces is a very simple, minimal thing. it isn't architectures. it isn't object orientation. it isn't a mechanism for document composition. it isn't a new way to write DTDs. it is only what the paper says: a means to associate a name with a controlled namespace. best wishes, andrew xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sun Apr 5 22:31:36 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:00:25 2004 Subject: Expat released Message-ID: <3527E959.98B810DD@jclark.com> I've released a new version of my XML parser in C, previously called as xmltok, but now renamed expat (EXtensible markup language PArser Toolkit). See http://www.jclark.com/xml/expat.html for more information. 
This is the parser that is being used to add support for XML to Netscape
Navigator 5 and Perl.

I've switched to using the new Mozilla Public License for this release and
future releases. Note that if you have the previous release, you can
continue to use it according to the license under which it was released.

James

From James.Anderson at mecom.mixx.de Sun Apr 5 23:02:14 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:25 2004
Subject: Comments on Section 2.6 of XML-Namespaces
References: <5BF896CAFE8DD111812400805F1991F701C90F38@red-msg-08.dns.microsoft.com>
Message-ID: <3527F18B.57892AD7@mecom.mixx.de>

Andrew Layman wrote:

> The points you raise were addressed during the discussion leading up to the
> current paper. Briefly, the namespaces facility does not address how
> "global" attributes are to be declared. It says merely that if a single
> attribute is used in two different element types, it may use namespace
> qualification to represent that it is in fact the same attribute, not two
> attributes with equivocal names.
>
> Nor does the namespace facility express an opinion on when such use is
> reasonable. As you note, the XML-Link proposal has found it useful.
>
> I hope this summary is useful.

not yet. as it stands, this last statement, especially, confuses me more
than it helps. in the subjunctive it would be ok.

the xml-link document itself proposes to contend with ambiguous attribute
names through name remapping. (at least in the released version
"www.w3.org/TR/WD-xlink#remapping".) for which one does not need "namespaces".
(or rather attribute remapping is itself a form of ideosyncratic support for namespaces (please see the thread on namespaces and their bad rep)). with support for namespaces (even the form of conventional "namespace" invoked with comments and "dotted" names, that is the form where the element/attribute definer gets to say what is "theirs") you do not need to map attributes. according to that way of working, if the application element uses the qualified element name, then that's what it means. (nb. i do not claim this form of namespace support is sufficient, i describe it to illustrate the point only.) the example from WD-xlink would simply look like: you don't need a pi in the application document for this, you just need some comments in the source schema and a registry in which authors stake claims to the names. while xml-link would benefit from namespaces, it doesn't yet appear to show it. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Mon Apr 6 04:51:46 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:00:25 2004 Subject: Inheritance and other buzzwords Message-ID: <002a01bd6107$2361cfc0$aa0b4ccb@NT.JELLIFFE.COM.AU> From: james anderson > there were some things in a remark yesterday from Rick Jelliffe and in a note > from Peter Murray-Rust which lead me to doubt that wd-xml-names will work unless > it is extended to more completely specify the semantics of the namespace > declaration pi. I think the wd-namespace will work as advertised, but I think a lot of people will try to use it for more than the uses that it pupports to address. Following are some general comments. 
> in previous remarks, it has been explained that the wd expressly avoids > specifying a semantics. 1) Some people claim that the difference between a hyperlink and an entity declaration/reference is that there is something voluntary or contingent about a hyperlink while an entity declaration/reference expresses a more fixed and necessary relationship. If we accept that for a second, then I think Andrew's comments are clearer: the schema nominated in the namespace declaration is not "specified" rather it is "identified". So namespace PI is more like a hypertext link rather than an entity reference. Some people think that, in the long run, document-specific type structures will be declared (using XML-data or XML markup declarations) in the document, while external-vocabulary schema fragments will be defined externally, invoked using the namspace PI. (I think it is more likely that names borrowed from HTML schema will be unprefixed, while all others will be.) In that case, namespace PIs act more like a kind external entity reference--more like the ISO "module" proposal. 2) I think the term "scope" shouldn't be used here: all namespace PIs have scope over the entire document, not over particular entities. And an element type name without a prefix has no binding to a schema using the namespace mechanism (it still could use the standard XML markup declarations, or architectural forms, or ICADD fixed attributes, or other home-made systems though.) 3) My point about namespace PIs having the same prefix is merely that without the namespace PI there is the possibility of a clash with every name. The namespace PI does not guarantee name uniqueness accross all documents, it merely decreases the number of clashes. This is pretty weak in my book, but it is workable provided developers understand it and write their software accordingly. With the wd-namespace PI the number of possible things that can clash is reduced to the prefixes themselves and to unqualified names. 
And the prefixes give corporate and project names, not common element
names: it is very easy to predict that two documents will have
incompatible element types named "table"; not so many documents will have
incompatible namespaces called "rdf" (except for the versioning issue
mentioned previously).

Of course, it would be much better to have "w3.org-rdf" as the prefix,
since that would introduce some discipline. Certainly if I ever use
namespace PIs I would try to make sure that my prefix had some robustness
to it, by including an organization name in it.

Rick Jelliffe

From James.Anderson at mecom.mixx.de Mon Apr 6 11:57:48 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:25 2004
Subject: Inheritance and other buzzwords
References: <002a01bd6107$2361cfc0$aa0b4ccb@NT.JELLIFFE.COM.AU>
Message-ID: <3528A746.3B5F1B15@mecom.mixx.de>

Rick Jelliffe wrote:

> I think the wd-namespace will work as advertised, but I think a lot of
> people will try to use it for more than the uses that it purports to
> address.

the issue is that, if completely specified, it would accomplish the things
which you, yourself say need to be accomplished. one would not need to
"leave them to the programmer". if not, then inconsistent interpretations
will arise and the problems, which you note, will prosper and flourish.

> Following are
> some general comments.
>
> > in previous remarks, it has been explained that the wd expressly avoids
> > specifying a semantics.
>
> 1) Some people claim that the difference between a hyperlink and an entity
> declaration/reference is that there is something voluntary or contingent
> about a hyperlink while an entity declaration/reference expresses a
> more fixed and necessary relationship. If we accept that for a second,
> then I think Andrew's comments are clearer: the schema nominated
> in the namespace declaration is not "specified" rather it is "identified".

since all of this only matters anyway in the context of validation and
entity/attribute defaults (in general, when the intent is to ascribe a
behaviour which is specified elsewhere - otherwise you could go ahead and
name your entities anything) the only time it matters whether you have an
unambiguous name, or not, is when the intent of the "identification" is to
"specify".

> ...
>
> So namespace PI is more like a hypertext link rather than an entity
> reference.

which link has to be guaranteed to lead to a unique location should you
follow it.

> ...
>
> 2) I think the term "scope" shouldn't be used here: all namespace PIs have
> scope over the entire document, not over particular entities.

that's the problem. if they have indefinite scope, then you get
ambiguities. if they have dynamic scope, then you don't. to be more
complete:

1. the schema identified as a namespace source has indefinite extent. that
   is the entities (elements, attributes, pe's(? not sure), (ge's),
   notations, ...) defined within it are valid within the process which
   references the schema.
2. the namespace name (the binding of the name to the namespace-region of
   schema) has indefinite extent. (ie. the universal names are universal -
   and not just within the document, but within the process)
3. the prefix name (the binding of the prefix to the namespace region of
   the schema) has dynamic extent. that is, within its physical entity and
   any entities referenced from there.
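
The distinction between indefinite and dynamic extent in the three points above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not anything taken from the working draft: the class name PrefixScopes, its methods, and the "#" used to join a namespace name to a local part are all hypothetical. Universal names are plain strings that stay meaningful for the whole process, while each prefix binding lives only as long as the physical entity that declared it (and any entities read from inside it).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: prefix bindings with dynamic extent over nested entities. */
class PrefixScopes {
    // One frame of prefix -> namespace-name bindings per physical entity being read.
    // push() adds at the head, so iteration visits the innermost entity first.
    private final Deque<Map<String, String>> frames = new ArrayDeque<>();

    /** Call when the processor starts reading a physical entity. */
    void enterEntity() {
        frames.push(new HashMap<>());
    }

    /** Call when the entity has been read; the bindings it declared end here. */
    void leaveEntity() {
        frames.pop();
    }

    /** Records a namespace declaration found in the entity currently being read. */
    void declare(String prefix, String namespaceName) {
        if (frames.isEmpty()) {
            throw new IllegalStateException("declare() called outside any entity");
        }
        frames.peek().put(prefix, namespaceName);
    }

    /**
     * Maps prefix + local part to a universal name, looking in the current
     * entity first and then in the entities that reference it.
     * Returns null when the prefix is not bound anywhere in that chain.
     */
    String resolve(String prefix, String localPart) {
        for (Map<String, String> frame : frames) {
            String namespaceName = frame.get(prefix);
            if (namespaceName != null) {
                // The "#" separator is an arbitrary choice for this sketch.
                return namespaceName + "#" + localPart;
            }
        }
        return null;
    }
}
```

With this arrangement a referenced entity may rebind a prefix such as "cml" for its own content without disturbing the document that references it, which is the clash the thread is worried about. Whether a real processor should behave this way is exactly the open question here.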
> And an element > type name without a prefix has no binding to a schema using the namespace > mechanism (it still could use the standard XML markup declarations, or > architectural forms, or ICADD fixed attributes, or other home-made systems > though.) then it has a binding. that it is implicit does not make it go away. > > > 3)... > > With the wd-namespace PI the number of possible things that can clash > is reduced to the prefixes themselves and to unqualified names. with the correct scoping and default rules, it should be possible to eliminate these clashes. [does anyone have a denotational definition for xml wrt the dom? it would make a discussion like this so much easier...] bye for now, xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From john at datachannel.com Mon Apr 6 21:51:02 1998 From: john at datachannel.com (John Tigue) Date: Mon Jun 7 17:00:25 2004 Subject: Controlling link destination DOCTYPE Message-ID: <019701bd6195$defbfec0$720a1bac@gigue.datachannel.com> The below document asks the question "how do you tell the processor that the destination of a link must be a document complying to DTD X?" ]> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Mon Apr 6 22:31:35 1998 From: eliot at isogen.com (W. 
Eliot Kimber) Date: Mon Jun 7 17:00:25 2004 Subject: Controlling link destination DOCTYPE Message-ID: <3.0.32.19980406152958.00727844@swbell.net> At 12:54 PM 4/6/98 -0700, John Tigue wrote: > Is there any way (XML, HyTime, etc.) to declare that an ENTITY > attribute is not only an ENTITY attribute but that the value > must be of a particular declared NOTATION? That is, the > entity must be declared to have a particular NDATA? Or is this > simply application level constraints not parser level? HyTime provides a partial solution: reference type control. Very simply, you can associate referential attributes (e.g., an ENTITY attribute) with a list of element types that an attribute can address. You use the "reftype" attribute, like so: This simply says "the element addressed by the docLinkingTo attribute must be a myBarDoc element". Note that, by HyTime (and XPointer) rules, a reference to a document is shorthand for a reference to its root element. [the HyTime and HyNames attributes are there to complete the architectural mapping from the element to the clink element type defined by the HyTime architecture--they enable a HyTime-aware processor to recognize that the reftype attribute is the one defined by HyTime.] However, this doesn't really solve John's problem, which is to constrain the reference to documents that use a particular set of element type declarations, not the element type of the document element (which could vary depending unless the declarations declare exactly one element type). There is no HyTime facility for doing this (although perhaps there should be). However, you could define your own application convention for doing it: Where "doctype-constraint" is like reftype except that the second parameter is the public ID of the external DTD subset of the document referenced by the docLinkingTo attribute, not the element type. 
Of course, the external declaration subset is not reliable as no document
need have one (even if it uses the same declarations explicitly by copying
them into the internal subset or using a normal external parameter
entity). What you really need is an "architecture use constraint" that
requires that the referenced document be derived from a specific SGML
architecture:

In both cases, the constraint checking can be easily implemented using any
processor with access to the DOCTYPE properties or architecture use
declaration (either in PI form, as would be used for XML documents, or
notation declaration form, as can be used with SGML documents). Even if
not implemented, it conveys the author's intent about the constraints on
the reference.

In neither case would the parser enforce these constraints--these are
processor-specific semantic constraints, not parse-time constraints (in
other words, you parse the document and then process it to resolve links
and addresses, at which time you would check and enforce these
constraints).

Cheers,

Eliot
--
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
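
The post-parse check described above (compare the public ID of the target's external DTD subset against the one named in a "doctype-constraint") might look roughly like the following in Java. This is a sketch under assumptions rather than anything HyTime or XML defines: the class DoctypeConstraint and its method names are hypothetical, and it relies on the SAX2 LexicalHandler extension, which postdates this thread. It reads only the prolog of the referenced document and stops as soon as the DOCTYPE (or, failing that, the first element) has been reported.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical application-level check of a link target's DOCTYPE public ID. */
class DoctypeConstraint {

    /** Thrown by the handler to end parsing once the prolog has been seen. */
    private static final class PrologDone extends SAXException {
        PrologDone() {
            super("prolog read");
        }
    }

    private static final class PrologHandler extends DefaultHandler2 {
        String publicId; // public ID from the DOCTYPE declaration, or null

        @Override
        public void startDTD(String name, String publicId, String systemId)
                throws SAXException {
            this.publicId = publicId;
            throw new PrologDone(); // stop as soon as the DOCTYPE is reported
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) throws SAXException {
            throw new PrologDone(); // document has no DOCTYPE declaration
        }
    }

    /** Returns the DOCTYPE public ID of the target, or null if it has none. */
    static String doctypePublicId(InputSource target)
            throws SAXException, IOException, ParserConfigurationException {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        PrologHandler handler = new PrologHandler();
        reader.setContentHandler(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        try {
            reader.parse(target);
        } catch (PrologDone expected) {
            // normal early exit: everything needed has been recorded
        }
        return handler.publicId;
    }

    /**
     * True if the document text declares the required public ID.
     * A target that cannot be read is treated as failing the constraint.
     */
    static boolean satisfies(String documentText, String requiredPublicId) {
        try {
            InputSource source = new InputSource(new StringReader(documentText));
            return requiredPublicId.equals(doctypePublicId(source));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }
}
```

As noted above, a check like this says nothing about documents that copy the same declarations into their internal subset, so it is best read as a statement of the author's intent that an application may choose to enforce after parsing.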
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rtennant at library.berkeley.edu Tue Apr 7 00:51:07 1998 From: rtennant at library.berkeley.edu (Roy Tennant) Date: Mon Jun 7 17:00:25 2004 Subject: When is an attribute an attribute? Message-ID: I've been trying to figure this out for a while with no success. It seems to me that there are several quite different ways one can encode information in XML. Are all of the following correct? When and why would you choose one over another? Does it matter? Thank you for your indulgence as I puzzle out what must surely be readily apparent to most of you. Example 1: --------- Example 2: --------- The Call of the Wild Example 3: --------- The Call of the Wild London, Jack Thanks, Roy Tennant xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From daniela at cnet.com Tue Apr 7 01:13:23 1998 From: daniela at cnet.com (Daniel B. Austin) Date: Mon Jun 7 17:00:25 2004 Subject: When is an attribute an attribute? In-Reply-To: Message-ID: <199804062312.QAA10316@central.cnet.com> Hi, All three of your examples below are well-formed. The decision as to whether properties of document objects are to be encoded as attributes or as element content is up to you; there is no clear cut answer. 
(You might note that your example #2 below does not provide as much
information as examples #1 & 3, because it does not specify that the
element's content is a title...it could be anything.)

Here are some considerations that may inform your decisions regarding
attributes and elements:

a) Does the document property relate to the structure of the document?
   If yes then an element would provide better use.
b) Are your target documents going to be large in terms of file size?
   If so, an attribute might be a better choice.
c) Is the processor/display device you are using better or faster at
   parsing one or the other?
d) Does the property apply to many elements in your document? i.e. in
   book.xml the title might only show up once, or once at the bottom of
   each page.
e) Does the author find it easier to add an element or an attribute, or
   does it matter?

In general I would make the case that properties that are used often and
are non-structural in nature would be best defined as attributes and
others as elements.

Regards,

D-

At 03:51 PM 4/6/98 -0700, you wrote:
>I've been trying to figure this out for a while with no success. It seems
>to me that there are several quite different ways one can encode
>information in XML. Are all of the following correct? When and why would
>you choose one over another? Does it matter? Thank you for your indulgence
>as I puzzle out what must surely be readily apparent to most of you.
>
>Example 1:
>---------
>
>
>
>Example 2:
>---------
>
>The Call of the Wild
>
>Example 3:
>---------
>
>
> The Call of the Wild
> London, Jack
>
>
>Thanks,
>Roy Tennant

Daniel Austin
daniela@cnet.com
Director of Development, Corporate Creative Services
CNET: The Computer Network
(415) 395-7800 x1438
"To change the old into the new, and the shapes of things to come..."

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murray at muzmo.com Tue Apr 7 01:15:50 1998 From: murray at muzmo.com (Murray Maloney) Date: Mon Jun 7 17:00:25 2004 Subject: When is an attribute an attribute? In-Reply-To: Message-ID: <3.0.1.32.19980406191209.0070f5b0@pop.uunet.ca> There is no real "right" way to encode something like your example. Any of the examples that you have offered is just as likely as the other, and an application that works with any of them is just as likely to succeed in meeting its objectives. However, if you wanted to distinguish between a family and given name, and maybe add an honorific or an accreditation, you might want to use an element with subelements for the author. Using a comma in the name requires a second-level parse. An advantage of using nested subelements is that you can avoid a second level parse. Otherwise, as I said, there is no "right" answer. At 06:51 PM 4/6/98 -0400, Roy Tennant wrote: >I've been trying to figure this out for a while with no success. It seems >to me that there are several quite different ways one can encode >information in XML. Are all of the following correct? When and why would >you choose one over another? Does it matter? Thank you for your indulgence >as I puzzle out what must surely be readily apparent to most of you. > >Example 1: >--------- > > > >Example 2: >--------- > >The Call of the Wild > >Example 3: >--------- > > > The Call of the Wild > London, Jack > > >Thanks, >Roy Tennant > > >xml-dev: A list for W3C XML Developers. 
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Murray Maloney           Email: murray@muzmo.com
Technical Director       Phone: (905) 509-9120
Veo Systems              Fax:   (905) 509-8637
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Make a Tax-Deductible Donation
Yuri Rubinsky Insight Foundation
http://www.yuri.org/donate.html

From pierlou at CAM.ORG Tue Apr 7 02:33:25 1998
From: pierlou at CAM.ORG (Pierre Morel)
Date: Mon Jun 7 17:00:25 2004
Subject: Announcement: A DTD/XML editor / viewer
Message-ID: <01bd61bc$2fc24a50$01dcdcdc@pc-010>

Take a look at a Java DTD/XML editor. An alpha version (free, of course) is available for download. I hope it will be useful. It comes in two languages (English and French) and can easily be customised for any other language. Give it a try. Visual XML is itself an XML application.

It can be downloaded at: http://www.pierlou.com/visxml

I am waiting for your comments, corrections and suggestions.

Truly,
Pierre Morel

Visual XML home: http://www.pierlou.com/visxml
Proto home: http://www.pierlou.com/prototype

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980407/f1e5f572/attachment.htm

From bckman at ix.netcom.com Tue Apr 7 02:35:23 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:25 2004
Subject: When is an attribute an attribute?
Message-ID: <01bd61d6$94f016e0$23addccf@uspppBckman>

Apart from the wrong type of slash in example 1, a typo I'm sure, they are all legal in XML. Which you use, I guess, is up to you.

Frank

-----Original Message-----
From: Roy Tennant
To: xml-dev@ic.ac.uk
Date: Monday, April 06, 1998 3:54 PM
Subject: When is an attribute an attribute?

>I've been trying to figure this out for a while with no success. It seems
>to me that there are several quite different ways one can encode
>information in XML. Are all of the following correct? When and why would
>you choose one over another? Does it matter? Thank you for your indulgence
>as I puzzle out what must surely be readily apparent to most of you.
>
>Example 1:
>---------
>
>
>
>Example 2:
>---------
>
>The Call of the Wild
>
>Example 3:
>---------
>
>
>  The Call of the Wild
>  London, Jack
>
>
>Thanks,
>Roy Tennant
From cbullard at hiwaay.net Tue Apr 7 02:38:36 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:00:25 2004
Subject: When is an attribute an attribute?
References:
Message-ID: <352974B8.2964@hiwaay.net>

Roy Tennant wrote:
>
> I've been trying to figure this out for a while with no success. It seems
> to me that there are several quite different ways one can encode
> information in XML. Are all of the following correct?

Yes.

> When and why would
> you choose one over another? Does it matter? Thank you for your indulgence
> as I puzzle out what must surely be readily apparent to most of you.

Ok, a DTD really helps this sort of discussion along, but FWIW:

> Example 1:
> ---------
>
>

Use empty elements and attributes for tag bags, basically, if the datum has no frequency and order requirements (only occurs once somewhere in the attribute list). NOTE: I haven't looked to see if XML dropped the SGML restriction on repeated values in attlist decls.

> Example 2:
> ---------
>
> The Call of the Wild

Use this if you don't care that the string inside the tags is only differentiated by the BOOK, that is, semantically, there is no difference between this and

Love that Wolf!!

or IOW, your application has to know that is a title.

> Example 3:
> ---------
>
>
>   The Call of the Wild
>   London, Jack
>

Use this when it is important to know there is a title and author (i.e., this BOOK HAS-A TITLE, HAS-A AUTHOR; the string, The CALL of the WILD IS-A TITLE). Given the element type declaration, you can tell which order they should come in, are there multiple authors, are there alternate titles, etc.

The semantic is application dependent.
For a linking semantic, you might be counting nodes inside the BOOK. For rendering, you might be assigning the font value based on the context of the book element.

len

From jamsden at us.ibm.com Tue Apr 7 02:48:36 1998
From: jamsden at us.ibm.com (Jim Amsden)
Date: Mon Jun 7 17:00:26 2004
Subject: When is an attribute an attribute?
Message-ID: <5040100016970517000002L072*@MHS>

I think it's best to treat this as an object modeling problem first, and then an XML representation. The distinction between attribute and content element then becomes the distinction between an attribute and a containment relationship with another object. Object attributes are atomic, referentially transparent characteristics of an object that have no identity of their own. Generally this corresponds to primitive data types, but this can be somewhat arbitrary too (e.g., Strings, Date, etc.).

Taking a more logical view, an attribute names some characteristic of an object that models part of its internal state, and is not considered an object in its own right. That is, no other objects have relationships to an attribute of an object, but rather to the object itself.

So if the thing you want to capture has internal structure of its own, or can be referenced through a link, or can be contained in more than one element, then it's an element, otherwise it's probably an attribute.

Note that attributes have a number of advantages over content elements:

1. they can have names that indicate the role the value plays in the element.
   Element contents have content names, but there is no way to say what role the content plays in any particular element that contains it.
2. attributes can have default values.
3. attributes have (minimal) data types
4. attributes take up less space as there is no need for an end tag
5. attributes are easier to access in DOM.

There are also some disadvantages:

1. attributes aren't as convenient for large values, or binary entities.
2. values containing quotes can be a bother.
3. attributes can't contain other elements. This isn't really a disadvantage, but part of what it means to be atomic.
4. white space can't be ignored in an attribute.

My recommendation is to use attributes unless you can't, and certainly use them to avoid mixed data content in elements whenever possible. The idea is to encapsulate as much as you can in an individual object but not too much. Use the principles of data normalization, they work fine here too.

From ricko at allette.com.au Tue Apr 7 03:12:47 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:26 2004
Subject: When is an attribute an attribute?
Message-ID: <005101bd61c2$7e02bbe0$a30b4ccb@NT.JELLIFFE.COM.AU>

From: Jim Amsden

> I think it's best to treat this as an object modeling problem first, and then
> an XML representation

Without going against object modeling or any other view, you should first be aware of any constraints in XML (Len's comment) and in your immediate software.

If your editing software makes attributes easy, then use attributes. If your rendering/draft software does not support attributes well, use elements.
You can do a simple translation of your DTD and document to convert from one form to another anyway. A DTD does not need to be set in stone.

A markup language has to reconcile data modeling needs and human factors, with the latter being the most important.

Rick Jelliffe

From barrym at halcyon.com Tue Apr 7 03:32:19 1998
From: barrym at halcyon.com (Barry MacKichan)
Date: Mon Jun 7 17:00:26 2004
Subject: msxsl bug?
Message-ID: <002101bd61c4$ce445170$332842ce@software_xlnc>

I've been working with Microsoft's msxsl program, to apply their xsl style sheets to xml documents to produce an html file. Is the following a bug, or do I not understand xsl?

My xml file is:

Line 1
Line 2
Line 3
Another Line 2
Another Line 1

My xsl file is using the element to select the and elements first, and then to select the elements. The problem is that apparently I can select only the elements OR the elements, but not both. I am inferring that it is possible to select multiple kinds of elements, from the way selection for rules works.

If it is not possible to select multiple kinds of elements, then Microsoft's claim that you can use xsl to re-order the children of an element seems pretty weak: you can only do it when there are only two kinds of children in an element.

My xsl file is:

//***** here is the problem ******
//***** end of problem area

I would expect output to look like:

Line 1
Line 2
Another Line 2
Another Line 1
Line 3

but I get

Line 1
Another Line 1
Line 3

That is, the select-elements with multiple criteria ignores all the criteria but the last.

Can anybody set me straight?

TIA
--Barry MacKichan

From ak117 at freenet.carleton.ca Tue Apr 7 05:21:06 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:26 2004
Subject: When is an attribute an attribute?
In-Reply-To:
References:
Message-ID: <199804070320.UAA00399@unready.microstar.com>

Roy Tennant writes:

 > I've been trying to figure this out for a while with no success. It
 > seems to me that there are several quite different ways one can
 > encode information in XML. Are all of the following correct? When
 > and why would you choose one over another? Does it matter? Thank
 > you for your indulgence as I puzzle out what must surely be readily
 > apparent to most of you.

It's not self-evident, and everyone has their own strongly-held opinions. Database people are tempted to force everything into attributes, because attributes are (slightly) typed while character data is not.
Generally, though, you need to consider the following:

- attribute values are harder to search for in search engines

- attribute values often don't appear on the screen in editing tools (you have to open a special dialog or popup to see them)

- attribute values can have no substructure

- attribute values can be slightly more awkward to access in processing APIs

- attributes are unordered, so there is no standard way to specify that one attribute's value should precede the other's (there is no guarantee that an API will give you the attributes in the same order that you specified them)

My rule is to use attributes in markup just as I would use footnotes or endnotes in a book -- to provide extra information that is not part of the main content, but that is useful to know about it. By this rule, all of your examples are correct, but under different circumstances.

 > Example 1:
 > ---------
 >
 >

In this case, all that really matters is that there's a book there. An XML document author might see in the main editing window, but get the attribute values in a pop-up only by clicking the mouse. It's not essential to know the book's title or author, and it is unlikely that anyone would want to search for it.

Yes: insurance company list of property to be replaced; customs list of objects declared at border

No: online bookstore; library catalogue

 > Example 2:
 > ---------
 >
 > The Call of the Wild

In this one, the title matters but the author is just extra information. You'd probably use this for encoding a title inline, where the title will be printed as part of the paragraph (possibly in italics), but the author's name would appear only in a separate index or popup.

I enjoyed the book The Call of the Wild.

 > Example 3:
 > ---------
 >
 >
 >   The Call of the Wild
 >   London, Jack
 >

In this one, both the title and author are important -- you'd use this for the citation line of a quotation, in a bibliography, at an online bookstore, or in a library catalogue.

I hope this helps.
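
[A minimal sketch of how the three encodings look to a program, using Python's standard xml.etree.ElementTree. The markup of Roy's examples did not survive in this archive, so the three strings below are hypothetical reconstructions: the BOOK, TITLE and AUTHOR names are taken from the replies in this thread, and the attribute spelling is an assumption.]

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstructions of the three encodings under discussion.
# The original markup was lost from the archive; names are assumptions.
EXAMPLE_1 = '<BOOK TITLE="The Call of the Wild" AUTHOR="London, Jack"/>'
EXAMPLE_2 = '<BOOK AUTHOR="London, Jack">The Call of the Wild</BOOK>'
EXAMPLE_3 = (
    "<BOOK>"
    "<TITLE>The Call of the Wild</TITLE>"
    "<AUTHOR>London, Jack</AUTHOR>"
    "</BOOK>"
)


def describe(xml_text):
    """Return (title, author) whichever way the document encodes them."""
    book = ET.fromstring(xml_text)

    # Attributes are looked up by name; XML gives them no defined order.
    title = book.get("TITLE")
    author = book.get("AUTHOR")

    # Child elements carry their own names, so the content describes itself.
    if title is None:
        title = book.findtext("TITLE")
    if author is None:
        author = book.findtext("AUTHOR")

    # Bare character data: only the application knows that it is a title.
    if title is None and book.text and book.text.strip():
        title = book.text.strip()

    return title, author


for example in (EXAMPLE_1, EXAMPLE_2, EXAMPLE_3):
    print(describe(example))
```

[All three forms yield the same pair here, which is the point several replies make: the data survives any of the encodings, and the choice mostly decides what editors, search tools and stylesheets can see without help from the application.]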
All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From larsga at ifi.uio.no Tue Apr 7 08:34:14 1998
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:00:26 2004
Subject: Free XML software?
Message-ID:

I'm making a list of all the currently available free XML software. I have a growing list at

If anyone knows of a product not listed there (everything from Robin Cover's and James Tauber's sites is), please send me an email about it so that I can add it to the list.

Thanks!

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock
http://www.stud.ifi.uio.no/~larsga/
http://birk105.studby.uio.no/

From donpark at quake.net Tue Apr 7 09:07:02 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:26 2004
Subject: SAXDOM Updated
Message-ID: <002801bd61fb$3f76d580$2ee044c6@donpark>

SAXDOM has been updated to support the latest DOM specification (03/18/98).
It is now at: http://www.docuverse.com/personal/saxdom.html

The next update will probably come when the SAX API itself is updated.

Happy trail,

Don Park
http://www.docuverse.com/personal/index.html

From Leif.Jonsson at era-t.ericsson.se Tue Apr 7 11:17:23 1998
From: Leif.Jonsson at era-t.ericsson.se (Leif Jonsson)
Date: Mon Jun 7 17:00:26 2004
Subject: Mixed content
Message-ID: <3529EEBE.7CE8245@era-t.ericsson.se>

Hello!

My name is Leif Jonsson and I am a computer science student from Sweden. I have two basic questions, and I apologize if they are very obvious, but I haven't found a clear answer to them in any FAQ, so I ask in this forum.

1. Why are the restrictions on MIXED CONTENT as they are, that is, no order between the elements is allowed and you cannot specify the number of times they are to appear?

2. With the question above in mind, I assume there is some theoretical motivation for these restrictions, but how are you then supposed to achieve the above desired effects without adding "wrapper elements"? Is there some restructuring you can do?

Again I apologize if these are stupid questions, but at least then they should have short answers. =)

Highest regards and greetings...

-Leif Jonsson
Leif.Jonsson@era-t.ericsson.se
From tug at wilson.co.uk Tue Apr 7 13:54:35 1998
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:00:26 2004
Subject: DTD versioning
Message-ID: <04e001bd621b$cf698500$0a01d30a@bach.wilson.co.uk>

I'm very new to this XML stuff, so perhaps I'm missing the point. Any advice gratefully received.

I'm principally interested in using XML to pass data between computer programs. One of the problems I foresee is that I may well want to extend or change the DTD which represents the data exchanged as time goes by. However, I will have installed code which expects to see the data as described by a previous version of the DTD.

I would expect the DTD spec to deal with this in one of three ways:

1/ To have a set of rules that specifies what changes or additions I can make to a DTD without breaking the original DTD (for example, adding an attribute or a default value to an existing attribute) and to have an automated way of checking that I've applied these rules.

2/ A way of specifying that the new DTD extends the old.

3/ A Relational database view type mechanism that gives a set of simple mechanical transformations that turns the data which corresponds to one DTD to another.

In case 1/ the new DTD would just say that it is a valid superset of the old DTD, in case 2/ the new DTD would specify the old and just contain the changes and additions, in case 3/ the DTD would refer to the view to be applied to transform the data to a form which is defined by the old DTD.
I would expect the XML parser to serve up the data as described by the old DTD and that it (the parser) would suppress attributes, apply transformations, etc so that the data would appear in this form.

I suspect that I can do some or all these things with XSL, but that's really hitting a small problem with a very big hammer.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From Per-Ake.Ling at uab.ericsson.se Tue Apr 7 14:58:38 1998
From: Per-Ake.Ling at uab.ericsson.se (Per-Ake Ling)
Date: Mon Jun 7 17:00:26 2004
Subject: DTD versioning
Message-ID: <199804071258.OAA07134@uabs19c27.eua.ericsson.se>

> From: "John Wilson"
...[snip]

Although you seem mainly interested in using XML to communicate information between programs, the problem you outline has been very relevant for the SGML community, dealing with documents.

> 1/ To have a set of rules that specifies what changes or additions I can
> make to a DTD without breaking the original DTD(for example, adding an
> attribute or a default value to an existing attribute) and to have an
> automated way of checking that I've applied these rules.
>

When we upgrade DTDs we do adhere to a certain set of rules; unfortunately we do not have automated support but do the checking manually. As far as I know there are no tools that specifically do this kind of job, but using e.g. the DTD reports in psgml helps a lot.

> 2/ A way of specifying that the new DTD extends the old.
>

We always put a product id and a revision in the public identifier for the DTD. If the new DTD is strictly upwards compatible as in (1) above, we only change the revision, else we assign a new product id. The criterion for only changing the version number is that all old documents can be parsed with the new DTD without any errors. Another way of recording this info is using major/minor numbers on the version to signal compatible or incompatible changes.

> 3/ A Relational database view type mechanism that gives a set of simple
> mechanical transformations that turns the data which corresponds to one DTD
> to another.
>

Every time we create a new DTD, we also create a filter for converting old documents to the new DTD by using an SP-based toolkit. In the trivial case of strictly upwards compatible changes, the filter is simply changing the public id for the external subset.

> In case 1/ the new DTD would just say that it is a valid superset of the old
> DTD, in case 2/ the DTD new DTD would specify the old and just contain the
> changes and additions, in case 3/ the DTD would refer to the view to be
> applied to transform the data to a form which is defined by the old DTD.
>
> I would expect the XML parser to serve up the data as described by the old
> DTD and that it (the parser) would suppress attributes, apply
> transformations, etc so that the data would appear in this form.
>

We do the opposite: the tools that read the data are updated and then the filters consistently serve up the data in the new form, irrespective of its 'real' version.

> I suspect that I can do some or all these things with XSL, but that's really
> hitting a small problem with a very big hammer.

I consider this problem to be the other way around: using XSL would be using a very small hammer for a very large nail.
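
[A minimal sketch of the "trivial case" filter described above, in which only the public identifier of the external subset changes. This is not the SP-based toolkit mentioned in the post; it is a plain-Python illustration, and the two public identifiers and the document element name "report" are hypothetical.]

```python
import re

# Hypothetical identifiers: same product id, revision bumped from R1A to R1B.
OLD_PUBLIC_ID = "-//Example//DTD Report R1A//EN"
NEW_PUBLIC_ID = "-//Example//DTD Report R1B//EN"

# Matches: <!DOCTYPE name PUBLIC "public-id"   (single or double quotes)
DOCTYPE_PUBLIC = re.compile(r"(<!DOCTYPE\s+\S+\s+PUBLIC\s+)([\"'])(.*?)\2")


def retarget_doctype(document, old_id, new_id):
    """Rewrite the DOCTYPE public identifier if it equals old_id.

    Documents that declare any other identifier are returned unchanged,
    so the filter can be run safely over a mixed collection.
    """
    def swap(match):
        prefix, quote, current = match.group(1), match.group(2), match.group(3)
        if current != old_id:
            return match.group(0)
        return prefix + quote + new_id + quote

    return DOCTYPE_PUBLIC.sub(swap, document, count=1)


old_doc = (
    '<!DOCTYPE report PUBLIC "-//Example//DTD Report R1A//EN" "report.dtd">\n'
    "<report/>"
)
print(retarget_doctype(old_doc, OLD_PUBLIC_ID, NEW_PUBLIC_ID))
```

[Anything beyond a strictly upwards-compatible change needs a real structural conversion, which a text substitution like this cannot provide.]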
In general, the DTD identification and versioning is very complex, especially since the DTD that is valid in a particular document does not really have an id since it is the sum of the internal and external subsets and could very well be unique for every document (although unlikely).

Probably the solution lies with architectural forms, which although limited are quite adequate for many real-world problems. Eliot Kimber has argued for using AFs to control your data in a safe but extensible way, and I believe that is the only safe standard way that is currently reasonable to implement. (SP already has support for AF).

Regards,
Per-Åke

--
Per-Åke Ling (note: Per-Åke, transliteration Per-Ake)
email: Per-Ake.Ling@uab.ericsson.se      phone: +46 8 727 5674
Ericsson Utvecklings AB                  mobile: +46 70 790 2446
AXE Research and Development             fax: +46 8 727 3463

From tyler at infinet.com Tue Apr 7 16:18:57 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:00:26 2004
Subject: When is an attribute an attribute?
References: <3.0.1.32.19980406191209.0070f5b0@pop.uunet.ca>
Message-ID: <352A367E.B4FCEF1C@infinet.com>

I asked the same question about 4 months ago concerning using attributes vs. elements on this list and got some interesting answers. In that time I have found that for modeling objects a few principles come to mind.
If you are modeling an object which will never change at all (like a Rectangle) then you would be best to do something like this:

The rationale for this approach over using elements is that in most XML processors you will get all of the attribute values at once that are necessary for generally immutable objects like Rectangles. In a particular application of mine I found that I would call setBounds() in java.awt.Component 4 times using the element approach vs. only once with the attributes approach.

If you are representing something whose type may evolve over time, like a user profile in a database, then the element approach I feel works better in the long run...

Tyler

From murray at muzmo.com Tue Apr 7 17:33:04 1998
From: murray at muzmo.com (Murray Maloney)
Date: Mon Jun 7 17:00:26 2004
Subject: When is an attribute an attribute?
In-Reply-To: <352A367E.B4FCEF1C@infinet.com>
References: <3.0.1.32.19980406191209.0070f5b0@pop.uunet.ca>
Message-ID: <3.0.1.32.19980407112439.006b5e40@pop.uunet.ca>

At 10:21 AM 4/7/98 -0400, Tyler Baker wrote:
>If you are modeling an object which will never change at all
>(like a Rectangle) then you would be best to do something like this:
>
>
>

This is a very good example of when attributes are optimal. In this case, the attributes are object properties, rather than children of the object.

Even so, a RECTANGLE element could use containment to better advantage for cases where there are many, possibly disjoint name/value pairs or collections.

00 7in9in floral.jpeg gold blue
...
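
[A minimal sketch of Tyler's point that attributes arrive "all at once", written in Python rather than the Java of his application. The RECTANGLE markup in this thread was lost from the archive, so both forms below are hypothetical; the names x, y, width and height are assumptions.]

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical markup: the same rectangle as attributes and as child elements.
AS_ATTRIBUTES = '<RECTANGLE x="0" y="0" width="7" height="9"/>'
AS_ELEMENTS = (
    "<RECTANGLE><x>0</x><y>0</y><width>7</width><height>9</height></RECTANGLE>"
)

FIELDS = ("x", "y", "width", "height")


def count_updates(xml_text):
    """Parse incrementally and count the events that deliver bounds data.

    Returns (bounds, updates). One update stands for one point at which an
    application would have to touch the object, e.g. call setBounds().
    """
    bounds = {}
    updates = 0
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start" and elem.tag == "RECTANGLE":
            # Every attribute is already available when the start tag is seen.
            found = {k: int(v) for k, v in elem.attrib.items() if k in FIELDS}
            if found:
                bounds.update(found)
                updates += 1
        elif event == "end" and elem.tag in FIELDS:
            # Each child element delivers one value, in its own event.
            bounds[elem.tag] = int(elem.text)
            updates += 1
    return bounds, updates


print(count_updates(AS_ATTRIBUTES))
print(count_updates(AS_ELEMENTS))
```

[The bounds come out identical either way; the attribute form delivers them in a single event, the element form in one event per value, which matches the one call versus four calls that Tyler reports.]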
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Murray Maloney           Email: murray@muzmo.com
Technical Director       Phone: (905) 509-9120
Veo Systems              Fax:   (905) 509-8637
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Make a Tax-Deductible Donation
Yuri Rubinsky Insight Foundation
http://www.yuri.org/donate.html

From liamquin at interlog.com Tue Apr 7 18:23:26 1998
From: liamquin at interlog.com (Liam Quin)
Date: Mon Jun 7 17:00:26 2004
Subject: DTD versioning
In-Reply-To: <04e001bd621b$cf698500$0a01d30a@bach.wilson.co.uk>
Message-ID:

John Wilson asked about DTD versioning.

There was some work done at OCLC on software to manipulate DTDs; try the Research section of www.oclc.org.

Where I foresee this being important, I generally take one of two different tactics:

[1] include a DTDrev attribute on the outermost "doctype" element, and make it #REQUIRED. Now every document must contain this to be valid, and I can therefore find documents that may need changing if the DTD's major revision changes.

[2] include the DTD version in the filename, SYSTEM and/or PUBLIC identifier of the DTD. It's required in practice in a PUBLIC identifier for SGML, since resolving (fetching) the same PUBLIC identifier twice must always result in the same information. In XML, support for PUBLIC identifiers is optional, so it's best to avoid them and use only SYSTEM identifiers, but you can still include a version number there.
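
[A minimal sketch of tactic [1] from the receiving program's side, in Python. The DTDrev attribute name comes from the post; the "major.minor" value format, the element name "doc" and the SUPPORTED_MAJOR constant are assumptions made for illustration. ElementTree does not validate, so the #REQUIRED rule is checked by hand here.]

```python
import xml.etree.ElementTree as ET

# Hypothetical: the major DTD revision this program was written against.
SUPPORTED_MAJOR = 2


def dtd_revision(xml_text):
    """Return (major, minor) read from the root element's DTDrev attribute."""
    root = ET.fromstring(xml_text)
    rev = root.get("DTDrev")
    if rev is None:
        # A validating parser would reject this document against the DTD;
        # a non-validating one lets it through, so check explicitly.
        raise ValueError("root element has no DTDrev attribute")
    major, _, minor = rev.partition(".")
    return int(major), int(minor or 0)


def needs_conversion(xml_text):
    """True if the document was written against a different major revision."""
    major, _minor = dtd_revision(xml_text)
    return major != SUPPORTED_MAJOR


print(needs_conversion('<doc DTDrev="1.3"/>'))
print(needs_conversion('<doc DTDrev="2.0"/>'))
```

[A scan like this finds the documents that may need changing after a major revision; whether they can be converted mechanically is a separate question.]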
For SGML, it's actually a fairly difficult problem to compute compatible DTD changes, especially in the face of the rules about inclusions and how they affect the interpretation of whitespace. XML does not have these complexities.

Nonetheless, a mechanical change is often the smallest part of the effort, because SGML has in the past been used mostly for representing documents rather than for database interchange. In documents, especially in the _descriptive_ or transcriptional use of SGML, well, an example: if I previously had in my DTD

and I now decide that some italic words are really people, so I add

I must now inspect every or to see if perhaps a person's name has been put there, and change it. If I could do that automatically, I probably would have tagged it like that in the first place :)

Not making existing documents syntactically invalid is one thing; not making them semantically invalid is another.

For database interchange, the semantic issues can be controlled more tightly, I think, and this is an area that could be researched more.

Lee

--
Liam Quin -- the barefoot typographer -- Toronto
lq-text: freely available Unix text retrieval
IRC: discuss XML/SGML/XSL/XLL/DSSSL Mondays irc.technonet.net in #XML
email address: l i a m q u i n, at host: i n t e r l o g dot c o m

From RKoehler at able-inc.com Tue Apr 7 19:33:32 1998
From: RKoehler at able-inc.com (Rich Koehler)
Date: Mon Jun 7 17:00:26 2004
Subject: When is an attribute an attribute?
Message-ID: <30511AC98761D111976F0060082DE6E90A3265@cascade.able-inc.com>

I've become fond of the method that Tim Bray used to distinguish between elements and attributes in his discussion of MCF (http://www.textuality.com/mcf/MCF-tutorial.html). He writes, "...when the property has a simple value like a string, we put that in the content of the element; when the property's value is another object, we put a pointer to it in an attribute value and leave the element describing the property empty."

This allows the creation of a directed linked graph, where objects refer to other objects, and the links can have attributes of their own. In your case it might look like this:

Which allows you to define something like this:

Jack London (206) 555-3423

Where the ID attributes are unique tokens for each object, and the UNIT attributes point to other objects. In this case we see that Jack London is a PERSON, who in the context of the book "The Call of the Wild" is an AUTHOR. Jack may appear in other objects, in other contexts, like:

....

I think RDF will eventually address this. Anyway, that's my personal preference.

Rich

From cbullard at hiwaay.net Wed Apr 8 01:41:51 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:00:26 2004
Subject: When is an attribute an attribute?
References: <30511AC98761D111976F0060082DE6E90A3265@cascade.able-inc.com>
Message-ID: <352AB962.7CF4@hiwaay.net>

Rich Koehler wrote:
>
> I've become fond of the method that Tim Bray used to distinguish between
> elements and attributes in his discussion of MCF
> (http://www.textuality.com/mcf/MCF-tutorial.html).
He writes, "...when > the property has a simple value like a string, we put that in the > content of the element; when the property's value is another object, we > put a pointer to it in an attribute value and leave the element > decribing the property empty." Neat! As others have pointed out, much depends not on the abstraction of the modeling technique, but on the method to be applied to the markup (ie, the application). If I want a tracking system for the person, the pointer techniques are good. If I want to render a title or find all titles, then the explicit element declaration is good. BTW: All of this is why DTDs have worked well for so many years. They are a contract between implementors and systems. The funniest thing I've seen lately is a statement on the Microsoft XML site that XML gets rid of committees who design DTDs in favor of a more "organic" approach. Lots of luck. ;-) len xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elm at arbortext.com Wed Apr 8 03:46:43 1998 From: elm at arbortext.com (Eve L. Maler) Date: Mon Jun 7 17:00:26 2004 Subject: AttValue within XPointer ? In-Reply-To: <98Apr4.121723est.26883@thicket.arbortext.com> Message-ID: <3.0.5.32.19980407214405.00a07500@village.doctools.com> Sorry -- this is a known bug in the example in the spec; there should be quotes (or we should examine the syntax). We'll fix it next time around. Eve At 12:14 PM 4/4/98 -0500, Patrice Bonhomme wrote: > > > >I had a look at the brand new XML XPointer specification and something is not >clear to me. Some of the XPointer examples are not conformant to the XML >Language specification. 
> >I am not sure that an attribute value can start with a digit. And some >examples are: > >child(1,#element,N,2).(1,#element,N,1) >ancestor(1,#element,N,1).(1,DIV) >etc... > >I think they should be written (with the attribute value as a "SkipLit"): > >child(1,#element,N,"2").(1,#element,N,"1") >ancestor(1,#element,N,"1").(1,DIV) > >Could the editors of the XPtr Spec. precise the definition of the NAME >production. > >Thanks a lot. > >Pat. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Wed Apr 8 06:40:12 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:00:26 2004 Subject: When is an attribute an attribute? Message-ID: <005501bd62a8$a037ef50$980b4ccb@NT.JELLIFFE.COM.AU> -----Original Message----- From: len bullard > The funniest thing I've seen lately is a statement > on the Microsoft XML site that XML gets rid of > committees who design DTDs in favor of a > more "organic" approach. Lots of luck. ;-) :-) My book The XML & SGML Cookbook, due out next month, looks at this issue. In particular it gives some basic patterns and considerations that can be used for "rapid prototyping" a document type. Most document types require some rethought after deployment. Very few people actually have much of an idea of what their data contains. Anyway, when you start actually using markup systems you will want to make maximal use of the particular tools you have bought. So even if a DTD was created without any consideration of the software to be used, there is often good reason to enhance the DTD to make best use of the particular capabilities of the appliciations (and to overcome flaws that turn up). 
DTDs made by committees often tend to be rather kitchen-sinkish. But this is better dealt with by dividing them into separate DTDs (especially for front and back matter), which are more manageable, or by introducing "training-wheel" DTDs which won't scare people off, rather than by saying they are over-engineered. Documents and publications are much more complicated than people want to accept: sometimes the only way is for people to learn by being given a simple DTD and then having issues in their documents prove to them that a larger DTD is actually what they require.

"Organic" is an attractive word. Being able to make ad hoc changes to DTDs is great if you are processing them, or if you have a family of documents which are similar but not exactly the same type. SGML systems have suffered in the past because DTD alteration was often a large-scale exercise for gurus. XML is doing good things in making this less difficult. But the idea that XML markup declarations are inherently inflexible, while declaration-less XML allows more "organic" development, is spurious.

One trick SGML people use (this is adapted from Travis and Waldt's book) is to make explicit element types for unaccounted-for elements. This gives you somewhere to park important data in the absence of DTD elements. This kind of flexibility is available in any DTD: you don't need to abandon XML markup declarations to get it. For example, the following declaration is a good basis for such an element type:

... Rover

(Check out the HTML span and div elements too.)

Rick Jelliffe

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From M.H.Kay at eng.icl.co.uk Wed Apr 8 09:45:20 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:00:26 2004
Subject: DTD versioning
Message-ID: <01bd62c2$4d1d8680$1211e391@mhklaptop.bra01.icl.co.uk>

Liam Quin:
> In XML, support for PUBLIC identifiers is optional...

Moreover, in XML public identifiers have no defined semantics (and very little defined syntax).

Mike Kay

From tyler at infinet.com Thu Apr 9 00:44:30 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:00:26 2004
Subject: Handling unknown elements?
Message-ID: <352BFE1D.32BF4F27@infinet.com>

One dilemma I have been trying to figure out with XML is the problem of handling unknown element types and what to do with their children.

For simple tree-based data modeling this is pretty simple: if you come across an unknown element that the application does not understand, you just ignore it and all of its children.

However, what if, as in the case of HTML, an application has mixed content where it understands the <B> tag for boldface text but does not understand the <I> tag for italicized text? The actual character data may be a child of the <I> element in this case.
In case anyone would like to know, I have designed an XML Application framework that for now works fine for tree-based data modeling, but it really will have problems with documents that have all sorts of elements (and their properties) applied to the character content, rather than with tree-based data modeling where you simply have elements as nodes and the leaf nodes have the actual character content stored in them.

The only alternative for documents is to use something like a DOM tree or else an event-based parser. The framework I have designed is pretty much what you could call object based, in the sense that when the parser encounters a start or empty element tag it retrieves its name and asks the current parent element for an element to handle that tag's attributes and content.

Does anyone have any ideas for a solution that could be both object based and document based as well?

I have thought of maybe having an opaque "UNKNOWN" element handler object that would forward all queries for finding child elements to its parent element, but the problem with that is how do you know and tell the application whether a particular tag should be treated as an object-based tag where all of its children should certainly be ignored, or whether you should simply join all of its children (symbolically) to the "UNKNOWN" tag's parent tag.

I know this might seem a little convoluted but here is what I am trying to say in XML:

Foo
Bar

Using the opaque "UNKNOWN" element it would look like this in tree form if the tag were unknown:

| |
| |
"Foo" "Bar"

Symbolically this could be represented as simply:

| |
"Foo" "Bar"

Which in document format would evaluate to:

|
"FooBar"

However, if I were to do all of this in Object format, any unknown child elements of which in this case happens to be the element would be skipped as well as all of the other sub elements contained in regardless of their type.
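The two behaviours described above (skipping an unknown element together with its subtree, or joining its children to the parent) can be sketched roughly as follows. This is only an illustration in Python, not the framework under discussion: the tag names P, B and I, the KNOWN set and the build function are all assumptions made for the example, since the original markup did not survive in the archive, and a single join_unknown argument stands in for whatever per-element policy a real framework would use.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the element types this application understands.
KNOWN = {"P", "B"}


def build(elem, join_unknown):
    """Return (tag, children) for elem.

    Children are strings (character data) or nested (tag, children) tuples.
    An unknown child is either dropped with its whole subtree
    (join_unknown=False) or replaced by its own content (join_unknown=True).
    """
    children = []
    if elem.text:
        children.append(elem.text)
    for child in elem:
        if child.tag in KNOWN:
            children.append(build(child, join_unknown))
        elif join_unknown:
            # Splice the unknown element's content into the parent.
            children.extend(build(child, join_unknown)[1])
        # Otherwise the unknown element and everything inside it is skipped.
        if child.tail:
            children.append(child.tail)
    return (elem.tag, children)


root = ET.fromstring("<P><B>Foo</B><I>Bar</I></P>")
skipped = build(root, join_unknown=False)
joined = build(root, join_unknown=True)
```

With I treated as unknown, skipped keeps only the B element and its "Foo", while joined also carries "Bar" as character data attached directly to P.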
The only solution I can possibly think of to this dilemma is to have each element object have a boolean flag that tells the XML Application Framework (which happens to be a parser now but could easily be built on top of SAX in 1/2 an hour) whether to ignore unknown child elements or else join the children of unknown child elements as children themselves. Anyone here got any better ideas on this? Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Thu Apr 9 03:12:04 1998 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:00:26 2004 Subject: Handling unknown elements? Message-ID: <01bd6369$7a1bcc80$21addccf@uspppBckman> >>One dilemma I have been trying to figure out with XML is the problem of >>handling unknown element types and what to do with their children. >>For simple tree based data modeling this is pretty simple, if you come >>across an unknown element that the application does not understand, you >.just ignore it and all of its children. XML is designed to work without a DTD, so an element that does not appear in the DTD should probably still be rendered by the user agent. Another source of knowledge about an element can come from a style sheet. If there is no reference to an element in the style sheet or the DTD, the user agent should probably just render it as an inline flow-object. As far as the attributes and their values, if there is no DTD the user agent should probably make an array of the attributes it comes across, so they can be queried. 
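As a rough sketch of collecting attributes when no DTD is available, the following Python fragment records every attribute it meets so that they can be queried afterwards. It uses the SAX bindings in the Python standard library; the AttributeCollector class and collect_attributes function are hypothetical names invented for this illustration, and nothing here is taken from an actual user agent.

```python
import xml.sax


class AttributeCollector(xml.sax.ContentHandler):
    """Record every attribute seen, keyed by element name; no DTD is consulted."""

    def __init__(self):
        super().__init__()
        # element name -> list of {attribute name: value} dicts, one per occurrence
        self.seen = {}

    def startElement(self, name, attrs):
        self.seen.setdefault(name, []).append(dict(attrs.items()))


def collect_attributes(xml_text):
    """Parse a well-formed document and return the attributes found in it."""
    handler = AttributeCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.seen


found = collect_attributes('<doc><item id="a1" kind="x"/><item id="a2"/></doc>')
```

Every value comes back as a plain string, which is all that can be known about an attribute when there are no declarations to give it a type.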
If there is a DTD the user agent should have the option of either validating and reporting an error if the document does not comply, or it may just check for well formedness. It would be nice if it did both. Frank -----Original Message----- From: Tyler Baker To: xml-dev@ic.ac.uk Date: Wednesday, April 08, 1998 3:47 PM Subject: Handling unknown elements? >One dilemma I have been trying to figure out with XML is the problem of >handling unknown element types and what to do with their children. > >For simple tree based data modeling this is pretty simple, if you come >across an unknown element that the application does not understand, you >just ignore it and all of its children. > >However what if like in the case of HTML an application may have mixed >content where it understands the tag for boldface text but not >understand the for italicized text. The actual character data may >be a child of the element in this case. > >In case you anyone would like to know I have designed an XML Application >framework that for now works fine for tree-based data modeling, but it >really will have problems with documents that have all sorts of element >(and their properties) applied to the character content, rather than >with tree-based data modeling where you simply have elements as nodes >and the leaf nodes have the actual character content stored in them. > >The only alternative for documents is to use something like a DOM tree >or else an event based parser. The framework I have designed is pretty >much what you could call object based in the sense that when the parser >encounters a start or empty element tag it retrieves its name and asks >the current parent element for an element to handle that tags attributes >and content. > >Does anyone have any ideas for a solution that could be both object >based, but document based as well? 
> >I have thought of maybe having an opaque "UNKNOWN" element handler >object that would forward all requests queries for finding child >elements to its parent element, but the problem with that is how do you >know and tell the application if a particular tag should be treated as >an object based tag where all of its children should certainly be >ignored, or else you should simply join all of its children >(symbolically) to the "UNKNOWN" tags parent tag. > >I know this might seem a little convoluted but here is what I am trying >to say in XML > > > > Foo > > > Bar > > > >Using the opaque "UNKNOWN" element it would look like this in tree form >if the tag were unknown: > > > | | > > | | > "Foo" "Bar" > >Symbolically this could be represented as simply: > > > | | > "Foo" "Bar" > >Which in document format would evaluate to: > > > | > "FooBar" > >However, if I were to do all of this in Object format, any unknown child >elements of which in this case happens to be the element would >be skipped as well as all of the other sub elements contained in >regardless of their type. > >The only solution I can possibly think of to this dilemma is to have >each element object have a boolean flag that tells the XML Application >Framework (which happens to be a parser now but could easily be built on >top of SAX in 1/2 an hour) whether to ignore unknown child elements or >else join the children of unknown child elements as children themselves. > >Anyone here got any better ideas on this? > >Tyler > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From stever at orbital.co.uk Thu Apr 9 17:32:36 1998
From: stever at orbital.co.uk (Steve Robertson)
Date: Mon Jun 7 17:00:26 2004
Subject: Comments within DTDs
Message-ID: <01bd63cc$fb718140$379559c3@platypus.orbital.co.uk>

I've been authoring some DTDs, and I am growing to like them! (An acquired taste?)

One thing missing from the syntax of ATTLIST declarations is support for comments. Such comments would provide helpful clues when updating old schemas at some time in the future.

Has this been a case of simplicity being given preference over readability (!), or is there a historical reason behind the lack of such provisions?

Steve Robertson.

From crism at ora.com Thu Apr 9 17:48:35 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun 7 17:00:26 2004
Subject: Comments within DTDs
In-Reply-To: <01bd63cc$fb718140$379559c3@platypus.orbital.co.uk> (stever@orbital.co.uk)
Message-ID: <199804091553.LAA16803@geode.ora.com>

[Steve Robertson]
> One thing missing from the syntax of ATTLIST declarations is support
> for comments. This would have provided helpful clues if updating old
> schemas at some time in the future.
>
> Has this been a case of simplicity being given preference over
> readability (!), or is there an historical reason behind the lack of
> such provisions?
It's for simplicity, but I don't feel it affects readability. The way I comment XML DTDs is thus:

HTH,
Chris
--
http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>

From jcupp at essc.psu.edu Thu Apr 9 21:22:52 1998
From: jcupp at essc.psu.edu (J. Cupp)
Date: Mon Jun 7 17:00:26 2004
Subject: When is an attribute an attribute?
References:
Message-ID: <352D4A56.C21E3922@essc.psu.edu>

Roy Tennant wrote:
>
> I've been trying to figure this out for a while with no success.
> When and why would you choose one over another?
>
>
>
> The Call of the Wild
>
>
> The Call of the Wild
> London, Jack
>

I find it useful to think about the kinds of things you will want to do with your new XML documents, like indexing and sharing data with a database. You may also be worried about disk space and the speed of indexing and retrieval. I always try to 1) minimize redundancy in my data and 2) maximize the utility of my data.

Since attributes are more strongly typed, I usually reserve them for unique database keys:

#1 The Call of the Wild

or a text string that I wish to sort by:

#2 The Call of the Wild

If you're not worried about databases then you don't need the ID attribute, but the SORTFORM attribute might come in handy. If you go total database then you could have:

#3

But doing this means you'd lose the ability to search on the book with an indexer (but not with an SQL query). Plus, it's less person-readable.

Personally, I like #1.

--
Jason R.
Cupp (jcupp@essc.psu.edu)
Deasy GeoGraphics
The Pennsylvania State University

From dalapeyre at mulberrytech.com Thu Apr 9 21:27:08 1998
From: dalapeyre at mulberrytech.com (Deborah Aleyne Lapeyre)
Date: Mon Jun 7 17:00:26 2004
Subject: Comments within DTDs (Teapot not tempest)
Message-ID:

Chris and company,

1) We do the same. Element name comment just before element. Attribute value comment between the element declaration and the ATTLIST. It's fine:

Like good typesetting, we try for a small number of verticals. This also lets you scan quickly for full element name, tag name, or attribute name.

2) But I confess that I miss the old SGML days, where our house style looked more like this:

Which was:
1) Easier to scan for elements (at least to my eye)
2) Put the attribute comment directly with its attribute, very nice if there are more than 10 attributes.
3) Easier (for me the weak programmer) to write a hack that would pull the attribute and its comment together.

We lost this battle in the "ease of parsing" wars. No biggie, the new way works fine.

--Debbie

======================================================================
Deborah A. Lapeyre                     Phone: 301/315-9631
Mulberry Technologies, Inc.            Fax: 301/315-8285
17 West Jefferson Street, Suite 207    E-mail: dalapeyre@mulberrytech.com
Rockville, MD 20850                    WWW: http://www.mulberrytech.com
======================================================================

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From robin at ACADCOMP.SIL.ORG Thu Apr 9 23:34:32 1998 From: robin at ACADCOMP.SIL.ORG (Robin Cover) Date: Mon Jun 7 17:00:26 2004 Subject: When is an attribute an attribute? Message-ID: <199804092141.QAA17295@ACADCOMP.SIL.ORG> I finally created a document for the (Grammar) "Topics" section in the SGML/XML Web Page w/ a few pointers to previous discussion on "elements versus attributes." The treatment by Steve DeRose in the SGML FAQ Book is especially thorough. See: http://www.sil.org/sgml/elementsAndAttrs.html which bears the title "SGML/XML: Elements versus Attributes. When Should I Use Elements, and When Should I Use Attributes? Cheers, Robin ------------------------------------------------------------------------- Robin Cover Email: robin@acadcomp.sil.org 6634 Sarah Drive Dallas, TX 75236 USA >>> The SGML/XML Web Page <<< Tel: +1 (972) 296-1783 (h) http://www.sil.org/sgml/sgml.html Tel: +1 (972) 708-7346 (w) FAX: +1 (972) 708-7380 ========================================================================= xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Fri Apr 10 00:54:02 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:26 2004 Subject: Comments within DTDs (Teapot not tempest) References: Message-ID: <352D512C.4D9C@hiwaay.net> Deborah Aleyne Lapeyre wrote: > > 1) We do the same. Element name comment just before > element. Attribute value comment between the element > declaration and the ATTLIST. It's fine: > > > > > id ID #IMPLIED > role (good|bad|ugly) "good" > whatever CDATA #REQUIRED > > > Like good typesetting, we try for a small number of > verticals. This also lets you scan quickly for full > element name, tag name, or attribute name. Yes. Hard on long DTDs though. IMHO, it easier to maintain markup with the comments in a separate file and link them. Helps the reuse across the system. > 2) But I confess that I miss the old SGML days, where > our house style looked more like this: > > > > > -- id Unique identifier for element -- > id ID #IMPLIED > > -- role One of: > good looks like Toshiro Mifune > bad looks like Clint Eastwood > ugly looks like Paula Jones -- > role (good|bad|ugly) "good" > > -- whatever I don't care; put something > here -- > whatever CDATA #REQUIRED > > > Which was: > 1) Easier to scan for elements (at least to my eye) Yes, if you want to read the attribute name and comment at the same time eg, learning the DTD. Still, this style always confuses me because I am looking for the productions and they are scattered in a lot of text. I agree your layout is the most scannable. > 2) Put the attribute comment directly with its > attribute, very nice if there are more than 10 > attributes. Yes. 
Particularly when cutting and pasting in the DTD documentation. In the production DTD, the fact that there are that many attributes means the file gets very big. So, sure in the archival/authoritative DTD, but in production code, it's a lot of stuff. Having it in a separate file makes it easier to manage for reuse. > 3) Easier (for me the weak programmer) to write a > hack that would pull the attrbute and its > comment together. Yes. > We lost this battle in the "ease of parsing" wars. > No biggie, the new way works fine. Yes. Hope all is well with you! len xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Apr 10 03:16:28 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:26 2004 Subject: When is an attribute an attribute? Message-ID: <3.0.32.19980409022909.0073c84c@pop.intergate.bc.ca> At 10:35 AM 07/04/98 -0700, Rich Koehler wrote: >I've become fond of the method that Tim Bray used to distinguish between >elements and attributes in his discussion of MCF >(http://www.textuality.com/mcf/MCF-tutorial.html). He writes, "...when >the property has a simple value like a string, we put that in the >content of the element; when the property's value is another object, we >put a pointer to it in an attribute value Hmm, recommendations do come back to haunt one. I guess I'd sign up for part of that statement - putting URL^HIs in attribute values rather than element content seems to be a basic part of Web culture, and has come to feel natural. But I'm on the record as saying I've never heard a convincing universal decision procedure for what should be an element and what an attribute. 
There are *some* things where it's easy - if you need further internal structure, use an element, for example. But my beliefs on this are: 1. I have observed that humans doing document design are comfortable with having two ways of labeling content, and assign things to elements or attributes for reasons that seem good to them. 2. It is often the case, after designers have done this, that they change their minds. I know I do. 3. For all of these reasons, if software needs to extract information from XML documents, it should be prepared to extract it either from character data or from attribute values, and not struggle against what the designer decided. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Apr 10 05:20:20 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:00:26 2004 Subject: Root Element Attributes... Message-ID: <352D900D.6E008DC3@infinet.com> "There is exactly one element, called the root, or document element, no part of which appears in the content of any other element. For all other elements, if the start-tag is in the content of another element, the end-tag is in the content of the same element. More simply stated, the elements, delimited by start- and end-tags, nest properly within each other. As a consequence of this, for each non-root element C in the document, there is one other element P in the document such that C is in the content of P, but is not in the content of any other element that is in the content of P. P is referred to as the parent of C, and C as a child of P." I am a little confused by whether attributes are allowed in the root element of a document. 
To date I have never seen a DTD in which the root element had any attributes whatsoever, but from the specification it appears that there is no restriction on making one root element with about 500 or so attributes (I am not saying I would ever do this). Can anyone please clarify this who is maybe an SGML literate as well? Thanx in advance, Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Fri Apr 10 06:47:52 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:00:26 2004 Subject: Root Element Attributes... References: <352D900D.6E008DC3@infinet.com> Message-ID: <352DA445.40392B52@allette.com.au> Tyler Baker wrote: > I am a little confused by whether attributes are allowed in the root > element of a document. To date I have never seen a DTD in which the > root element had any attributes whatsoever, but from the specification > it appears that there is no restriction on making one root element with > about 500 or so attributes (I am not saying I would ever do this). Yes, the root element is as capable as any other element of supporting attributes. An element is regarded as the root only by virtue of occurrence; it is the first element to occur in the instance and it does not have any siblings. Therefore any element can be the root or doctype element - it depends only on where you begin regarding the structure from. 
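A small check of the point above, sketched in Python with the standard library's ElementTree: the root element is declared with attributes in an internal DTD subset and carries them in the instance, and a non-validating parse hands them back like any other element's attributes. The catalog/book vocabulary and the attribute names are hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the root element "catalog" has two attributes,
# both declared in the internal subset of the DTD.
DOC = """<!DOCTYPE catalog [
  <!ELEMENT catalog (book*)>
  <!ATTLIST catalog version CDATA #REQUIRED
                    owner   CDATA #IMPLIED>
  <!ELEMENT book EMPTY>
  <!ATTLIST book id ID #REQUIRED>
]>
<catalog version="1.0" owner="library"><book id="b1"/></catalog>"""

root = ET.fromstring(DOC)           # well-formedness parse only, no validation
root_attributes = dict(root.attrib)
```

Here root.tag is "catalog" and root_attributes holds both version and owner; nothing in the declarations or the parse treats the root differently from the book element beneath it.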
-- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Fri Apr 10 13:21:47 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:27 2004 Subject: Announcement: SAX Java Implementation (pre-release) Message-ID: <199804101121.HAA00492@unready.microstar.com> [I'm limiting this announcement to XML-DEV right now.] I have put together a new, beta version of SAX with quite a few changes; here are some change highlights: - ability to parse from a byte stream or character stream - ability to locate (the end of) any document handler event in the source document - events for notation and unparsed entity declarations - _much_ simpler interface for attributes - extensible error-reporting mechanism, including the ability to provide localized error messages **Please** do not release new versions of your software based on this: while I don't expect major changes, I would like to take a one- or two-week bug-fixing period before we release this to the world at large. 
I have not yet updated the SAX web pages or written a specification or even a README for this (the only documentation is in the extensive JavaDoc comments), but while I'm working on documentation and packaging, I thought that it would be useful to release a snapshot of the Java-based SAX reference implementation so that parser and application developers could start porting, evaluating, and testing their code.

You can download the snapshot from the following URL:

http://www.microstar.com/XML/SAX/New/saxjava-19980410.zip

This includes source, class files, and a single (5K) Jar file with all of the SAX interfaces and helper classes.

Of course, you'll want to be able to play with SAX. I have a 99%-complete driver for AElfred, but am not finished testing the AElfred changes that went with it, so I cannot release that quite yet. I have, however, put together modified versions of my Lark driver (which is pretty good) and my MSXML driver (which is more limited because of MSXML problems), and posted them to the following URL:

http://www.microstar.com/XML/SAX/New/saxdrivers-19980410.zip

When I get the new AElfred out (soon), you'll be able to experiment with entity resolution as well.

Finally, I have a small demo which does a rather boring identity transform:

http://www.microstar.com/XML/SAX/New/saxdemos-19980410.zip

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dvp4c at jefferson.village.virginia.edu Fri Apr 10 13:51:20 1998 From: dvp4c at jefferson.village.virginia.edu (Daniel Pitti) Date: Mon Jun 7 17:00:27 2004 Subject: XLINK discussion Message-ID: <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu> Is there a separate list for XLink discussion? Or is xml-dev the appropriate venue for now? Daniel Pitti Daniel V. Pitti Project Director Institute for Advanced Technology in the Humanities Alderman Library University of Virginia Charlottesville, Virginia 22903 Phone: 804 924-6594 Fax: 804 982-2363 Email: dpitti@Virginia.edu xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Fri Apr 10 17:28:10 1998 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:00:27 2004 Subject: Announcement: SAX Java Implementation (pre-release) In-Reply-To: <199804101121.HAA00492@unready.microstar.com> References: <199804101121.HAA00492@unready.microstar.com> Message-ID: I've now done a Python translation of this and put it (also a pre-release) at (SAX classes, some convenience classes, xmllib driver and an unfinished xmlproc driver.) The translation was very straightforward, with one exception: overloading. 
Unless there are protests I think I'll just leave the two extra parse methods and the extra exception constructor out of the Python translation altogether.

Also, the distinction between the two new parse methods looks very Java-specific to me.

-- 
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/
http://birk105.studby.uio.no/

From tyler at infinet.com Fri Apr 10 22:48:11 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
References: <199804101121.HAA00492@unready.microstar.com>
Message-ID: <352E8624.187BD043@infinet.com>

Lars Marius Garshol wrote:

> I've now done a Python translation of this and put it (also a pre-release) at
>
> (SAX classes, some convenience classes, xmllib driver and an unfinished xmlproc driver.)
>
> The translation was very straightforward, with one exception: overloading. Unless there are protests I think I'll just leave the two extra parse methods and the extra exception constructor out of the Python translation altogether.
>
> Also, the distinction between the two new parse methods looks very Java-specific to me.

The idea of character streams is for supporting streams which already handle the byte to UNICODE character translation for you. I am not familiar with Python, but I assume there must be some sort of 16 bit wide character set that is Unicode compliant.
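In modern Java terms, the byte-stream versus character-stream distinction described above might be sketched as follows. This is only an illustration: the class `StreamKinds` and its helpers `readAll` and `fromBytes` are hypothetical names, the UTF-8 choice is an assumption, and nothing here is taken from the SAX snapshot itself. Only standard `java.io` and `java.nio.charset` calls are used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

final class StreamKinds {
    // Hypothetical helper: drain an already-decoded character stream.
    static String readAll(Reader reader) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[256];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            out.append(buffer, 0, n);
        }
        return out.toString();
    }

    // Byte stream: somebody still has to choose an encoding and decode.
    // UTF-8 is assumed here purely for the example.
    static String fromBytes(InputStream bytes) throws IOException {
        return readAll(new InputStreamReader(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        String doc = "<p>caf\u00e9</p>";

        // Character stream: decoding has already happened upstream.
        System.out.println(readAll(new StringReader(doc)));

        // Byte stream: the same document as encoded bytes.
        byte[] encoded = doc.getBytes(StandardCharsets.UTF_8);
        System.out.println(fromBytes(new ByteArrayInputStream(encoded)));
    }
}
```

Both calls yield the same text; the difference is only in who performs the byte-to-character conversion.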
This I feel is all that David is trying to add to SAX, not something Java specific. Nevertheless, if a particular language has no Unicode support whatsoever, then you are SOL for supporting XML in the first place. Readers (or character streams) may have much more efficient byte-to-character translation implementations than anything the parser could provide. I think the addition of character streams to SAX is a very good thing.

Tyler

From jjc at jclark.com Sat Apr 11 12:51:20 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
References: <199804101121.HAA00492@unready.microstar.com>
Message-ID: <352F48D3.A69FC818@jclark.com>

David Megginson wrote:

> I have put together a new, beta version of SAX with quite a few changes

This looks good. I have some nits:

1. Why has a SAX prefix been added to all classes?

2. For consistency with SAXException, in SAXLocator getSystemId should return null if no system id is available, and getLineNumber, getColumnNumber should similarly return -1 if no line or column number is available.

3. The interface for reading character streams needs more specification if it is to be interoperable.

a) There's a critical ambiguity in the concept of a character stream: a Java concept of a char does not correspond to the XML concept of a character. A character outside the BMP is a single XML character but is represented by a pair of Java chars.
If you want to use the Java Reader interface, then a character stream must be a stream not of characters in the XML sense but in the Java sense. I don't have the Unicode standard handy, but it has precisely defined terms for these two different things; I suggest referencing the Unicode standard and using the appropriate term.

b) Is it legal for a byte order mark character to be present at the start of the character stream? The right answer is that it should not be legal: this should be stripped out in the byte to character conversion process.

c) How does this interact with the encoding declaration in the XML document? The docs should say that it's legal for the character stream to include an encoding declaration and it doesn't matter what encoding it specifies.

4. The doc for SAXDTDHandler should say that the order in which DTD events are fired is unspecified except that they will all be fired after startDocument and before startElement.

5. Maybe the name of SAXDTDHandler should be changed to reflect the fact that it is not attempting to be a complete DTD interface. Some future version of SAX might provide optional support for full DTDs and it would be nice to be able to use the name SAXDTDHandler as the name for that.

6. I strongly object to including the name argument in SAXEntityResolver.resolveEntity. There's nothing in XML that says that the name should be used in resolving an entity and so there's no reason to suppose a parser will make it available. I also think it's wrong in principle to make use of it. This business with "[document]" and "[dtd]" is gross. At the very least the spec should say that name may be null if this information is not available.

7. Is the first character on the line at column 0 or column 1? (GNU Emacs says column 0, but others say column 1.) The docs need to make this clear.

8. I don't think SAXException.getLocalizedMessage is the right approach to internationalization.
Although the JDK does have Throwable.getLocalizedMessage, as far as I can tell nothing uses it and it's not at all convenient. It would be better to have a setLocale(Locale locale) method on SAXParser that specified the locale in which messages should be returned. This is the approach that is used in AWT. In any case SAXException.getLocalizedMessage is entirely redundant since SAXException has Throwable as an indirect superclass, and Throwable includes an identical definition of getLocalizedMessage.

9. I think SAXHandlerBase.error and SAXHandlerBase.warning should be no-ops like almost all the other methods. Having the default be to print messages on System.err introduces a command-line bias that seems inappropriate to me. In addition using a PrintStream (which System.err is) is irretrievably broken from an internationalization perspective, as is made clear in the PrintStream docs.

James

From papresco at technologist.com Sat Apr 11 21:22:52 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
References: <199804101121.HAA00492@unready.microstar.com> <352E8624.187BD043@infinet.com>
Message-ID: <352FC31C.F485909E@technologist.com>

Tyler Baker wrote:

> > Also, the distinction between the two new parse methods looks very Java-specific to me.
>
> The idea of character streams is for supporting streams which already handle the byte to UNICODE character translation for you. I am not familiar with Python, but I assume there must be some sort of 16 bit wide character set that is Unicode compliant.
> This I feel is all that David is trying to add to SAX, not something Java specific.

True, but "InputStream" and "Reader" are neither well-known ADTs nor well-defined interfaces outside of the Java world. Suppose a SAX client and server are talking over a CORBA connection. What are the requirements on the object passed to the parseInputStream() and parseReader() methods (non-overloaded variants of the parse method)? Where are these requirements defined?

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Unwisely, Santa offered a teddy bear to James, unaware that he had been mauled by a grizzly earlier that year." - Timothy Burton, "James"

From ak117 at freenet.carleton.ca Sun Apr 12 23:43:47 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:27 2004
Subject: XLINK discussion
In-Reply-To: <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu>
References: <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu>
Message-ID: <199804122058.QAA00293@unready.microstar.com>

Daniel Pitti writes:

> Is there a separate list for XLink discussion? Or is xml-dev the appropriate venue for now?

XML-DEV is a good place to discuss implementation issues for XLink. I'd suggest that you forward any general comments or questions directly to the editors (they are listed at the top of the WD).

All the best,

David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sun Apr 12 23:45:06 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
In-Reply-To: <352FC31C.F485909E@technologist.com>
References: <199804101121.HAA00492@unready.microstar.com> <352E8624.187BD043@infinet.com> <352FC31C.F485909E@technologist.com>
Message-ID: <199804122133.RAA00310@unready.microstar.com>

Paul Prescod writes:

> True, but "InputStream" and "Reader" are neither well-known ADTs nor well-defined interfaces outside of the Java world. Suppose a SAX client and server are talking over a CORBA connection. What are the requirements on the object passed to the parseInputStream() and parseReader() methods (non-overloaded variants of the parse method)? Where are these requirements defined?

James has made this same point somewhat differently.

Here is my question: do we need SAXByteStream and SAXCharacterStream interfaces? If so, should SAX also come with default implementations for each language?

All the best,

David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sun Apr 12 23:45:22 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
In-Reply-To: <352F48D3.A69FC818@jclark.com>
References: <199804101121.HAA00492@unready.microstar.com> <352F48D3.A69FC818@jclark.com>
Message-ID: <199804122131.RAA00308@unready.microstar.com>

James Clark writes:

> David Megginson wrote:
>
> > I have put together a new, beta version of SAX with quite a few changes
>
> This looks good. I have some nits:

Actually, these are very good points, all of which deserve detailed answers, and several of which are so self-evidently correct that I'd like to make the changes right away.

Could anyone implementing SAX (on either the parser or application side) please read through this entire reply? There are several points where I'd appreciate feedback.

> 1. Why has a SAX prefix been added to all classes?

There are a few benefits to this decision:

1. Programmers can import SAX classes into their own namespaces with less danger of collision (they will often have their own "Parser" and "DocumentHandler" classes). Experienced programmers might snort at this, but I have had several messages from people who couldn't understand why their code wasn't compiling properly.

2. I save a lot of time that I would have had to spend helping people who still had the old Java SAX classes somewhere on their CLASSPATH.

3. Porting to languages like C, which don't have namespaces, becomes clearer (though the use of overloading in SAXParser will still cause trouble -- perhaps I should fix that before the final release).

> 2. For consistency with SAXException, in SAXLocator getSystemId should return null if no system id is available, and getLineNumber, getColumnNumber should similarly return -1 if no line or column number is available.

Absolutely correct -- I will change the documentation.

> 3. The interface for reading character streams needs more specification if it is to be interoperable.
>
> a) There's a critical ambiguity in the concept of a character stream: a Java concept of a char does not correspond to the XML concept of a character. A character outside the BMP is a single XML character but is represented by a pair of Java chars. If you want to use the Java Reader interface, then a character stream must be a stream not of characters in the XML sense but in the Java sense. I don't have the Unicode standard handy, but it has precisely defined terms for these two different things; I suggest referencing the Unicode standard and using the appropriate term.

The real challenge here is to define the level of interoperability that we need. My first impulse is to leave "byte stream" and "character stream" deliberately undefined, so that each language can use its native implementation (if one is available). I think most users will find life easier if in Java, for example, they can use java.io.InputStream for a byte stream and java.io.Reader for a character stream; C++ programmers can use istream for a byte stream and whatever the ANSI committee is considering for character streams; etc., etc.

This is somewhat messy, since (as you correctly point out) the exact behaviour becomes language-specific (that is one reason that I didn't include these in the first pass); I am reluctant, though, to create SAXByteStream and SAXCharacterStream, and to force everyone to use wrappers for their InputStream/Reader-type classes.

What does everyone else think about this point?
Is this a good case for pragmatism over logical consistency, or am I introducing an ugly kludge that will come back to haunt us all?

> b) Is it legal for a byte order mark character to be present at the start of the character stream? The right answer is that it should not be legal: this should be stripped out in the byte to character conversion process.

This is a tricky point. I had planned to leave it in -- what is the default behaviour for java.io.Reader (and for other languages with character streams)?

> c) How does this interact with the encoding declaration in the XML document? The docs should say that it's legal for the character stream to include an encoding declaration and it doesn't matter what encoding it specifies.

I'd think that it should be ignored under these circumstances, since the characters are already decoded (though again, in an underspecified way -- are we dealing with UCS-2, UCS-4, or UTF-16?).

> 4. The doc for SAXDTDHandler should say that the order in which DTD events are fired is unspecified except that they will all be fired after startDocument and before startElement.

Thanks. I will change this.

> 5. Maybe the name of SAXDTDHandler should be changed to reflect the fact that it is not attempting to be a complete DTD interface. Some future version of SAX might provide optional support for full DTDs and it would be nice to be able to use the name SAXDTDHandler as the name for that.

I thought for a while about this -- SAXDocumentHandler also provides only partial document information, so I was thinking that we would have something like

  public interface SAX2DocumentHandler extends SAXDocumentHandler {
  }

and

  public interface SAX2DTDHandler extends SAXDTDHandler {
  }

in SAX2 (any suggestions for a better prefix than "SAX2" will be gratefully acknowledged).

> 6. I strongly object to including the name argument in SAXEntityResolver.resolveEntity.
> There's nothing in XML that says that the name should be used in resolving an entity and so there's no reason to suppose a parser will make it available. I also think it's wrong in principle to make use of it. This business with "[document]" and "[dtd]" is gross. At the very least the spec should say that name may be null if this information is not available.

I'm neutral on this point, though I do agree that "[document]" and "[dtd]" are ugly. Does anyone object to the removal of the name argument?

> 7. Is the first character on the line at column 0 or column 1? (GNU Emacs says column 0, but others say column 1.) The docs need to make this clear.

The first character is in column 1. I will fix the docs.

> 8. I don't think SAXException.getLocalizedMessage is the right approach to internationalization. Although the JDK does have Throwable.getLocalizedMessage, as far as I can tell nothing uses it and it's not at all convenient. It would be better to have a setLocale(Locale locale) method on SAXParser that specified the locale in which messages should be returned. This is the approach that is used in AWT. In any case SAXException.getLocalizedMessage is entirely redundant since SAXException has Throwable as an indirect superclass, and Throwable includes an identical definition of getLocalizedMessage.

This was a last-minute addition before release: the redundancy is deliberate, since non-Java implementations will not inherit getLocalizedMessage. I will gladly bend to the will of the localisation experts (or at least, cognoscenti) on this list -- if SAXParser.setLocale() is a better approach, then I am happy to use it.

> 9. I think SAXHandlerBase.error and SAXHandlerBase.warning should be no-ops like almost all the other methods. Having the default be to print messages on System.err introduces a command-line bias that seems inappropriate to me.
> In addition using a PrintStream (which System.err is) is irretrievably broken from an internationalization perspective, as is made clear in the PrintStream docs.

And even worse, it's not clear how useful printing to STDERR is when the parser is running in a distributed environment. I agree fully, and will change the default behaviour.

All the best, and thank you very much for the feedback.

David

-- 
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From tbray at textuality.com Mon Apr 13 00:20:16 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:27 2004
Subject: My favorite XML app so far
Message-ID: <3.0.32.19980412151652.00b39900@pop.intergate.bc.ca>

Check it out at http://www.honeylocust.com/limon/xml/ -Tim

From tbray at textuality.com Mon Apr 13 00:34:15 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
Message-ID: <3.0.32.19980412151952.00b2e5e4@pop.intergate.bc.ca>

At 05:33 PM 12/04/98 -0400, David Megginson wrote:

> Here is my question: do we need SAXByteStream and SAXCharacterStream interfaces?
> If so, should SAX also come with default implementations for each language?

At the bottom level, a characterStream interface is a good idea. Having constructors for these that take an asciiByteStream, an iso8859ByteStream and (I guess) a UTF8InputStream and produce seems awfully convenient. -Tim

From tbray at textuality.com Mon Apr 13 00:41:27 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
Message-ID: <3.0.32.19980412153124.00b3778c@pop.intergate.bc.ca>

At 05:31 PM 12/04/98 -0400, David Megginson wrote:

> > 1. Why has a SAX prefix been added to all classes?
>
> There are a few benefits to this decision:

Kind of unconvincing, I'd have to say. If someone doesn't have it together enough to figure out how to use java packages they're not going to have much luck with SAX anyhow. And we really shouldn't be worrying about legacy SAX implementations at this stage; we're all bleeding-edge types around here. And if somebody wants a C binding, that's going to be different enough from Java SAX anyhow that we shouldn't do the SAX prefix just because they're going to have to.

> > 3. The interface for reading character streams needs more specification if it is to be interoperable.
> >
> > a) There's a critical ambiguity in the concept of a character stream: a Java concept of a char does not correspond to the XML concept of a character.
>
> What does everyone else think about this point? Is this a good case for pragmatism over logical consistency, or am I introducing an ugly kludge that will come back to haunt us all?

Is it maybe the right thing to be brutally clear and just have a UTF-16 character stream? I haven't looked at Java chars as closely as James has, but his description sounds exactly like UTF-16.
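The gap between a Java char and an XML character that this exchange turns on can be seen in a few lines of Java. This is a minimal sketch using only `java.lang` methods; the class name `CodeUnits` and its two helpers are hypothetical and exist only for illustration.

```java
final class CodeUnits {
    // Number of Java chars, i.e. UTF-16 code units, in the string.
    static int javaChars(String s) {
        return s.length();
    }

    // Number of Unicode code points, i.e. characters in the XML sense.
    static int xmlChars(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a' followed by U+10000, the first code point outside the BMP.
        // Character.toChars encodes it as a surrogate pair (two Java chars).
        String s = "a" + new String(Character.toChars(0x10000));

        System.out.println(javaChars(s)); // 3
        System.out.println(xmlChars(s));  // 2

        // The second and third chars are the two halves of the pair.
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // true
        System.out.println(Character.isLowSurrogate(s.charAt(2)));  // true
    }
}
```

A consumer that counts chars rather than code points sees two units where XML sees one character, which is the ambiguity a character-stream definition would have to settle.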
A 16-bit UTF-16 quantity is not precisely a character, but the places where it isn't (non BMP chars) exhibit graceful degradation; if the app knows about UTF-16 it does the right thing, otherwise it looks like two unknown characters, nothing breaks.

> > b) Is it legal for a byte order mark character to be present at the start of the character stream? The right answer is that it should not be legal: this should be stripped out in the byte to character conversion process.
>
> This is a tricky point. I had planned to leave it in -- what is the default behaviour for java.io.Reader (and for other languages with character streams)?

No; if there's a BOM, that should be eaten by the underlying char stream machinery, which should read it and thereafter transparently swap bytes or not to produce Java chars without the app having to work at it. The spec is clear on this point, and at one with sensible implementation practice.

> > c) How does this interact with the encoding declaration in the XML document? The docs should say that it's legal for the character stream to include an encoding declaration and it doesn't matter what encoding it specifies.
>
> I'd think that it should be ignored under these circumstances, since the characters are already decoded (though again, in an underspecified way -- are we dealing with UCS-2, UCS-4, or UTF-16?).

Once again, I think the spec can be followed straightforwardly on this one. If you know, by any combination of BOM, encoding decl, and external header, what the encoding is, just use it. I think SAX implementations should compete on their ability to Do The Right Thing.

> > 6. I strongly object to including the name argument in SAXEntityResolver.resolveEntity. There's nothing in XML that says that the name should be used in resolving an entity and so there's no reason to suppose a parser will make it available. I also think it's wrong in principle to make use of it.
> > This business with "[document]" and "[dtd]" is gross. At the very least the spec should say that name may be null if this information is not available.
>
> I'm neutral on this point, though I do agree that "[document]" and "[dtd]" are ugly. Does anyone object to the removal of the name argument?

I'm with James - any use of the entity name by an application is potentially actively harmful, nuke it. -Tim

From tyler at infinet.com Mon Apr 13 02:10:48 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
References: <199804101121.HAA00492@unready.microstar.com> <352F48D3.A69FC818@jclark.com> <199804122131.RAA00308@unready.microstar.com>
Message-ID: <352F5D2D.98570E7D@infinet.com>

David Megginson wrote:

> James Clark writes:
>
> > David Megginson wrote:
> >
> > > I have put together a new, beta version of SAX with quite a few changes
> >
> > This looks good. I have some nits:
>
> Actually, these are very good points, all of which deserve detailed answers, and several of which are so self-evidently correct that I'd like to make the changes right away.
>
> Could anyone implementing SAX (on either the parser or application side) please read through this entire reply? There are several points where I'd appreciate feedback.
>
> > 1. Why has a SAX prefix been added to all classes?
>
> There are a few benefits to this decision:
>
> 1. Programmers can import SAX classes into their own namespaces with less danger of collision (they will often have their own "Parser" and "DocumentHandler" classes).
> Experienced programmers might snort at this, but I have had several messages from people who couldn't understand why their code wasn't compiling properly.

I have found this much more evident with method naming collisions than with actual class naming collisions. The class naming collisions can be fixed by explicitly naming the entire package-qualified name for classes which collide. With interfaces there is no mechanism to do this. If you have a class which implements two interfaces that may have the same method declarations and signature, then you are pretty much SOL as far as figuring out what should be returned. For example java.security.Principal defines String getName() which could be applied to all sorts of contexts in a Java/XML application that has security implications.

For the interfaces of the object-based parser I have written, in the element interface, instead of declaring String getName() I use the more cumbersome String getElementName(). In the long run this sort of redundant naming will make your design challenges a little easier I feel.

> 2. I save a lot of time that I would have had to spend helping people who still had the old Java SAX classes somewhere on their CLASSPATH.

You should not design around people being sloppy with installations. Perhaps you could have used InstallShield or a script which would install the latest SAX and uninstall the older SAX versions, or at the very minimum change the CLASSPATH so this problem does not happen.

> I thought for a while about this -- SAXDocumentHandler also provides only partial document information, so I was thinking that we would have something like

> > 7. Is the first character on the line at column 0 or column 1? (GNU Emacs says column 0, but others say column 1.) The docs need to make this clear.
>
> The first character is in column 1. I will fix the docs.
It would be nice for programming purposes (at least for Java, C, and C++) if columns and rows had their index starting at 0.

Tyler

From tyler at infinet.com Mon Apr 13 02:28:29 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
References: <3.0.32.19980412151952.00b2e5e4@pop.intergate.bc.ca>
Message-ID: <35315D68.BBCF0DCB@infinet.com>

Tim Bray wrote:

> At 05:33 PM 12/04/98 -0400, David Megginson wrote:
> > Here is my question: do we need SAXByteStream and SAXCharacterStream interfaces? If so, should SAX also come with default implementations for each language?
>
> At the bottom level, a characterStream interface is a good idea. Having constructors for these that take an asciiByteStream, an iso8859ByteStream and (I guess) a UTF8InputStream and produce seems awfully convenient. -Tim

This seems like a pretty reasonable idea. This would also allow parser writers to provide their own byte-to-character converters, which may in fact be more efficient than the sun.io.ByteToCharConverter and sun.io.CharToByteConverter classes which do all of this stuff now in the JDK. These implementations are undocumented, which is another reason a CharacterStream interface might be nice to have as an alternative to the Reader classes. Of course, a character stream interface could delegate to a Reader class if the parser writer chose to, which would be a relatively low cost in performance considering the Reader classes are all screwed up anyways IMHO.
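The delegation idea can be sketched roughly as below. Everything named here is hypothetical: SAX defines no `CharacterStream` interface, and `CharacterStreams`, `ReaderBacked`, and `drain` are illustrative names only. The sketch assumes a bulk-read method shaped like `java.io.Reader.read(char[], int, int)`.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class CharacterStreams {
    // Hypothetical interface; not part of any SAX release.
    interface CharacterStream {
        // Reads up to length chars into buffer at offset;
        // returns the number read, or -1 at end of stream.
        int read(char[] buffer, int offset, int length) throws IOException;

        void close() throws IOException;
    }

    // An implementation that simply forwards every call to a java.io.Reader.
    static final class ReaderBacked implements CharacterStream {
        private final Reader reader;

        ReaderBacked(Reader reader) {
            this.reader = reader;
        }

        @Override
        public int read(char[] buffer, int offset, int length) throws IOException {
            return reader.read(buffer, offset, length);
        }

        @Override
        public void close() throws IOException {
            reader.close();
        }
    }

    // Drains a stream with bulk reads into a reusable buffer.
    static String drain(CharacterStream in) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[128];
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            out.append(buffer, 0, n);
        }
        in.close();
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        CharacterStream in = new ReaderBacked(new StringReader("<doc>hello</doc>"));
        System.out.println(drain(in));
    }
}
```

A parser vendor could supply a different implementation of the same interface with its own decoder, while applications that already hold a Reader pay only the cost of one forwarding call per read.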
In this particular case, a CharacterStream implementation may take a Reader object as an argument and the rest is just delegation. Most people already know that the I/O implementation of the JDK is not too great as it is; for example, every time you read a single character from the Reader class, it instantiates a new array object as an argument necessary for delegating to another read method. Why things work this way in the Reader class I don't really know, other than that possibly the person who wrote it was either lazy or else forgot to improve the implementation at a later time. I think Tim's idea is a good one...

Tyler

From elm at arbortext.com Mon Apr 13 06:45:28 1998
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 17:00:27 2004
Subject: XLINK discussion
In-Reply-To: <98Apr12.174657edt.26881@thicket.arbortext.com>
References: <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu> <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu>
Message-ID: <98Apr13.004317edt.26882@thicket.arbortext.com>

Steve DeRose and I do track XML-Dev and other XML-related lists for XLink issues, but people should try to stick to XML-Dev's original mission of discussing implementation, and send any specific XLink comments and requests directly to us.

Thanks,
Eve

At 04:58 PM 4/12/98 -0400, David Megginson wrote:
>Daniel Pitti writes:
>
> > Is there a separate list for XLink discussion? Or is xml-dev the
> > appropriate venue for now?
>
>XML-DEV is a good place to discuss implementation issues for XLink.
>I'd suggest that you forward any general comments or questions
>directly to the editors (they are listed at the top of the WD).
>
>
>All the best,
>
>
>David
>
>--
>David Megginson ak117@freenet.carleton.ca
>Microstar Software Ltd. dmeggins@microstar.com
> http://home.sprynet.com/sprynet/dmeggins/

From dima at paragraph.com Mon Apr 13 14:09:25 1998
From: dima at paragraph.com (Dmitri Kondratiev)
Date: Mon Jun 7 17:00:27 2004
Subject: AElfred SAX driver ?
Message-ID: <2.2.32.19980413120922.009684ec@dream.paragraph.com>

Why does the AElfred SAX driver (Version 1.1) belong to a different package than the other drivers?

com.microstar.sax.LarkDriver
com.microstar.sax.MSXMLDriver

and:

com.microstar.xml.SAXDriver

Thanks,
Dima
---------------------------
dima@paragraph.com
102401.2457@compuserve.com
http://www.geocities.com/SiliconValley/Lakes/3767/
tel: 07-095-464-9241

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 13 16:18:04 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:27 2004 Subject: SAX: Character Streams In-Reply-To: <35315D68.BBCF0DCB@infinet.com> References: <3.0.32.19980412151952.00b2e5e4@pop.intergate.bc.ca> <35315D68.BBCF0DCB@infinet.com> Message-ID: <199804131416.KAA00792@unready.microstar.com> Tyler Baker writes: > > At the bottom level, a characterStream interface is a good idea. > > Having constructors for these that take an asciiByteStream, an > > iso8859ByteStream and (I guess) a UTF8InputStream and produce > > seems awfully convenient. -Tim > This seems like a pretty reasonable idea. This would also allow > parser writers to provide their own byte to character converters > which may in fact be more efficient than the > sun.io.ByteToCharConverter and sun.io.CharToByteConverter classes > which do all of this stuff now in the JDK. Just a quick note: using java.io.Reader would not require people to use Sun's character converters, unless the implementor chose to use java.io.InputStreamReader or something similar. You could always subclass Reader to use your own character converters anyway. That said, I'm happy to be persuaded to use SAXCharacterStream (or SAXUTF16Stream) and SAXByteStream -- they'd make life much easier for CORBA types. The Java people will complain quite a bit that I'm not using Reader and InputStream, but I can prepare a friendly canned reply for them. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. 
dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 13 16:20:12 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:27 2004 Subject: SAX, non-XML Documents, and Legal Characters Message-ID: <199804131400.KAA00783@unready.microstar.com> While we're on the topic of character streams, here's another question: should the SAXDocumentHandler.characters() method be allowed to deliver only XML characters? At first, the answer "no" might seem self-evident, but what if someone decides to build a LaTeX or RTF parser that implements the SAX interface? Should we require the parser to strip out non-XML characters before delivering the SAX events, or should we allow SAX to be a general structured-document interface, and require applications to strip out non-XML characters when exporting an XML document? The question is, of course, moot for XML parsers, since they will have to report a fatal error anyway if they find non-XML characters. It would be interesting, though, to build an RTF parser with a SAX driver and then hook it up to Don Park's SAXDOM. Any thoughts? All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ak117 at freenet.carleton.ca Mon Apr 13 16:20:31 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:27 2004
Subject: Announcement: SAX Java Implementation (pre-release)
In-Reply-To: <3.0.32.19980412153124.00b3778c@pop.intergate.bc.ca>
References: <3.0.32.19980412153124.00b3778c@pop.intergate.bc.ca>
Message-ID: <199804131405.KAA00786@unready.microstar.com>

Tim Bray writes:

> At 05:31 PM 12/04/98 -0400, David Megginson wrote:
> > > 1. Why has a SAX prefix been added to all classes?
> >
> >There are a few benefits to this decision:
>
> Kind of unconvincing, I'd have to say. If someone doesn't have it
> together enough to figure out how to use Java packages they're
> not going to have much luck with SAX anyhow. And we really shouldn't
> be worrying about legacy SAX implementations at this stage; we're
> all bleeding-edge types around here. And if somebody
> wants a C binding, that's going to be different enough from
> Java SAX anyhow that we shouldn't do the SAX prefix just because
> they're going to have to.

If no one wants this, I will happily remove it. If you do want the "SAX..." prefix on all interfaces, please speak up now.

> > > 3. The interface for reading character streams needs more
> > > specification if it is to be interoperable.
> >
> > > a) There's a critical ambiguity in the concept of a character stream:
> > > a Java concept of a char does not correspond to the XML concept of a
> > > character.
> >What does everyone else think about this point? Is this a good case
> >for pragmatism over logical consistency, or am I introducing an ugly
> >kludge that will come back to haunt us all?
> > Is it maybe the right thing to be brutally clear and just have a UTF-16 > character stream? I haven't looked at Java chars as closely as James > has, but his description sounds exactly like UTF-16. A 16-bit UTF-16 > quantity is not precisely a character, but the places where it isn't > (non BMP chars) exhibit graceful degradation; if the app knows about > UTF-16 it does the right thing, otherwise it looks like two unknown > characters, nothing breaks. Fair enough -- we could specify that SAXCharacterStream is a UTF-16 stream, or we could even name it SAXUTF16Stream. How will this interact with Larry Wall's decision to use UTF-8 as the internal encoding for the next Perl? > > > b) Is it legal for a byte order mark character to be present at the > > > start of the character stream? The right answer is that it should not > > > be legal: this should be stripped out in the byte to character > > > conversion process. > > > >This is a tricky point. I had planned to leave it in -- what is the > >default behaviour for java.io.Reader (and for other languages with > >character streams)? > > No; if there's a BOM, that should be eaten by the underlying char stream > machinery, which should read it and thereafter transparently swap bytes > or not to produce Java chars without the app having to work at it. > The spec is clear on this point, and at one with sensible implementation > practice. Should we require all versions to use the Java byte order, or only the Java version? > > > 6. I strongly object to including the name argument in > > > SAXEntityResolver.resolveEntity. There's nothing in XML that says > > > that the name should be used in resolving an entity and so there's no > > > reason to suppose a parser will make it available. I also think it's > > > wrong in principle to make use of it. This business with "[document]" > > > and "[dtd]" is gross. At the very least the spec should say that name > > > maybe null if this information is not available. 
> > > >I'm neutral on this point, though I do agree that "[document]" and > >"[dtd]" are ugly. Does anyone object to the removal of the name > >argument? > > I'm with James - any use of the entity name by an application is > potentially actively harmful, nuke it. -Tim That's two 'nays' and one abstention (mine). If anyone wants to keep the entity name argument, please put your case forward quickly. Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 13 16:21:22 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:27 2004 Subject: SAX: Method Name Collisions In-Reply-To: <352F5D2D.98570E7D@infinet.com> References: <199804101121.HAA00492@unready.microstar.com> <352F48D3.A69FC818@jclark.com> <199804122131.RAA00308@unready.microstar.com> <352F5D2D.98570E7D@infinet.com> Message-ID: <199804131412.KAA00789@unready.microstar.com> Tyler Baker writes: > > 1. Programmers can import SAX classes into their own namespaces > > with less danger of collision (they will often have their > > own "Parser" and "DocumentHandler" classes). Experienced programmers > > might snort at this, but I have had several messages from people > > who couldn't understand why their code wasn't compiling properly. > > This is very much more evident I have found with method naming > collisions than actual class naming collisions. The class naming > collisions can be fixed by explicitly naming the entire package > qualified name for classes which collide. 
With interfaces there is
> no mechanism to do this. If you have a class which implements two
> interfaces that may have the same method declarations and
> signature, then you are pretty much SOL as far as figuring out what
> should be returned.

The biggest offender is SAXAttributeList:

  public int getLength ();
  public String getName (int i);
  public String getType (int i);
  public String getValue (int i);
  public String getType (String name);
  public String getValue (String name);

To minimise method-name collisions, we would have to do something like this:

  public int getAttributeListLength ();
  public String getAttributeName (int i);
  public String getAttributeType (int i);
  public String getAttributeValue (int i);
  public String getAttributeType (String name);
  public String getAttributeValue (String name);

The first, "getAttributeListLength", is the ugliest. It is simple to avoid this problem by creating a separate class for SAXAttributeList, rather than implementing it in the main driver -- what does everyone else think about this question?

Thanks, and all the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From dima at paragraph.com Mon Apr 13 18:09:19 1998
From: dima at paragraph.com (Dmitri Kondratiev)
Date: Mon Jun 7 17:00:27 2004
Subject: AElfred SAX driver ?
Message-ID: <2.2.32.19980413160910.009c28d0@dream.paragraph.com> At 09:54 13.04.98 -0400, David Megginson wrote: >Dmitri Kondratiev writes: > > > Why AElfred SAX driver (Version 1.1) belongs to different package then other > > drivers : > > > > com.microstar.sax.LarkDriver > > com.microstar.sax.MSXMLDriver > > > > and : > > > > com.microstar.xml.SAXDriver > >This is a very good question. Originally, there was a >com.microstar.sax.AElfredDriver that was distributed together with >LarkDriver, MSXMLDriver, and NXPDriver. Eventually, I would like to >see all of the com.microstar.sax.* drivers disappear, as these parsers >include native SAX drivers in their own packages. I removed >com.microstar.sax.AElfredDriver when I included a native SAX driver in >the AElfred 1.1 distribution. > This means that support for SAX in AElfred becomes a _special case_ that application will have to recognize ! Assume an application that first finds out what SAX driver is available from its environment and then instantiates available driver. In the case with AElfred "native" driver - application will have to have extra "non-standard" code to handle this. It doesn't seem to be a good solution to me. Besides, in general case this makes working with AElfred driver difficult, which in effect eliminates it as a possible solution. Dima --------------------------- dima@paragraph.com 102401.2457@compuserve.com http://www.geocities.com/SiliconValley/Lakes/3767/ tel: 07-095-464-9241 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tyler at infinet.com Mon Apr 13 19:40:03 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:00:27 2004
Subject: Confusion with Namespaces?
Message-ID: <35324F40.BF578903@infinet.com>

In the 1.0 XML Namespaces specification a namespace PI is defined as follows:

''

From this it looks like a particular namespace must have one or more PrefixDefs, NSDefs, or SrcDefs in any particular order, so long as there is at least one of these. But then the following is said:

Namespace Constraint: Required Parts
A namespace declaration must contain exactly one NSDef, exactly one PrefixDef and zero or one SrcDef.

So based upon this, shouldn't a namespace PI be defined as follows:

''

I guess that:

''

could be construed to mean that the definitions can occur in any order, so long as there is only one NSDef and one PrefixDef, while any number of SrcDefs can occur within the namespace declaration. For XML processors, search engines and the like I think it would make much more sense to define a namespace PI as something like this:

''

Any thoughts here?

Tyler

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Mon Apr 13 20:42:43 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:27 2004 Subject: SAX, non-XML Documents, and Legal Characters Message-ID: <005501bd670b$02089520$2ee044c6@donpark> >The question is, of course, moot for XML parsers, since they will have >to report a fatal error anyway if they find non-XML characters. It >would be interesting, though, to build an RTF parser with a SAX driver >and then hook it up to Don Park's SAXDOM. David, IMHO, power of XML lies in it being the hub (some would say bottleneck). I think it would be far more flexible to have converters that translates LaTex and RTF into XML which can then be processed by any SAX parser. Ideal processing of legacy documents in the XML realm involves four phases: 1. Conversion phase converts legacy documents into XML documents with emphasis on loss-less capturing of the original information. Little emphasis is placed on how information will be used. This step will be done typically by content owner. 2. Distillation phase extracts useful components of XML documents into one or more ready-for-processing XML documents with emphasis on providing the most useful and flexible form of information. This step is typically done by value added information vendors as well as content owners. 3. Distribution phase involves transmitting processed XML documents to the clients with the most emphasis placed on catering to the consumption phase. This step is done by the application servers. 4. Consumption phase involves client software converting XML documents into consumable formats such as HTML, RTF, LaTex, etc. 
The emphasis in the consumption phase is on user preference. This step is done by the client software. So, my vote is no. Regards, Don Park http://www.docuverse.com/personal/index.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Apr 13 22:07:47 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:27 2004 Subject: SAX: Method Name Collisions Message-ID: <3.0.32.19980413130250.00b347bc@pop.intergate.bc.ca> At 10:12 AM 13/04/98 -0400, David Megginson wrote: >The first, "getAttributeListLength", is the ugliest. It is simple to >avoid this problem by creating a separate class for SAXAttributeList, >rather than implementing it in the main driver -- what does everyone >else think about this question? Well, yes. But I thought you were dead set against having any extra classes? -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Apr 13 22:07:48 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:27 2004 Subject: Confusion with Namespaces? 
Message-ID: <3.0.32.19980413130433.00b31c1c@pop.intergate.bc.ca> At 01:45 PM 13/04/98 -0400, Tyler Baker wrote: >In the 1.0 XML Namespaces specification a namespace PI is defined as >follows: > >'' > >>From this it looks like a particular namespace must have either one or >more PrefixDef's, NS'Def's, or SrcDef's in any particular order so long >as there is at least one of these. But then the following is said: > >Namespace Constraint: Required Parts >A namespace declaration must contain exactly one NSDef, exactly one >PrefixDef and zero or one SrcDef. This is because a majority of the WG didn't want to specify the order in which they can appear. So to write a regexp that allows one ns & one prefix & zero or one src, in any order, gets pretty big & ugly, unless you have something like the SGML &-connector. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Mon Apr 13 22:11:54 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:00:27 2004 Subject: Confusion with Namespaces? Message-ID: <006f01bd6718$2e89c9e0$a00b4ccb@NT.JELLIFFE.COM.AU> From: Tyler Baker > Any thoughts here? It would be better if they just said somewhere "the namespace PI has the same syntax as XML element start-tags" and then given the following pseudo-declaration: The production they probably want is probably more like [1] ' This is a case where, to me, the markup declararation syntax has clear advantages over BNF. And I dread to think what it would look like in some version of XML-data element syntax. Rick Jelliffe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ak117 at freenet.carleton.ca Mon Apr 13 22:16:21 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:27 2004
Subject: AElfred SAX driver ?
In-Reply-To: <2.2.32.19980413160910.009c28d0@dream.paragraph.com>
References: <2.2.32.19980413160910.009c28d0@dream.paragraph.com>
Message-ID: <199804132015.QAA00313@unready.microstar.com>

Dmitri Kondratiev writes:

> >This is a very good question. Originally, there was a
> >com.microstar.sax.AElfredDriver that was distributed together with
> >LarkDriver, MSXMLDriver, and NXPDriver. Eventually, I would like to
> >see all of the com.microstar.sax.* drivers disappear, as these parsers
> >include native SAX drivers in their own packages. I removed
> >com.microstar.sax.AElfredDriver when I included a native SAX driver in
> >the AElfred 1.1 distribution.
> >
>
> This means that support for SAX in AElfred becomes a _special case_ that
> application will have to recognize !

Just the opposite. Most Java-based XML parsers now have native SAX drivers in their own packages:

  XP:                com.jclark.xml.sax.Driver
  DXP:               com.datachannel.xml.sax.DXPDriver
  IBM XML for Java:  com.ibm.xml.parser.SAXDriver
  AElfred:           com.microstar.xml.SAXDriver

The two drivers in com.microstar.sax are the special cases, and they will disappear once MSXML and Lark have native SAX drivers.

As for your other question, about determining what drivers are available, there are a few options: I am going to recommend that we use the property "sax.parser" to provide the class name for the default SAX driver in Java implementations (I don't know the politics of property names, though).
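For what it is worth, an application could pick up such a property along the following lines. This is a minimal sketch under assumptions: only the property name "sax.parser" comes from the proposal above; the class and method names are invented for the illustration, and java.util.ArrayList stands in for a real driver class so that the example runs without any parser on the CLASSPATH.

```java
public class DriverLookupSketch {

    /** Returns the class name in the "sax.parser" property, or the fallback if unset. */
    static String driverClassName(String fallback) {
        String name = System.getProperty("sax.parser");
        return (name == null || name.length() == 0) ? fallback : name;
    }

    /** Loads the chosen class by name and creates it with its no-argument constructor. */
    static Object newDriver(String fallback) throws Exception {
        Class<?> driverClass = Class.forName(driverClassName(fallback));
        return driverClass.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in class name; a real setup would name a driver such as
        // com.microstar.xml.SAXDriver and cast the result to the parser interface.
        System.setProperty("sax.parser", "java.util.ArrayList");
        Object driver = newDriver("java.util.Vector");
        System.out.println(driver.getClass().getName());
    }
}
```

With this approach the application contains no parser-specific code at all; the package a driver lives in only matters to whoever sets the property.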
All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 13 22:23:04 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:27 2004 Subject: SAX: Method Name Collisions In-Reply-To: <3.0.32.19980413130250.00b347bc@pop.intergate.bc.ca> References: <3.0.32.19980413130250.00b347bc@pop.intergate.bc.ca> Message-ID: <199804132022.QAA00349@unready.microstar.com> Tim Bray writes: > At 10:12 AM 13/04/98 -0400, David Megginson wrote: > >The first, "getAttributeListLength", is the ugliest. It is simple to > >avoid this problem by creating a separate class for SAXAttributeList, > >rather than implementing it in the main driver -- what does everyone > >else think about this question? > > Well, yes. But I thought you were dead set against having any > extra classes? -Tim If I'd stuck with this, there would still be only two interfaces in SAX: the parser and the handler. What I'm talking about here, though, is an extra class in a specific parser's implementation. Most SAX parsers will implement SAXParser, SAXAttributeList, and SAXLocator with the same class; they are free, however, to use a separate class for each if method-name collisions become a problem. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bmhughes at ozemail.com.au Tue Apr 14 13:27:33 1998 From: bmhughes at ozemail.com.au (Baden Hughes) Date: Mon Jun 7 17:00:27 2004 Subject: possibility of an RTF, LaTex XML conversion process In-Reply-To: <005501bd670b$02089520$2ee044c6@donpark> Message-ID: <3.0.5.32.19980414210105.007bd8f0@ozemail.com.au> Both Don Park and David Megginson mentioned this in the last few posts: conversion mechanisms from RTF and LaTex to XML. Anyone know of any work on either of these ? I'd be very interested to hear about it. If not, is there any other interest in these conversions besides mine ? Baden xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Apr 14 14:21:43 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:28 2004 Subject: possibility of an RTF, LaTex XML conversion process In-Reply-To: <3.0.5.32.19980414210105.007bd8f0@ozemail.com.au> References: <005501bd670b$02089520$2ee044c6@donpark> <3.0.5.32.19980414210105.007bd8f0@ozemail.com.au> Message-ID: <199804141220.IAA00323@unready.microstar.com> Baden Hughes writes: > Both Don Park and David Megginson mentioned this in the last few > posts: conversion mechanisms from RTF and LaTex to XML. Anyone know > of any work on either of these ? I'd be very interested to hear > about it. 
If not, is there any other interest in these conversions > besides mine ? Simply doing a lexical transformation of RTF or vanilla LaTeX to XML is, if not trivial, at least straight-forward; the challenge is getting from the very loose structure and semantics of RTF or LaTeX to a real-world XML document type. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Tue Apr 14 15:04:24 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:00:28 2004 Subject: RDF at AlphaWorks Message-ID: IBM's produced a Java RDF (Resource Description Framework) implementation that looks interesting, though it claims to use XML 'forms'. We'll see what they mean by that. Could be worth a poke. Only 100K. http://www.alphaWorks.ibm.com/formula/rdfxml Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Tue Apr 14 15:17:59 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:28 2004 Subject: Announcement: SAX Java Implementation (pre-release) Message-ID: <01bd67a7$c384c6a0$1e09e391@mhklaptop.bra01.icl.co.uk> >I have put together a new, beta version of SAX with quite a few >changes; here are some change highlights... Only one technical comment: if we allow a character stream to be passed to the parser as the primary input, it does seem inconsistent that resolveEntity() cannot return a character stream for other entities, e.g. the DTD. (In fact, if resolveEntity() could return a stream/reader, one could rely on this to provide a stream/reader for the primary input as well, simplifying the main parse() interface). Other comments are on the documentation: - I think we should recommend people implementing the callback handlers to subclass from the base implementation, as this allows their code to remain compatible if the interface is widened in a later SAX version. - The specifications tend to say "you" when they mean the application writer, and "the parser" when they mean the parser writer. It would be useful to spell out the roles more clearly (call them the parser and the application) and make it very clear for each interface whether it is supplied by the parser or by the application. Mike Kay, ICL xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From szpak at well.com Tue Apr 14 15:53:37 1998 From: szpak at well.com (Mark Szpakowski) Date: Mon Jun 7 17:00:28 2004 Subject: Netscape XML Newsgroup In-Reply-To: <01bd67a7$c384c6a0$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <199804141353.GAA15267@smtp.well.com> Netscape now has an XML newsgroup (netscape.dev.xml), as part of its DevEdge program: http://developer.netscape.com:90/members/doc/subscriber/doc/newsgroups/xml. html Details on Mozilla's (Communicator 5) support for XML have been posted at http://www.mozilla.org/rdf/doc/xml.html Cheers, Mark ___________________________________ Mark Szpakowski Systems Architect WorldNav Learning Systems Inc. Halifax, Nova Scotia, Canada mark.szpakowski@learningengine.com Voice 902/422-1577 Fax 902/422-4964 ___________________________________ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stober at iag.net Tue Apr 14 17:40:18 1998 From: stober at iag.net (Robert Stober) Date: Mon Jun 7 17:00:28 2004 Subject: Problems parsing XML Message-ID: <35333D06.45A2F6F@iag.net> Hi all, I've been working on a java xml browser application. I got the base code from the book "XML Complete" by Steven Holzner. Some of the code shown below comes directly from that book, then I show my modifications. I am using the msxml parser version 1.8. 
The book I referred to was using an earlier version, and the packages have changed. In fact, Java itself may have changed. This code is from page 231 of the aforementioned book; it is supposed to get the document root, then its children, and then enumerate over those children.

    void showRecord(int recordNumber) {
        Element root = d.getRoot();
        Element elem = null, elem2 = null;
        Enumeration enum = root.getChildren();
        for(int index = 0; index <= recordNumber; index++){
            elem = (Element)enum.nextElement();
        }
        Enumeration enum2 = elem.getChildren();
        for(int index = 0; index < 2; index++){
            elem2 = (Element)enum2.nextElement();
            if (elem2.getTagName().equals("FIRSTNAME")) {
                text1.setText(elem2.getText());
            }
            if (elem2.getTagName().equals("LASTNAME")) {
                text2.setText(elem2.getText());
            }
        }
    }

Using Sun's JDK 1.1.5 this code gives an error: explicit cast needed to convert class ElementCollection to java.util.Enumeration. This is because in the line:

    Enumeration enum = root.getChildren(); // getChildren returns ElementCollection

getChildren() returns an object of type ElementCollection, not Enumeration. So I made some changes to the code as shown below:

    void showRecord(int recordNumber) {
        Element root = d.getRoot();
        Element elem = null, elem2 = null;
        ElementEnumeration enum = new ElementEnumeration(root);
        for(int index = 0; index <= recordNumber; index++) {
            elem = (Element)enum.nextElement();
        }
        ElementEnumeration enum2 = new ElementEnumeration(elem);
        for(int index = 0; index < 2; index++) {
            elem2 = (Element)enum2.nextElement();
            if (elem2.getTagName().equals("FIRSTNAME")) {
                text1.setText(elem2.getText());
            }
            if (elem2.getTagName().equals("LASTNAME")) {
                text2.setText(elem2.getText());
            }
        }
    }

So now I'm using the msxml ElementEnumeration class, which seems like it should work. And it compiles just fine. But when I run it something goes wrong...
    Document root is: com.ms.xml.om.ElementImpl[tag=DOCUMENT, type=0, text=null]
    elem is: com.ms.xml.om.ElementImpl[tag=null,type=12, text= ]
    java.lang.NullPointerException:
        at employees.showRecord(employees.java:111)
        at employees.init(employees.java:70)

Here's the xml document I'm trying to parse. You'll see that the next tag is NAME not null!

]> Franklin Tom Guertin Phoebe Johnson Frank Tomlin Brenda Edwards Tina

Okay, so what am I doing wrong? Has anybody else run into this? Any help any of you could provide will be greatly appreciated! Robert Stober stober@iag.net stoberrm@orl.wec.com -------------- next part -------------- A non-text attachment was scrubbed... Name: vcard.vcf Type: text/x-vcard Size: 376 bytes Desc: Card for Robert Stober Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980414/28f15aff/vcard.vcf
From crism at ora.com Tue Apr 14 17:55:44 1998 From: crism at ora.com (Chris Maden) Date: Mon Jun 7 17:00:28 2004 Subject: Problems parsing XML In-Reply-To: <35333D06.45A2F6F@iag.net> (message from Robert Stober on Tue, 14 Apr 1998 10:40:07 +0000) Message-ID: <199804141600.MAA06892@geode.ora.com>
[Robert Stober]
> I've been working on a java xml browser application. I got the base
> code from the book "XML Complete" by Steven Holzner. Some of the
> code shown below comes directly from that book, then I show my
> modifications.
Is there a reason you need to write your own parser? There are at least five publicly available (James Clark, Microstar, DataChannel, Tim Bray, Microsoft). One fundamental flaw in _XML Complete_ is Holzner's apparent belief that you must write Java code in order to do anything useful with XML. If you *want* to write a parser as an interesting exercise, great; have fun. But I just wanted to make sure you were aware that you absolutely don't need to. -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Tue Apr 14 18:09:54 1998 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:00:28 2004 Subject: Problems parsing XML In-Reply-To: <199804141600.MAA06892@geode.ora.com> References: <199804141600.MAA06892@geode.ora.com> Message-ID: * Chris Maden | | Is there a reason you need to write your own parser? There are at | least five publicly available (James Clark, Microstar, DataChannel, | Tim Bray, Microsoft). My count has now passed 17, and another 2-3 packages with parsing abilities. SAX[1] is heartily recommended. :-) [1] -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From weaver at corvusdev.com Tue Apr 14 18:58:17 1998 From: weaver at corvusdev.com (Mark Weaver) Date: Mon Jun 7 17:00:28 2004 Subject: Problems parsing XML Message-ID: <01bd67c5$35ed3c00$9ae68dce@markw-p133> > >Okay, so what am I doing wrong? Has anybody else run into this? > >Any help any of you could provide will be greatly appreciated! 
> I'm not sure what's wrong with your existing code, but you might want to try using Element.getChild() instead of the ElementEnumeration class. We use that method to walk the XML tree (with MSXML) and it works fine. Mark Weaver Corvus Development, Inc. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Apr 14 20:04:50 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:28 2004 Subject: Announcement: SAX Java Implementation (pre-release) In-Reply-To: <01bd67a7$c384c6a0$1e09e391@mhklaptop.bra01.icl.co.uk> References: <01bd67a7$c384c6a0$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <199804141804.OAA00307@unready.microstar.com> Michael Kay writes: > >I have put together a new, beta version of SAX with quite a few > >changes; here are some change highlights... > > Only one technical comment: if we allow a character stream > to be passed to the parser as the primary input, it does seem > inconsistent that resolveEntity() cannot return a character stream > for other entities, e.g. the DTD. (In fact, if resolveEntity() could return > a stream/reader, one could rely on this to provide a stream/reader > for the primary input as well, simplifying the main parse() interface). This is trivial in Java (where you could simply check the type of the returned object), but quite complicated in languages that lack dynamic typing. 
The only way to handle this in a language-independent way would be to introduce Yet Another Class to SAX:

    public abstract class SAXInputSource {
        public final static int SYSTEM_ID = 1;
        public final static int CHARACTER_STREAM = 2;
        public abstract int getType();
        public abstract String getSystemId();
        public abstract SAXCharacterStream getCharacterStream();
    }

I'm actually wondering how badly we need the SAXEntityResolver at all -- it seems to me that URI redirection is a very general problem that belongs outside of SAX (say, in the system libraries or in a proxy server). The only real benefit right now is that SAXEntityResolver allows an application writer to do something useful with public identifiers.
> Other comments are on the documentation:
>
> - I think we should recommend people implementing the callback
> handlers to subclass from the base implementation, as this allows
> their code to remain compatible if the interface is widened in a
> later SAX version.
For Java users, the problem is that the handler would then be unable to inherit from anything else; in any case, I was thinking of widening the interface by extension, not by modification.
> - The specifications tend to say "you" when they mean the application
> writer, and "the parser" when they mean the parser writer. It would be
> useful to spell out the roles more clearly (call them the parser and the
> application) and make it very clear for each interface whether it is
> supplied by the parser or by the application.
This is an excellent point -- I will add it to my TODO list. Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From signell at physnet.pa.msu.edu Tue Apr 14 20:42:03 1998 From: signell at physnet.pa.msu.edu (signell@physnet.pa.msu.edu) Date: Mon Jun 7 17:00:28 2004 Subject: possibility of an RTF, LaTex XML conversion process (fwd) Message-ID: <199804141841.OAA03726@physnet.pa.msu.edu> A non-text attachment was scrubbed... Name: not available Type: text Size: 1446 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980414/b901fe77/attachment.bat From mrc at allette.com.au Wed Apr 15 00:06:03 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:00:28 2004 Subject: possibility of an RTF, LaTex XML conversion process References: <3.0.5.32.19980414210105.007bd8f0@ozemail.com.au> Message-ID: <3533DD8F.539DEB5A@allette.com.au> Baden Hughes wrote: > Both Don Park and David Megginson mentioned this in the last few posts: > conversion mechanisms from RTF and LaTex to XML. Anyone know of any work on > either of these ? I'd be very interested to hear about it. If not, is there > any other interest in these conversions besides mine ? Rick Geimer wrote a conversion from RTF to XML that uses OMLE - OmniMark's free version. You can get Rick's code at http://www.sesha.com/omlette/ and OMLE at http://www.omnimark.com/develop/index.html. I have test-driven it and find it to be very good - it gives you something like what Rainbow did, but has support for CALS tables, etc. If you were to write another conversion stage on to the end, you could have some fairly specific XML with minimum pain. 
-- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Wed Apr 15 00:37:45 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:28 2004 Subject: possibility of an RTF, LaTex XML conversion process (fwd) Message-ID: <00ad01bd67f4$fdb4fd20$2ee044c6@donpark> >Amen! I am up-converting a technical book in LaTeX that has literally >thousands of format directives, each of which must be replaced by >a descriptor showing the author's intent. I used Perl to do some >automatically, but about half needed decisions by a content expert. My recommendation would be to do a dumb translation of LaTeX into XML. By doing so, you are deferring all the critical decisions which, if made prematurely, could cause information loss and taint. Once you have the XML-lized LaTeX document you have a core document to create more application-oriented XML documents from. For example, if you are interested in duplicating the layout of the original LaTeX document, you could extract the layout information and create a PGML document. If you are interested in an indexable XML document, you can extract the contents and structural elements and massage them into an easily indexable format. 
At a later point, you can inject elements representing the author's intent as well as some other content expert's interpretation (such elements should have an attribute indicating the point of view). Regards, Don Park http://www.docuverse.com/personal/index.html
From srn at techno.com Wed Apr 15 02:06:14 1998 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:00:28 2004 Subject: Problems parsing XML In-Reply-To: <199804141600.MAA06892@geode.ora.com> (message from Chris Maden on Tue, 14 Apr 1998 12:00:41 -0400) References: <199804141600.MAA06892@geode.ora.com> Message-ID: <199804142304.TAA01028@bruno.techno.com>
[Chris Maden :]
> One fundamental flaw in _XML Complete_ is Holzner's apparent belief
> that you must write Java code in order to do anything useful with
> XML.
Yes. I'm following a forum about "XML in Python" that's pretty interesting. It's definitely not Java. http://www.python.org/mailman/listinfo/xml-sig -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Wed Apr 15 05:36:41 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:28 2004 Subject: Problems parsing XML References: <199804141600.MAA06892@geode.ora.com> <199804142304.TAA01028@bruno.techno.com> Message-ID: <35342AE5.3B1C@hiwaay.net> Steven R. Newcomb wrote: > > [Chris Maden :] > > > One fundamental flaw in _XML Complete_ is Holzner's apparent belief > > that you must write Java code in order to do anything useful with > > XML. > > Yes. I'm following a forum about "XML in Python" that's pretty > interesting. It's definitely not Java. > > http://www.python.org/mailman/listinfo/xml-sig Markup doesn't care. That's the beauty of it. :- len xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Wed Apr 15 06:34:32 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:00:28 2004 Subject: Netscape XML Newsgroup References: <199804141353.GAA15267@smtp.well.com> Message-ID: <353438A3.1F0A978A@allette.com.au> Mark Szpakowski wrote: > Netscape now has an XML newsgroup (netscape.dev.xml), as part of its > DevEdge program: > > http://developer.netscape.com:90/members/doc/subscriber/doc/newsgroups/xml. > html Are you certain of this address? I have tried sporadically over the past seven hours and have not been able to connect. 
-- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________
From jjc at jclark.com Wed Apr 15 06:45:37 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:00:28 2004 Subject: Announcement: SAX Java Implementation (pre-release) References: <3.0.32.19980412151952.00b2e5e4@pop.intergate.bc.ca> <35315D68.BBCF0DCB@infinet.com> Message-ID: <3531CEFE.D90242F6@jclark.com>
Tyler Baker wrote:
> Most people already know that the I/O implementation of the JDK is not too great
> as it is, for example every time you read a single character from the Reader
> class, it instantiates a new Array object as an argument necessary for delegating
> to another read method.
It's optimized for reading multiple characters. Using the single character read would be a total loss however it's implemented because of synchronization overhead. I don't think there's much wrong with the Reader class itself. InputStreamReader, however, leaves something to be desired because it doesn't allow users to supply their own character-to-byte conversion routines. But if you have an InputStream you should be using the interface to the parser that takes an InputStream. In any case it's not practical to use an InputStreamReader for XML because that won't deal with XML's rules for detecting encodings.
James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From szpak at well.com Wed Apr 15 12:25:38 1998 From: szpak at well.com (Mark Szpakowski) Date: Mon Jun 7 17:00:28 2004 Subject: Netscape XML Newsgroup In-Reply-To: <353438A3.1F0A978A@allette.com.au> References: <199804141353.GAA15267@smtp.well.com> Message-ID: <199804151024.DAA27483@smtp.well.com> At 02:33 PM 4/15/98 +1000, Marcus Carr wrote: >Are you certain of this address? I have tried sporadically over the past seven >hours and have not been able to connect. The most direct route is to point your newsreader to secnews.netscape.com, and then subscribe to netscape.dev.xml: that works, I just tried it. - Mark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Wed Apr 15 13:29:14 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:28 2004 Subject: SAX: Byte Stream Needed? In-Reply-To: <3531CEFE.D90242F6@jclark.com> References: <3.0.32.19980412151952.00b2e5e4@pop.intergate.bc.ca> <35315D68.BBCF0DCB@infinet.com> <3531CEFE.D90242F6@jclark.com> Message-ID: <199804151128.HAA00297@unready.microstar.com> James Clark writes: [...] > InputStreamReader, however, leaves something to be desired because > it doesn't allow users to supply their own character-to-byte > conversion routines. 
But if you have an InputStream you should be
> using the interface to the parser that takes an InputStream. In
> any case it's not practical to use an InputStreamReader for XML
> because that won't deal with XML's rules for detecting encodings.
I have actually been toying with omitting the byte-stream parse() method altogether, so that there would be only two parse methods:

    public abstract void parse (String publicId, String systemId)
        throws java.lang.Exception;
    public abstract void parse (String publicId, String systemId,
                                SAXCharacterStream input)
        throws java.lang.Exception;

I've defined SAXCharacterStream as follows:

    public interface SAXCharacterStream {
        public abstract int read () throws SAXException;
        public abstract int read (char ch[], int start, int count)
            throws SAXException;
    }

(Where SAXException is, in the Java version, a direct and unmodified subclass of java.io.IOException). The result of either method is -1 if there are no characters left to read; otherwise, it is a UTF-16 character value for the first, and the number of characters read for the second. The advantage of using SAXCharacterStream is that behaviour over CORBA (or, I suppose, DCOM) is now well-defined. The disadvantage is another bloody interface. I had also written a SAXByteStream, but then I started wondering why we really need it -- information coming from a database, for example, or from a buffer should already be in characters, not in raw bytes (and in Java, at least, it is simple to wrap a Reader around any InputStream when necessary -- I expect that other languages will have good internationalisation support soon). Can anyone put forward a convincing case for having a standard SAX method parsing from a raw byte stream (remembering that implementations can always extend the SAXParser interface themselves for special requirements)? Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd.
dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dserres at kingston.hummingbird.com Wed Apr 15 14:40:31 1998 From: dserres at kingston.hummingbird.com (Serres, Doug) Date: Mon Jun 7 17:00:28 2004 Subject: Netscape XML Newsgroup References: <199804141353.GAA15267@smtp.well.com> <199804151024.DAA27483@smtp.well.com> Message-ID: <3534AB2D.37084C6@kingston.hummingbird.com> Mark Szpakowski wrote: > At 02:33 PM 4/15/98 +1000, Marcus Carr wrote: > >Are you certain of this address? I have tried sporadically over the past > seven > >hours and have not been able to connect. > > The most direct route is to point your newsreader to secnews.netscape.com, > and then subscribe to netscape.dev.xml: that works, I just tried it. > > - Mark Or try this: http://developer.netscape.com/support/newsgroups/index.html?content=/members/doc/subscriber/doc/newsgroups/xml.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
From elharo at sunsite.unc.edu Wed Apr 15 15:20:27 1998 From: elharo at sunsite.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:00:28 2004 Subject: possibility of an RTF, LaTex XML conversion process In-Reply-To: <199804141220.IAA00323@unready.microstar.com> References: <3.0.5.32.19980414210105.007bd8f0@ozemail.com.au> <005501bd670b$02089520$2ee044c6@donpark> <3.0.5.32.19980414210105.007bd8f0@ozemail.com.au> Message-ID:
>Baden Hughes writes:
> > Both Don Park and David Megginson mentioned this in the last few
> > posts: conversion mechanisms from RTF and LaTex to XML. Anyone know
> > of any work on either of these ? I'd be very interested to hear
> > about it. If not, is there any other interest in these conversions
> > besides mine ?
>
The MathML folks have considered the issues of transforming TeX to MathML throughout the development of MathML. I'm not sure how far that's gone yet. +-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@sunsite.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | Java Secrets (IDG Books 1997) | | http://www.amazon.com/exec/obidos/ISBN=0764580078/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Cafe au Lait | | http://sunsite.unc.edu/javafaq/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
From robnett at cig.mot.com Wed Apr 15 16:35:42 1998 From: robnett at cig.mot.com (Scot Robnett) Date: Mon Jun 7 17:00:28 2004 Subject: Netscape XML Newsgroup References: <199804141353.GAA15267@smtp.well.com> <199804151034.FAA27767@quoth.cig.mot.com> Message-ID: <199804151444.KAA14162@po_box.cig.mot.com>
I had the same problem as Marcus, and secnews.netscape.com does not work for me either. Any other suggestions? -- Scot Robnett Technical Education & Documentation Cellular Infrastructure Group Cellular Networks & Space Sector Motorola ------------------------------------------------------------------- Okay, no problem. How much money have you got? ------------------------------------------------------------------
From dserres at kingston.hummingbird.com Wed Apr 15 17:03:59 1998 From: dserres at kingston.hummingbird.com (Serres, Doug) Date: Mon Jun 7 17:00:28 2004 Subject: Netscape XML Newsgroup References: <199804141353.GAA15267@smtp.well.com> <199804151034.FAA27767@quoth.cig.mot.com> <199804151444.KAA14162@po_box.cig.mot.com> Message-ID: <3534CCDF.45996932@kingston.hummingbird.com>
Scot Robnett wrote:
> I had the same problem as Marcus, and secnews.netscape.com does not work
> for me either. Any other suggestions?
It worked for me just now!
I had to be signed up as a member of dev-edge -- are you? Our firewall was set to stop all access to the secure news server port (563) and had to be set to allow these connections -- how about you? -- ======================================================================= Doug Serres (dserres@kingston.hummingbird.com), Junior Developer - R&D Business Intelligence Group, Hummingbird Communications Ltd. The important thing is not to stop questioning. - Albert Einstein ======================================================================= xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From robnett at cig.mot.com Wed Apr 15 17:55:14 1998 From: robnett at cig.mot.com (Scot Robnett) Date: Mon Jun 7 17:00:28 2004 Subject: Netscape XML Newsgroup References: <199804141353.GAA15267@smtp.well.com> <199804151034.FAA27767@quoth.cig.mot.com> <199804151444.KAA14162@po_box.cig.mot.com> <199804151516.KAA01500@quoth.cig.mot.com> Message-ID: <199804151603.MAA15621@po_box.cig.mot.com> It is probably a firewall problem. Thanks for the input. -- Scot Robnett ------------------------------------------------------------------- Okay, no problem. How much money have you got? ------------------------------------------------------------------ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
From ak117 at freenet.carleton.ca Wed Apr 15 18:23:01 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:28 2004 Subject: SAX: New Idea for Entity Resolution Message-ID: <199804151621.MAA02638@unready.microstar.com>
Here's a different idea for SAXEntityResolver, that would add the ability for an application to return a character stream for _any_ URI (rather than just the document root):

    public interface SAXEntityResolver {
        public abstract String filterSystemId (String publicId, String systemId);
        public abstract SAXCharacterStream openCharacterStream (String systemId);
    }

Here's how this would work:

1. The SAXParser calls the resolver's filterSystemId() method to see if the system ID needs to be translated. If filterSystemId() returns null, the parser will use the default system ID; if it returns a string, the parser will use that string as the system identifier.

2. The SAXParser calls the resolver's openCharacterStream() method to see if the application wants to open its own character stream. If openCharacterStream() returns null, then the parser will take care of opening a character stream itself; if it returns a character stream, then the parser will use that character stream for the entity.

I like this approach because it does not add a new interface to SAX, because it is consistent (in both cases, a null return value means 'let the parser do it'), and because it nicely separates the functions of resolving identifiers and connecting to them.
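
The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not part of SAX: the SAXEntityResolver and SAXCharacterStream shapes are taken from this thread, while StringCharacterStream, ResolverSketch, the "mem:" system identifier and the example public identifier are hypothetical names made up for the demonstration. The parser's own default (opening the URI itself) is deliberately left out.

```java
import java.io.IOException;

// As described earlier in the thread: a plain subclass of IOException.
class SAXException extends IOException {
    SAXException(String message) { super(message); }
}

// Interface shapes copied from the proposals in this thread.
interface SAXCharacterStream {
    int read() throws SAXException;
    int read(char[] ch, int start, int count) throws SAXException;
}

interface SAXEntityResolver {
    String filterSystemId(String publicId, String systemId);
    SAXCharacterStream openCharacterStream(String systemId);
}

// Hypothetical helper: a character stream over an in-memory string.
class StringCharacterStream implements SAXCharacterStream {
    private final String text;
    private int pos = 0;

    StringCharacterStream(String text) { this.text = text; }

    public int read() {
        return pos < text.length() ? text.charAt(pos++) : -1;
    }

    public int read(char[] ch, int start, int count) {
        if (pos >= text.length()) return -1;   // nothing left to read
        int n = Math.min(count, text.length() - pos);
        text.getChars(pos, pos + n, ch, start);
        pos += n;
        return n;
    }
}

// Hypothetical parser-side logic; a null return means "let the parser do it".
class ResolverSketch {
    // Step 1: give the resolver a chance to translate the system ID.
    static String resolveSystemId(SAXEntityResolver r, String publicId, String systemId) {
        String filtered = r.filterSystemId(publicId, systemId);
        return filtered != null ? filtered : systemId;
    }

    // Step 2: give the resolver a chance to open the stream itself.
    static SAXCharacterStream open(SAXEntityResolver r, String publicId, String systemId)
            throws SAXException {
        String id = resolveSystemId(r, publicId, systemId);
        SAXCharacterStream stream = r.openCharacterStream(id);
        if (stream != null) return stream;
        // A real parser would open the URI itself here; this sketch does not.
        throw new SAXException("parser default not implemented in this sketch: " + id);
    }

    static String readAll(SAXCharacterStream in) throws SAXException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[16];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) sb.append(buf, 0, n);
        return sb.toString();
    }

    // Hypothetical resolver: maps one public ID to a made-up "mem:" system ID.
    static SAXEntityResolver demoResolver() {
        return new SAXEntityResolver() {
            public String filterSystemId(String publicId, String systemId) {
                return "-//EXAMPLE//DTD Demo//EN".equals(publicId) ? "mem:demo.dtd" : null;
            }
            public SAXCharacterStream openCharacterStream(String systemId) {
                return "mem:demo.dtd".equals(systemId)
                        ? new StringCharacterStream("<!ELEMENT doc (#PCDATA)>")
                        : null;
            }
        };
    }

    public static void main(String[] args) throws SAXException {
        SAXEntityResolver r = demoResolver();
        String pub = "-//EXAMPLE//DTD Demo//EN";
        System.out.println(resolveSystemId(r, pub, "http://example.org/demo.dtd"));
        System.out.println(readAll(open(r, pub, "http://example.org/demo.dtd")));
    }
}
```

Keeping the two calls separate means a resolver that only cares about public identifiers can return null from openCharacterStream() and leave the actual connection to the parser.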
It will, of course, not be necessary for most SAX application writers to use this interface; when they do, however, they will be able to handle _any_ URI scheme, and not just URLs. On an even more interesting level, the object implementing this interface could be sitting on a remote ORB and shared by many SAX implementations (sort of like a DNS server). On the down side, I don't think that it will be possible to build this into a driver on top of any existing Java-based SAX parser, at least without minor internal modifications. Comments? All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From signell at physnet.pa.msu.edu Wed Apr 15 21:55:34 1998 From: signell at physnet.pa.msu.edu (signell@physnet.pa.msu.edu) Date: Mon Jun 7 17:00:28 2004 Subject: possibility of an RTF, LaTex XML conversion process (fwd) Message-ID: <199804151955.PAA03979@physnet.pa.msu.edu> A non-text attachment was scrubbed... Name: not available Type: text Size: 1353 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980415/f5175366/attachment.bat From wperry at fiduciary.com Wed Apr 15 23:20:55 1998 From: wperry at fiduciary.com (W. E. 
Perry) Date: Mon Jun 7 17:00:28 2004 Subject: SAX: New Idea for Entity Resolution References: <199804151621.MAA02638@unready.microstar.com> Message-ID: <353523B9.F750EC9C@fiduciary.com> David Megginson wrote: > Here's a different idea for SAXEntityResolver, that would add the > ability for an application to return a character stream for _any_ URI > (rather than just the document root): Yes! Yes! Yes! And, in addition to the benefits you identify, this approach provides (of special interest in our niche): -- separation of notification that a document is available from the delivery and parsing of that document: Notice might consist of a very small SMTP message sent to dozens of interested parties. Each would apply its own criteria to determine the priority of the document and to schedule a suitable 'pickup'. Rather than be forced to have appropriate processes always available to deal with whatever documents might become available (perhaps from across a dozen time zones) at inconvenient times, the recipient may schedule processing of various document types to its own convenience and to suit the nature of its own business. And, of course, the routing to acquire the document might follow a very different (more secure, perhaps, or of greater bandwidth) path than the original notice. -- clear separation of document acquisition from document parsing and subsequent processing. (This is actually a *feature* of the inability to build SAXEntityResolver on top of existing Java-based SAX parsers.) Among other possibilities, this allows conclusions about the nature of a document to be based on when, where, and under what circumstances it becomes available. Those conclusions might dictate how that document most prudently should be processed. Clearly, I am looking at the underlying XML primarily as a 'wire protocol' and the document as a message, but it is not necessary to share these preconceptions in order to appreciate the functional integrity of the proposal. Thank you.
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lex at www.copsol.com Thu Apr 16 00:46:58 1998 From: lex at www.copsol.com (Alex Milowski) Date: Mon Jun 7 17:00:28 2004 Subject: SAX: New Idea for Entity Resolution In-Reply-To: <199804151621.MAA02638@unready.microstar.com> from "David Megginson" at Apr 15, 98 12:21:21 pm Message-ID: <199804152243.RAA14083@copsol.com> > Here's a different idea for SAXEntityResolver, that would add the > ability for an application to return a character stream for _any_ URI > (rather than just the document root): > > public interface SAXEntityResolver { > public abstract String filterSystemId (String publicId, String systemId); > public abstract SAXCharacterStream openCharacterStream (String systemId); > } Why not forego all this silliness of mapping system identifiers and allow users to use URN's in an intelligent way? Essentially, a URN would be used in the system identifier and you just *wouldn't* use the public identifier! Then, in your environment, orthogonal to the parser, you provide a way to resolve the URN. Since entities can't be declared with only a public identifier, public identifiers aren't very useful for interchange because I have to specify a system identifier. When I specify a system identifier, some parser might choose to use the system identifier rather than some non-standard public/system id mapping scheme. Now my document is broken from this receiver's perspective. In effect, although the above interface is useful, it reduces interchange in that I can make a document with broken system identifiers work on my system. Essentially, I can make an *invalid* document valid! 
Since I can't use public identifiers in XML in an intelligent way, I can't really recommend using them. Unlike *generic* SGML, in XML I can use URN's in system identifiers in a very intelligent way--the same way I used to use public identifiers in SGML! ;-) Makes you wonder why we have public identifiers in XML at all... they are rather useless in their *current* form! :( ============================================================================== R. Alexander Milowski http://www.copsol.com/ alex@copsol.com Copernican Solutions Incorporated (612) 379 - 3608 From ak117 at freenet.carleton.ca Thu Apr 16 03:00:10 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:28 2004 Subject: SAX: New Idea for Entity Resolution In-Reply-To: <199804152243.RAA14083@copsol.com> References: <199804151621.MAA02638@unready.microstar.com> <199804152243.RAA14083@copsol.com> Message-ID: <199804160059.UAA00368@unready.microstar.com> Alex Milowski writes: > In effect, although the above interface is useful, it reduces > interchange in that I can make a document with broken system > identifiers work on my system. Essentially, I can make an > *invalid* document valid! You can do this in any case, though -- you can intercept URIs in the system libraries (Java, for example, lets you register your own schemes), or you can redirect them with a proxy server. With URLs, file:// will almost always break on exchange, as will http: system identifiers that refer to hostnames visible only within a private network.
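As an illustration of intercepting a URI in the Java libraries, here is a minimal sketch that attaches a custom handler to a single URL so that a fetch is answered from local text instead of the network. The names LocalCopyHandler and InterceptDemo, the host internal.example, and the served text are all assumptions made up for the example; only standard java.net classes are used (the three-argument URL constructor is deprecated in recent JDKs but still present). A JVM-wide registration through URL.setURLStreamHandlerFactory is the other route, not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Hypothetical handler that answers every request with a fixed local body.
class LocalCopyHandler extends URLStreamHandler {
    private final String body;

    LocalCopyHandler(String body) { this.body = body; }

    @Override
    protected URLConnection openConnection(URL url) {
        return new URLConnection(url) {
            @Override
            public void connect() { connected = true; }   // nothing to connect to

            @Override
            public InputStream getInputStream() {
                return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
            }
        };
    }
}

class InterceptDemo {
    // Opens 'spec' through the custom handler and returns what was served.
    static String fetch(String spec, String localBody) {
        try {
            URL url = new URL(null, spec, new LocalCopyHandler(localBody));
            try (InputStream in = url.openStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The document still names the original system identifier; only the local environment decides what bytes come back, which is why the same document may behave differently elsewhere.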
Your other points (which I omitted above) are well taken -- public identifiers are a bit of a muddle right now, but since they're in XML 1.0, it makes sense to support them. The interface is not only for public identifiers, however -- users can also remote URIs to local/secure equivalents, and they can even screen out certain URIs if necessary. I'd better copyright "XML-Nanny" before someone else thinks of it. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lex at www.copsol.com Thu Apr 16 06:46:28 1998 From: lex at www.copsol.com (Alex Milowski) Date: Mon Jun 7 17:00:28 2004 Subject: SAX: New Idea for Entity Resolution In-Reply-To: <199804160059.UAA00368@unready.microstar.com> from "David Megginson" at Apr 15, 98 08:59:03 pm Message-ID: <199804160442.XAA14214@copsol.com> > Alex Milowski writes: > > > In effect, although the above interface is useful, it reduces > > interchange in that I can make a document with broken system > > identifiers work on my system. Essentially, I can make an > > *invalid* document valid! > > You can do this in any case, though -- you can intercept URIs in the > system libraries (Java, for example, lets you register your own > schemes), or you can redirect them with a proxy server. > > With URLs, file:// will almost always break on exchange, as will http: > system identifiers that refer to hostnames visible only within a > private network. Yes, but then if you do this, don't expect it to work elsewhere. ;-) Why would you use absolute URLs? Bad author, bad! 
Ok, maybe you would use them for a standard DTD. ;-) (This is where I beat the URN drum) In the SGML world, I could come up with a scheme that made location orthogonal to my documents. I *never* put a system identifier in my documents. In XML, this is much harder. Now, if URN support was *standard*, I could at least put a URN in the place of every system identifier I needed and then my document is quite portable. The key phrase here is *standard*. Of course, we could also fix public identifiers and forget about the URN stuff. ...but, then we would have to come up with yet-another-resolution-mechanism... which sounds too much like URNs. > Your other points (which I omitted above) are well taken -- public > identifiers are a bit of a muddle right now, but since they're in XML > 1.0, it makes sense to support them. The interface is not only for > public identifiers, however -- users can also remote URIs to > local/secure equivalents, and they can even screen out certain URIs if > necessary. I'd better copyright "XML-Nanny" before someone else > thinks of it. Well, a further point I was making off-line is that this kind of mapping could lead people down the wrong road. I have run into so many SGML users over the years that didn't know how to or *couldn't* use public identifiers without system identifiers. In an SGML world, I see this as bad practice. Likewise, I see mapping system identifiers in XML as bad practice. Two general rules I can recommend: 1. Use an internal resolution system inside your production systems. Locations will change even inside your own system. 2. Use a fairly static naming system (URN/Public identifier) when you exchange documents. One thing XML has over SGML is that it is tied more closely to a location mechanism. If you add in URN ability, there is no issue of "configuring" your local system to know about mappings--you just do a URN lookup. (Obviously, URNs can be misconfigured or not available.
Ever had problems on the Internet with DNS names? Same idea, same problem, same frustration when it is wrong!) ============================================================================== R. Alexander Milowski http://www.copsol.com/ alex@copsol.com Copernican Solutions Incorporated (612) 379 - 3608 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From john at totten.com Thu Apr 16 08:27:29 1998 From: john at totten.com (John Totten) Date: Mon Jun 7 17:00:28 2004 Subject: Taxonomies in XML Message-ID: <3535A5C6.2667@totten.com> The 30 or so XML files that represent the El Limon Weeds Collection (one separate file for each weed) will impress a Web Master but not a botanist because you cannot produce a taxonomy from them. How can you add nodes and unlimited nesting to XML documents? http://www.honeylocust.com/limon/xml/ John Totten CPT Inc Sitka Alaska xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Thu Apr 16 15:40:26 1998 From: eliot at isogen.com (W. 
Eliot Kimber) Date: Mon Jun 7 17:00:29 2004 Subject: Taxonomies in XML Message-ID: <3.0.32.19980416083632.006c8aa4@postoffice.swbell.net> At 10:31 PM 4/15/98 -0800, John Totten wrote: >The 30 or so XML files that represent the El Limon Weeds Collection >(one separate file for each weed) will impress a Web Master but not a >botanist because you cannot produce a taxonomy from them. > How can you add nodes and unlimited nesting to XML documents? By editing them? XML documents have no inherent nesting limit (although there will always be a practical limit imposed by your processing software). If a document does not have an explicit DTD, then, by definition, you are free to change it at will, because it defines its rules by its own content. If a document does have an explicit DTD, then, by definition, you are free to change it at will because the DTD is a property of the document--the document defines its own rules by declaring them in *its* DTD. If the DTD is an external DTD subset that you don't have write access to, just copy it into the internal subset and go on your way. [Hint to ADEPT*Editor users: try the command 'dtgen' from the ADEPT command line if someone has tried to impose a "standard" DTD on you.] You can also create taxonomies using references or hyperlinks, e.g.,: ]> Seigfried Woods Bete Noir Woods Forrest Woods ]> A catish thing A dogish thing This could also be done with extended links: Cheers, Eliot --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
From amarshal at usc.edu Thu Apr 16 16:40:52 1998
From: amarshal at usc.edu (Andrew n marshall)
Date: Mon Jun 7 17:00:29 2004
Subject: SAX: Parser Factory class
Message-ID: <000401bd6946$b2199490$50e37d80@philica>

While it is great that there are so many SAX parsers available, the thing that bothers me is that my application has to know what Parser to load in just to use SAX. And this means that even if a SAX parser is available on the user's machine, they may still have to download the one I use to run my application. If there is already a solution for this, then please let me know.

David Megginson had mentioned the sax.parser property as a possible solution, but I would like to see the following as my general solution:

  // in package org.sax
  public abstract class ParserFactory {

    /* static methods */

    // Registers a SAX driver
    public static void addParserFactory(ParserFactory factory);

    // Retrieves all registered SAX drivers
    public static ParserFactory[] getParserFactorys();

    // Convenience method to return a parser
    // from the first registered driver
    public static Parser createDefaultParser();

    /* Instance Methods */

    // Returns a new Parser
    abstract public Parser createParser();

    // Various identifying methods
    // Subject to change to whatever methods are appropriate
    abstract public String getDriverName();
    abstract public long getDriverVersion();
    abstract public long getSAXVersion();
  }

If the above was implemented, then the following piece of code in the ParserFactory implementation could automatically register the SAX Driver.
static { org.sax.ParserFactory.addParserFactory( new MyParserFactory() ); } I think this is a relatively clean and forward-compatible way to handle SAX Drivers. And it simplifies life for SAX-based application programmers. I realize it is a bit heavyweight since it requires that the Factory be an abstract class instead of an interface, but I think the benefit is worth it. Any thoughts? Andrew n marshall student - artist - programmer http://www.media-electronica.com/anm-bin/anm "Everyone a mentor, Everyone a pupil" From peter at ursus.demon.co.uk Thu Apr 16 17:01:43 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:29 2004 Subject: XLINK discussion In-Reply-To: <98Apr13.004317edt.26882@thicket.arbortext.com> References: <98Apr12.174657edt.26881@thicket.arbortext.com> <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu> <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu> Message-ID: <3.0.1.16.19980416083741.3027c442@pop3.demon.co.uk> At 00:43 13/04/98 -0400, Eve L. Maler wrote: >Steve DeRose and I do track XML-Dev and other XML-related lists for XLink >issues, but people should try to stick to XML-Dev's original mission of >discussing implementation, and send any specific XLink comments and >requests directly to us. > >Thanks, > > Eve Thanks Eve. > >At 04:58 PM 4/12/98 -0400, David Megginson wrote: >>Daniel Pitti writes: >> >> > Is there a separate list for XLink discussion? Or is xml-dev the >> > appropriate venue for now? >> There are as yet relatively few implementations of XLink and since there has been little experience with it so far it's certainly appropriate to discuss the strategy of implementation. Eliot Kimber wrote an extremely valuable contribution last year (search the archive :-) which is worth revisiting.
Personally I feel that XLink is one of the most powerful areas of the XML family and some generic tools would be extremely useful. For example both XML-data and RDF might benefit from being expressed in XLink if a link processor were available. Is this reasonable? I am certainly looking to use Xlink as the first choice for expressing @relations' in science and technology - based on the hope that tools will become available. A major advantage - as with all XML - is that it is much more accessible to normal mortals than other approaches. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Apr 16 17:07:40 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:29 2004 Subject: Taxonomies in XML In-Reply-To: <3535A5C6.2667@totten.com> Message-ID: <3.0.1.16.19980416081124.295f916a@pop3.demon.co.uk> At 22:31 15/04/98 -0800, John Totten wrote: >The 30 or so XML files that represent the El Limon Weeds Collection >(one separate file for each weed) will impress a Web Master but not a >botanist because you cannot produce a taxonomy from them. > How can you add nodes and unlimited nesting to XML documents? > If your taxonomy is fixed and consists of a single hierarchy then XML is the most natural way to express it :-). I have done this for protein sequences on the WWW and come up with something like: This allows nesting of any depth and displays beautifully in a tree-structured browser. 
[I shall be releasing JUMBO2 very shortly - when SAX is finalised - and this will be one of the examples to show an essentially non-textual application of XML.] [I have added whitespace to the above example for human benefit. Exercise: If you are new to XML, how would you decide whether the whitespace was 'ignorable'? :-)] Note that I dare not venture further than this because taxonomies are much more complex than this - usually dynamic and hence requiring attention to renaming, equivalences, the possibility of multiple parents, etc. Much the same problems as with orgCharts :-). But if you have a fixed taxonomy, XML is wonderful. Try doing the above with relational data and asking non-experts to create the input, whereas I suspect any scientist could work with the above almost without thinking. Why did I include the 'data' in the TITLE attribute rather than content? Mainly because I had a simple display routine that picked up TITLE attributes rather than content attributes :-). If I redid it now I might move things to element content. HTH P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Apr 16 17:11:13 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:29 2004 Subject: SAX: Method Name Collisions In-Reply-To: <199804131412.KAA00789@unready.microstar.com> References: <352F5D2D.98570E7D@infinet.com> <199804101121.HAA00492@unready.microstar.com> <352F48D3.A69FC818@jclark.com> <199804122131.RAA00308@unready.microstar.com> <352F5D2D.98570E7D@infinet.com> Message-ID: <3.0.1.16.19980416082753.341f27e2@pop3.demon.co.uk> At 10:12 13/04/98 -0400, David Megginson wrote: > >The first, "getAttributeListLength", is the ugliest. It is simple to >avoid this problem by creating a separate class for SAXAttributeList, >rather than implementing it in the main driver -- what does everyone >else think about this question? > I've been away for some days, so ignore this if you've come to a conclusion... I have been gently struggling with the naming problem whilst developing JUMBO2 which is now based on the SwingSet classes. I thought about names like jumbo.xml.Tree and so on, but rejected them in favour of a small uniquifying prefix (XTree, etc.). This is rather similar to the swing use of JTree. The likelihood of collision between Tree, Node, Element, Attribute, and a few others is very high and the result is that there may have to be lines of the form: jumbo.xml.Tree tree = new com.sun.java.swing.JTree(new jumbo.xml.Node("Root")); which is the appropriate way to uniquify them. The short names are difficult to search for - searching for Tree in all files will return a large number of unwanted hits. Therefore I'm mildly in favour of SAXFoo.
I'm not passionate on this, but I also support Stroustrup's philosophy that well devised names are often better than comments. So getAttributeListLength() may possibly avoid the use of a comment. Also, where possible, anything that has the same 'feel' as the Java class libraries is a help to learning. As we all know this is a subjective matter and I'll trust David to make a good job of it. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From ak117 at freenet.carleton.ca Thu Apr 16 20:19:23 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:29 2004 Subject: SAX: Parser Factory class In-Reply-To: <000401bd6946$b2199490$50e37d80@philica> References: <000401bd6946$b2199490$50e37d80@philica> Message-ID: <199804161818.OAA00369@unready.microstar.com> Andrew n marshall writes: > While it is great that there are so many SAX parsers available, the > thing that bothers me is that my application has to know what > Parser to load in just to use SAX. And this means that even if a > SAX parser is available on the user's machine, they may still have > to download the one I use to run my application. There might be a good reason for requiring them to download anyway -- the parser on the local machine, for example, might not include external entities.
That said, I am considering including a few optional language-specific glue classes in a special package called org.xml.sax.helpers (these would not be a standard part of the SAX interface, and would vary from language to language). Among the Java helpers, I was considering the following: public class SAXParserFactory { public static SAXParser makeSAXParser () throws [a whole bunch of exceptions] { [...] } public static SAXParser makeSAXParser (String className) throws [a whole bunch of exceptions] { [...] } } The first method would attempt to load the SAX parser in the class name provided by the "sax.parser" property, and the second would attempt to load an explicitly-named SAX parser. The idea of a registry is interesting, but it creates a chicken-and-egg problem (as with the JDBC): if I haven't loaded the class yet, how can it register itself? That said, there is no reason that you or someone else could not build the more elaborate implementation if you wish -- I expect that people will build all kinds of layers on top of SAX. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at isogen.com Thu Apr 16 20:40:05 1998 From: eliot at isogen.com (W. 
Eliot Kimber) Date: Mon Jun 7 17:00:29 2004 Subject: XLINK discussion Message-ID: <3.0.32.19980416133705.006c71cc@postoffice.swbell.net> At 08:37 AM 4/16/98, Peter Murray-Rust wrote: >There are as yet relatively few implementations of XLink and since there >has been little experience with it so far it's certainly appropriate to >discuss the strategy of implementation. Eliot Kimber wrote an extremely >valuable contribution last year (search the archive :-) which is worth >revisiting. Thanks. I've made a first-draft stab at implementing XPointers in DSSSL (http://www.drmacro.com/hyprlink/xlink/) from which a variety of useful XLink-to-? transforms could be built. I'm also in the process of building a general-purpose HyTime engine that will also implement XPointers and therefore be able to support XLink (because XLink can be defined as an application of HyTime). Not sure when I'll get the XPointer part of this system working, but probably in the next couple of months. I'll be announcing the alpha version of this tool just as soon as it does something demonstrable (I'm building it for a demo at SGML/XML Europe, so I have to have that much working by May). Cheers, Eliot --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
From ak117 at freenet.carleton.ca Fri Apr 17 04:31:09 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:29 2004
Subject: Announcement: SAX 1998-04-16 pre-release
Message-ID: <199804170229.WAA04184@unready.microstar.com>

There is a new SAX pre-release available at the following URLs:

Main interfaces:
  http://www.microstar.com/XML/SAX/New/saxjava-19980416.zip

Drivers for Lark and MSXML:
  http://www.microstar.com/XML/SAX/New/saxdrivers-19980416.zip

Demos:
  http://www.microstar.com/XML/SAX/New/saxdemos-19980416.zip

Once again, I am announcing this pre-release only to XML-Dev.

*******
Changes
*******

There are some significant changes since the 1998-04-10 pre-release, mostly the result of discussion on XML-Dev:

- stripped "SAX" prefix from all class names except SAXException (and SAXParseException)
- added a new core interface org.xml.sax.CharacterStream
- added a new core class SAXParseException
- added three optional Java-specific classes: org.xml.sax.helpers.ParserFactory, org.xml.sax.helpers.ReaderAdapter, and org.xml.sax.helpers.CharacterStreamAdapter
- reworked EntityResolver to use two methods: resolveSystemId() and openEntity()
- changed parse() methods in Parser interface (InputStream removed, and Reader replaced with CharacterStream)
- changed default behaviour for warnings and recoverable errors in HandlerBase to no-op
- updated much documentation

Thank you all for your help with the last pre-release -- I will look forward to more comments and suggestions.

All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd.
dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/

From jjc at jclark.com Fri Apr 17 05:03:20 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:00:29 2004
Subject: SAX: Byte Stream Needed?
References: <3.0.32.19980412151952.00b2e5e4@pop.intergate.bc.ca> <35315D68.BBCF0DCB@infinet.com> <3531CEFE.D90242F6@jclark.com> <199804151128.HAA00297@unready.microstar.com>
Message-ID: <3536A402.4DF0249C@jclark.com>

David Megginson wrote:
>
> James Clark writes:
>
> [...]
>
> > InputStreamReader, however, leaves something to be desired because
> > it doesn't allow users to supply their own character-to-byte
> > conversion routines. But if you have an InputStream you should be
> > using the interface to the parser that takes an InputStream. In
> > any case it's not practical to use an InputStreamReader for XML
> > because that won't deal with XML's rules for detecting encodings.
>
> I have actually been toying with omitting the byte-stream parse()
> method altogether, so that there would be only two parse methods:
>
>   public abstract void parse (String publicId, String systemId)
>     throws java.lang.Exception;
>
>   public abstract void parse (String publicId, String systemId,
>                               SAXCharacterStream input)
>     throws java.lang.Exception;
>
> I've defined SAXCharacterStream as follows:
>
>   public interface SAXCharacterStream {
>     public abstract int read ()
>       throws SAXException;

Why do you need this?

>     public abstract int read (char ch[], int start, int count)
>       throws SAXException;
>   }
>
> (Where SAXException is, in the Java version, a direct and unmodified
> subclass of java.io.IOException). The result of either method is -1
> if there are no characters left to read; otherwise, it is a UTF-16
> character value for the first, and the number of characters read for
> the second.
>
> The advantage of using SAXCharacterStream is that behaviour over CORBA
> (or, I suppose, DCOM) is now well-defined. The disadvantage is
> another bloody interface.
>
> I had also written a SAXByteStream, but then I started wondering why
> we really need it -- information coming from a database, for example,
> or from a buffer should already be in characters, not in raw bytes
> (and in Java, at least, it is simply to wrap a Reader around any
> InputStream when necessary -- I expect that other languages will have
> good internationalisation support soon).
>
> Can anyone put forward a convincing case for having a standard SAX
> method parsing from a raw byte stream (remembering that
> implementations can always extend the SAXParser interface themselves
> for special requirements)?

You would be biasing SAX towards implementations that work internally by converting into UTF-16 and then parsing. Not all parsers work like this and it is not the most efficient way to write a parser. My parsers work directly on a stream of bytes and don't convert to a character stream first. That's one reason why they are faster than other parsers. In fact the way I would implement support for a SAXCharacterStream is to wrap an InputStream around it to turn it into a sequence of bytes.

XML implementations may well provide their own machinery for converting from bytes to characters. The system-provided facilities (as in Java) are in practice often slow, buggy (lacking surrogate support for example), with inconsistencies between platforms. By providing only SAXCharacterStream you would be preventing users from taking advantage of this machinery when not reading from a URL.
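
[Editorial sketch, not part of the original message.] The wrapping mentioned above, an InputStream placed around a character stream so that a byte-oriented parser can consume it, might look roughly like this. It is only an illustration under stated assumptions: java.io.Reader stands in for the proposed SAXCharacterStream, the class name Utf16ByteStream is hypothetical, and the bytes are emitted big-endian with no byte order mark, so the encoding would have to be communicated to the parser externally.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;

/**
 * Hypothetical adapter: exposes a character stream as UTF-16 bytes
 * (big-endian, no byte order mark). Reader is an assumed stand-in for
 * the SAXCharacterStream interface discussed in this thread.
 */
class Utf16ByteStream extends InputStream {
    private final Reader reader;
    private int pendingLowByte = -1; // low byte of the current code unit, if not yet delivered

    Utf16ByteStream(Reader reader) {
        this.reader = reader;
    }

    @Override
    public int read() throws IOException {
        if (pendingLowByte >= 0) {
            int low = pendingLowByte;
            pendingLowByte = -1;
            return low;
        }
        int codeUnit = reader.read();    // one UTF-16 code unit, or -1 at end of input
        if (codeUnit < 0) {
            return -1;
        }
        pendingLowByte = codeUnit & 0xFF; // delivered on the next call
        return (codeUnit >> 8) & 0xFF;    // high byte first: big-endian order
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }
}
```

Surrogate pairs need no special handling here, because each UTF-16 code unit is simply split into two bytes. A production version would presumably also override the bulk read method for speed.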
Another reason is that the XML-defined mechanisms for specification of the encoding (with the encoding declaration and auto-detection of encodings) would not be available when reading from a character stream.

Yet another issue is that the XML spec specifies how to parse byte streams not character streams. When you try to infer from it how to parse character streams, issues arise like treatment of the byte order mark and encoding declaration which are not defined by the XML spec.

I think SAX is getting way too complicated and these should be left out for now. If you are going to have only one it should be SAXByteStream not SAXCharacterStream.

James

From jjc at jclark.com Fri Apr 17 05:05:15 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:00:29 2004
Subject: SAX: New Idea for Entity Resolution
References: <199804151621.MAA02638@unready.microstar.com>
Message-ID: <3536AFCD.54B7E396@jclark.com>

David Megginson wrote:
>
> Here's a different idea for SAXEntityResolver, that would add the
> ability for an application to return a character stream for _any_ URI
> (rather than just the document root):
>
>   public interface SAXEntityResolver {
>     public abstract String filterSystemId (String publicId, String systemId);
>     public abstract SAXCharacterStream openCharacterStream (String systemId);
>   }

This is fine except that it should use byte streams not character streams.
What you get if you are reading from the net or from an archive or a database or whatever is bytes not characters and it is part of the function of an XML processor to manage the conversion into characters using the encoding declaration and the XML specified mechanisms for encoding auto-detection. You could provide both, but the fundamental one is for a stream of bytes.

Also the EntityResolver needs to be able to indicate an externally specified encoding (as with the additional argument for parse with a SAXByteStream). In other words SAXEntityResolver needs to return an object with two members: a SAXByteStream and a (possibly null) String.

Note that given this I can trivially implement openCharacterStream by implementing a SAXByteStream that encodes UTF-16 characters as UTF-16 bytes and specifies UTF-16 as the externally specified encoding. The converse is absolutely not the case: in order to do the converse I would have to provide machinery for parsing the XML declaration and for managing encoding conversions.

This is basically what I do in XP:

import java.io.IOException;
import java.net.URL;

/**
 * This interface is used by the parser to access external entities.
 * @see Parser
 * @version $Revision: 1.4 $ $Date: 1998/02/17 04:20:32 $
 */
public interface EntityManager {
  /**
   * Opens an external entity.
   * @param systemId the system identifier specified in the entity declaration
   * @param baseURL the base URL relative to which the system identifier
   * should be resolved; null if no base URL is available
   * @param publicId the public identifier specified in the entity declaration;
   * null if no public identifier was specified
   */
  OpenEntity open(String systemId, URL baseURL, String publicId) throws IOException;
}

import java.io.InputStream;
import java.net.URL;

/**
 * Information about an open external entity.
 * This is used by EntityManager to return
 * information about an external entity that it has opened.
 * @see EntityManager
 * @version $Revision: 1.4 $ $Date: 1998/02/17 04:20:47 $
 */
public class OpenEntity {
  private InputStream inputStream;
  private String encoding;
  private URL base;
  private String location;

  /**
   * Creates and initializes an OpenEntity which uses
   * an externally specified encoding.
   */
  public OpenEntity(InputStream inputStream, String location, URL base, String encoding) {
    this.inputStream = inputStream;
    this.location = location;
    this.base = base;
    this.encoding = encoding;
  }

  /**
   * Creates and initializes an OpenEntity which uses
   * the encoding specified in the entity.
   */
  public OpenEntity(InputStream inputStream, String location, URL base) {
    this(inputStream, location, base, null);
  }

  /**
   * Returns an InputStream containing the entity's bytes.
   * If this is called more than once on the same
   * OpenEntity, it will return the same InputStream.
   */
  public final InputStream getInputStream() {
    return inputStream;
  }

  /**
   * Returns the name of the encoding to be used to convert the entity's
   * bytes into characters, or null if this should be determined from
   * the entity itself using XML's rules.
   */
  public final String getEncoding() {
    return encoding;
  }

  /**
   * Returns the URL to use as the base URL for resolving relative URLs
   * contained in the entity.
   */
  public final URL getBase() {
    return base;
  }

  /**
   * Returns a string representation of the location of the entity
   * suitable for use in error messages.
   */
  public final String getLocation() {
    return location;
  }
}

James

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Apr 17 05:58:43 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:30 2004 Subject: Problems parsing XML Message-ID: <3.0.32.19980416182640.007448a4@pop.intergate.bc.ca> At 10:35 PM 14/04/98 -0500, len bullard wrote: >> [Chris Maden :] >> > One fundamental flaw in _XML Complete_ is Holzner's apparent belief >> > that you must write Java code in order to do anything useful with >> > XML. >Markup doesn't care. That's the beauty of it. :- Yes! What he said. As a result of having been a programmer since A.D. 1979, my faith in interoperable APIs is torn and shredded. But I think that interoperable syntax is usefully achievable. Hence, XML. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elm at arbortext.com Fri Apr 17 06:07:39 1998 From: elm at arbortext.com (Eve L. Maler) Date: Mon Jun 7 17:00:30 2004 Subject: XLINK discussion In-Reply-To: <98Apr16.110049edt.26881@thicket.arbortext.com> References: <98Apr13.004317edt.26882@thicket.arbortext.com> <98Apr12.174657edt.26881@thicket.arbortext.com> <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu> <3.0.1.32.19980410075048.00a7a8e0@jefferson.village.virginia.edu> Message-ID: <98Apr17.000530edt.26882@thicket.arbortext.com> At 04:37 AM 4/16/98 -0400, Peter Murray-Rust wrote: ... 
>There are as yet relatively few implementations of XLink and since there >has been little experience with it so far it's certainly appropriate to >discuss the strategy of implementation. Eliot Kimber wrote an extremely >valuable contribution last year (search the archive :-) which is worth >revisiting. > >Personally I feel that XLink is one of the most powerful areas of the XML >family and some generic tools would be extremely useful. For example both >XML-data and RDF might benefit from being expressed in XLink if a link >processor were available. Is this reasonable? > >I am certainly looking to use Xlink as the first choice for expressing >@relations' in science and technology - based on the hope that tools will >become available. A major advantage - as with all XML - is that it is much >more accessible to normal mortals than other approaches. I think it's reasonable for XML-Data, RDF, and any other "application of XML" to use XLink if it needs to contain linking elements. RDF, at least, already plans to use XLink for references to resources. I'm starting to see a lot of small, unprepossessing implementations of XLink and XPointer all over the place. I doubt any of these are wholly conforming yet (and the specs are far too new for this yet!), but it's encouraging to see that implementation is not only not impossible, but reasonable to accomplish. Eve xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Apr 17 07:50:14 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:00:30 2004 Subject: Character Stream vs. Byte Stream proposal... 
Message-ID: <3536EE79.9CCD408D@infinet.com>

Why not simply have a standard factory that takes any type of InputStream (UTF-16, UTF-8, etc.), similar to how the parse method works, and returns a type (say CharacterStream) which can then be passed to either the parser or the application? In this case the implementations for doing all of this low-level character reading from bytes could be standardized for each platform. This way you could have a lot of different parsers that don't carry redundant character-converting implementations, which, as I have seen, add almost 50% code bloat in some instances.

Yes, this would mean a concrete implementation for all of these types of streams in a CharacterStream factory would have to be agreed upon for each language, but I feel this is absolutely essential to SAX, as it makes writing parsers a ton easier since you don't have to worry about very low-level encoding formats that can take years to learn. Java would not be successful at all if all the low-level stuff was simply defined as interfaces and not as concrete implementations. If XML adds or removes encoding formats or other low-level specifications in the future, many parser writers may not have the time or expertise to redo everything all the time. The closest analogy I can think of is if everyone had to write their own Java version of System.arraycopy(). Having about 5 billion different byte-to-character translation implementations would be akin to having 5 billion java.util.Vector implementations.

Nevertheless, the standard factory could be represented as an interface so that parsers which absolutely need to do their own byte-to-character translation could do so. The closest analogy I can think of to this is the pluggable sockets framework in JDK 1.1 and beyond.

Any ideas? I don't want to see SAX turn into an interface explosion, nor do I feel all parsers should do the most redundant activities possible at the I/O level.
Last but not least, some parsers (such as the one I have) could of course benefit immensely from having a concrete default implementation for these character streams, since for people like me low-level byte-to-character I/O is not a personal forte.

The parser I have written uses its own proprietary XML Object framework, which I feel is more efficient in some respects for modeling data in Java than an event-based parser like SAX. It is non-validating at the moment (unfinished), and it seems to parse roughly twice as fast as Aelfred right now for my documents, which was a huge surprise: 220 milliseconds parsing vs. Aelfred's 459 milliseconds after several tests. Spitting out XML data in a tree-like form consistently took under 20 milliseconds. Please take these numbers with a grain of salt, as the parser is currently pretty much non-validating and the XML documents were not large enough, I feel, to do any true comparison.

The main goal of the framework was to eliminate the common if-then-else handling in an event-based parser, which may be part of the speed increase. Simply having a fast parser is, I feel, not useful if the way it spits out data to applications requires significant overhead to handle. This approach, I feel, has significant advantages over event-based parsing; however, it also has significant drawbacks that are hard to elaborate on unless I go in depth about how the parser works. For the actual application I have, modeling data in an event-based way has maintenance problems, and the Element factory concept of parsers like MSXML is, I feel, very resource hungry, since they essentially construct a symbolic tree at runtime (at least that is my understanding). I would have preferred not to have to do any XML parser writing at all, but I just felt that, for my particular application, both event-based parsing and a parser that represents elements as an XML tree were flawed in design for the needs my application has.
I would make the XML Object Framework free, since its design is totally removed from the application itself and we will never actually try to make money off it, but the startup I am with is in the process of incorporating itself, and until that happens I cannot just hand out stuff for free for legal reasons other than under my personal name (-:

For those interested, it handles both input and output of XML data in a way very similar to how Object serialization works in Java. In fact the application I am developing needs to represent its content in both formats for various technical and political reasons. Oh well, enough of the self-aggrandizement...

In summary, I think this would immensely help out all parser writers, not just the ones who have event-based parsers, as it would significantly reduce code bloat for SAX parsers (and therefore the applications) as well as allow all parsers to use an efficient default byte-to-character factory rather than have to muddy themselves with bit shifting of octets. If there is even a dream of dynamically loading various parsers at runtime, I think it should be a priority to eliminate as many redundancies between parsers as possible, not just for the parser writers' sake, but for the actual people who use XML in their applications. Byte-to-character conversion via a default factory interface (with a default implementation that comes with SAX) would, I think, be a good start.

Tyler

P.S. - My comments about my parser in comparison to Aelfred are in no way meant as a challenge to Aelfred, as I have the greatest respect for David. In fact, my parser in the end, with validation and such, may be much more inefficient than Aelfred or any other major parser. I guess when I finally finish it up, I will be able to see what the true results are. Nevertheless, I think my approach will significantly improve performance by the application in handling XML documents even if the parser itself is inefficient.
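
[Editorial sketch, not part of the original message.] The factory idea in this message, a default byte-to-character converter that a parser may replace with its own, could be pictured roughly as below. Every name here (CharacterStreamFactory, DefaultCharacterStreamFactory, CharacterStreams) is hypothetical and java.io.Reader stands in for the proposed CharacterStream type; nothing in the thread defines such an API. The sketch also assumes the caller already knows the encoding, so it does not attempt XML's encoding auto-detection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

/** Hypothetical factory interface: turns a raw byte stream into characters. */
interface CharacterStreamFactory {
    Reader open(InputStream bytes, String encoding) throws IOException;
}

/** Assumed default implementation that delegates to the platform's converters. */
class DefaultCharacterStreamFactory implements CharacterStreamFactory {
    @Override
    public Reader open(InputStream bytes, String encoding) throws IOException {
        // InputStreamReader rejects unknown encoding names with an
        // UnsupportedEncodingException, which is an IOException.
        return new InputStreamReader(bytes, encoding);
    }
}

/** Hypothetical registry, loosely modelled on pluggable socket factories. */
final class CharacterStreams {
    private static CharacterStreamFactory factory = new DefaultCharacterStreamFactory();

    private CharacterStreams() {
    }

    /** Lets a parser install its own byte-to-character machinery. */
    static synchronized void setFactory(CharacterStreamFactory replacement) {
        factory = replacement;
    }

    /** Opens a character stream using whichever factory is installed. */
    static synchronized Reader open(InputStream bytes, String encoding) throws IOException {
        return factory.open(bytes, encoding);
    }
}
```

A caller would write something like CharacterStreams.open(in, "UTF-8"), and a parser with faster converters would call setFactory once at start-up.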
From matthew at praxis.cz Fri Apr 17 12:15:16 1998
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:00:30 2004
Subject: Inheritance in XML (was Re: Problems parsing XML)
Message-ID: <01bd69e9$44dc8100$020b0ac0@xerius>

I can't resist jumping in at this point, since it reminds me of some thoughts I had about a topic which was being discussed a couple of weeks ago: inheritance in XML. Unfortunately I seem to have managed to lose the original mails (hint: never remove anything from the mail server if you're not at your primary machine), but the gist was that object-oriented approaches to inheritance are not applicable to XML because XML, unlike OO languages, models only data and not behavior. This led into a very interesting and apt discussion of the difference between inheritance (i.e. of behavior) and subtyping (e.g. of interfaces).

To say that OO techniques only apply to behavior is an oversimplification. Some of the basic tenets of OO (encapsulation, polymorphism) are only applicable when behavior is modelled, but I would maintain that others (inheritance, identity) are equally applicable to data. The two last examples would both be of huge benefit to XML and are both currently lacking.

Eliot Kimber indicated some scepticism as to whether OO techniques have really lived up to their hype. In terms of a controlled environment, they have. Any C programmer who has moved on to C++ will attest that OO features make it far easier to write extensible and maintainable code.
On the other hand, the promise that this would lead to interchangeable components that could be used anywhere has clearly been a flop. Why? For exactly the reason Tim mentioned in his mail: interoperable APIs never work. You can't interface with code and expect this interface to apply to any environment other than the one it was specifically designed for. This is the case whatever technology you are using (DLLs, Java, JavaBeans, Smalltalk, COM, CORBA, etc.). Hence XML.

Nevertheless, inheritance of some sort is absolutely vital if XML is to fulfill its promise. If we can't produce standard DTDs which can be extended, *without* modifying the base DTD, then many of the advantages of XML go out the window. This is as important as, say, linking facilities, and is certainly orthogonal to the current namespace proposal.

I have been giving quite a lot of thought to how inheritance (I don't really think sub-typing is the right term) could be implemented for XML. I'll have to write up the details in a separate document, as this mail is getting pretty long. In essence:

1) HyTime provides an extremely valuable and rich basis for this work, just as it has for XML-Link. However, the relevant aspects need to be extracted and presented in a more easily digestible form. Also, HyTime attempts to implement inheritance (of element content) without extending the DTD syntax. This decision should at least be reevaluated in the context of XML.

2) OO languages provide extensive facilities for inheritance of data members (quite independently of methods), and these concepts would also be very valuable in this context.

3) Additional thought must be given to adapting the content model of existing element types in a base DTD without having to write out a whole new content model. This is pretty scary, but I imagine it would be possible to define primitives saying things like:

   a) certain new element types can be inserted in front of the existing content model.
   b) certain new element types can be appended at the end of the existing content model.
   c) certain new element types can be inserted at a given location in the existing content model.
   d) etc.

I'd be really interested in reading others' thoughts on this matter.

Cheers,
Matthew

-----Original Message-----
From: Tim Bray
To: xml-dev@ic.ac.uk
Date: Friday, April 17, 1998 6:07 AM
Subject: Re: Problems parsing XML

>At 10:35 PM 14/04/98 -0500, len bullard wrote:
>>> [Chris Maden :]
>>> > One fundamental flaw in _XML Complete_ is Holzner's apparent belief
>>> > that you must write Java code in order to do anything useful with
>>> > XML.
>
>>Markup doesn't care. That's the beauty of it. :-
>
>Yes! What he said. As a result of having been a programmer since
>A.D. 1979, my faith in interoperable APIs is torn and shredded.
>But I think that interoperable syntax is usefully achievable.
>Hence, XML. -T.

From hb at ix.heise.de Fri Apr 17 12:33:06 1998
From: hb at ix.heise.de (Henning Behme)
Date: Mon Jun 7 17:00:30 2004
Subject: Examples of Mozilla's XML/CSS capabilities
References: <01bd69e9$44dc8100$020b0ac0@xerius>
Message-ID: <35372F90.9C6416C9@ix.heise.de>

Hi,

some [more?] examples to show Mozilla's capabilities to render XML data using CSS2.

http://www.mintert.com/xml/mozilla/
http://www.heise.de/ix/raven/Web/xml/tl0/w3-conf.xml

What isn't working (yet?) in one of the samples at the second site is showing the values of an element's attributes...

Best regards,
Henning Behme

iX - Magazin fuer professionelle Informationstechnik Helstorfer Str.
7 * 30625 Hannover * Germany http://www.heise.de/ix/ * +49 511 5352-374 * -361 (Fax) ------ White, adj. and n. Black (Ambrose Bierce) ------ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Fri Apr 17 12:46:17 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:30 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) Message-ID: <004f01bd69ee$1b69e6a0$1e09e391@mhklaptop.bra01.icl.co.uk> Matthew Gertner: >Some of the basic tenets of OO (encapsulation, polymorphism) are only >applicable when behavior is modelled, but I would maintain that others >(inheritance, identity) are equally applicable to data. The two last >examples would both be of huge benefit to XML and are both currently >lacking. I agree absolutely. I have found identity and subtyping to be the two biggest benefits in using an object database over a relational database. >Nevertheless, inheritance of some sort is absolutely vital if XML is to >fulfill its promise. If we can't produce standard DTDs which can be >extended, *without* modifying the base DTD, then many of the advantages of >XML go out the window. I agree that this is central. Let's leave identity out of the discussion, as that does, I think, fall into the XML Linking domain, and concentrate on what I prefer to call subtyping. There's a lot of stuff in the SGML culture that one could fall back on: architectural forms etc, but I for one find it extremely arcane and difficult to relate to my own domain of object modelling and database design, which I think is familiar to a much wider community. 
I know some people will disagree, but the way I use XML, a DTD is a schema, an element definition in a DTD is a class, a document is a database, and an element within a document is an instance of a class. What is missing is that we can't define one class (element type) as a subtype of another. Since we are only concerned with structural subtyping and not with behaviour, I don't think it would actually be difficult to define this concept. The main thing that's tricky is that you can get the "is-a" the wrong way round. If a PREFACE is-a-kind-of CHAPTER, that means you can find anything (elements, attributes) in a PREFACE that you can find in a chapter, and more besides. It also means you can reduce a PREFACE to a CHAPTER by removing these extra bits. I'm not entirely sure what "removing the extra bits" means: for example should it remove elements that cannot occur in a CHAPTER, or should it just remove the tags that surround those elements? This tends to show up the lack of semantics in the object model underlying XML. Just some thoughts... Mike Kay, ICL xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Fri Apr 17 13:24:35 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:30 2004 Subject: SAX: Character Stream vs. Byte Stream proposal... 
In-Reply-To: <3536EE79.9CCD408D@infinet.com> References: <3536EE79.9CCD408D@infinet.com> Message-ID: <199804171123.HAA00395@unready.microstar.com> Tyler Baker writes: > Why not simply have a standard factory that takes any type of > InputStream (UTF-16, UTF-8, etc) similiar to how the parse method > works and it returns a type (say CharacterStream) which can then be > passed to either the parser or the application. In this case the > implementations for doing all of this low level character reading > from bytes could be standardized for each platform. The problem is that SAX is an API, not an architecture -- that is, it attempts to impose the fewest possible constraints on implementations. There are several good reasons for this approach: 1. SAX is one of (possibly) many APIs that an XML parser will implement, and other APIs may make conflicting demands. 2. XML parsers need to compete on speed, memory usage, etc., and to do so, they need to be free to take different approaches. Right now, there are already four major constraints that SAX imposes on an XML parser writer (other than the constraints already imposed by XML 1.0): - it must be able to report basic parsing events (start/end of elements, etc.) - it must be able to take input from a character stream - it must be able to report errors to a handler without automatically throwing an exception - it must be able to call a resolver before opening external entities The first two constraints are quite reasonable; the third and fourth may already be somewhat objectionable, and the fourth, in particular, requires modifications to existing parsers. Note that it is _not_ a requirement that the parser support localisation of error messages, that it be able to report attribute types (other than "CDATA"), that it actually use anything in DTDHandler, or that it actually provide a Locator object. All the best, and thanks for the comments, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. 
dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Fri Apr 17 13:45:25 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:30 2004
Subject: SAX: New Idea for Entity Resolution
In-Reply-To: <3536AFCD.54B7E396@jclark.com>
References: <199804151621.MAA02638@unready.microstar.com> <3536AFCD.54B7E396@jclark.com>
Message-ID: <199804171143.HAA00485@unready.microstar.com>

James Clark writes:

[my example omitted]

> This is fine except that it should use byte streams not character
> streams. What you get if you are reading from the net or from an
> archive or a database or whatever is bytes not characters and it is
> part of the function of an XML processor to manage the conversion
> into characters using the encoding declaration and the XML specified
> mechanisms for encoding auto-detection. You could provide both,
> but the fundamental one is for a stream of bytes. Also the
> EntityResolver needs to be able to indicate an externally specified
> encoding (as with the additional argument for parse with a
> SAXByteStream). In other words SAXEntityResolver needs to return
> an object with two members: a SAXByteStream and a (possibly null)
> String.

I hope that people will at least admire my wisdom if I admit that I am not smart enough to figure this one out myself. I suspect that this will be the Last Great Issue with SAX before we can finalise it, so help will be appreciated.
Here are what seem to me to be the costs and benefits of supporting character streams, byte streams, or both:

* Character streams only

  Pro: - the application writer has specialised knowledge about the
         information source that the parser writer lacks; as a
         result, the application writer can better optimise the
         conversion, if necessary
       - information from dialogue boxes, internal buffers, and
         (eventually, with internationalisation) databases will all be
         characters rather than bytes
       - most programming languages are moving towards characters and
         away from processing raw bytes
       - many programming languages (such as Java) already have
         standard methods for converting byte streams to character
         streams, and application writers can use these if needed or
         desired

  Con: - the application may have to convert from bytes to characters
         itself if an input source is not available
       - the parser may have its own, internal, efficient mechanism
         for byte-stream conversion

* Byte streams only

  Pro: - supports the minimum common denominator: all platforms have
         some concept of a byte stream
       - allows parsers to use their own, efficient, internal methods
         for byte-stream conversion

  Con: - adds serious inefficiencies, since characters (say, from a
         dialog box, an internal buffer, or a database with I18N
         support) will have to be decomposed back into bytes to be
         passed to the parser, then reassembled back into characters
         by the parser
       - requires a new SAX class encapsulating a ByteStream and its
         recommended encoding

* Both Byte and Character streams

  Pro: - keeps everyone happy

  Con: - requires more interfaces
       - requires another method in the Parser interface
       - requires a new SAX class encapsulating a ByteStream and its
         recommended encoding (or perhaps the ByteStream interface
         will have a getEncoding() method)
       - will greatly complicate the EntityResolver mechanism (the
         application will need to be able to return a byte stream _or_
         a character stream -- how could I handle this?)
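
[Editorial sketch, not part of the original message.] One way to picture the "both" option is a single holder object that carries either a byte stream with an optional externally specified encoding, or a character stream, so that a resolver has only one return type. The class EntityInput and its methods are hypothetical names invented for this illustration, and java.io.InputStream and java.io.Reader stand in for the proposed SAXByteStream and SAXCharacterStream.

```java
import java.io.InputStream;
import java.io.Reader;

/**
 * Hypothetical holder for one entity's input: exactly one of the two
 * streams is set. This is an illustration only, not a proposed SAX class.
 */
final class EntityInput {
    private final InputStream byteStream;
    private final Reader characterStream;
    private final String encoding; // null: let the parser detect it from the bytes

    private EntityInput(InputStream byteStream, Reader characterStream, String encoding) {
        this.byteStream = byteStream;
        this.characterStream = characterStream;
        this.encoding = encoding;
    }

    /** Wraps raw bytes; encoding may be null if it is not known externally. */
    static EntityInput ofBytes(InputStream bytes, String encoding) {
        if (bytes == null) {
            throw new IllegalArgumentException("bytes must not be null");
        }
        return new EntityInput(bytes, null, encoding);
    }

    /** Wraps characters that the application has already decoded. */
    static EntityInput ofCharacters(Reader characters) {
        if (characters == null) {
            throw new IllegalArgumentException("characters must not be null");
        }
        return new EntityInput(null, characters, null);
    }

    boolean isCharacterStream() {
        return characterStream != null;
    }

    InputStream getByteStream() {
        return byteStream;
    }

    Reader getCharacterStream() {
        return characterStream;
    }

    String getEncoding() {
        return encoding;
    }
}
```

The parser would test isCharacterStream() once and pick its code path, which keeps the extra cost to one small class rather than a second resolver method.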

Thanks, and all the best,


David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From larsga at ifi.uio.no Fri Apr 17 14:05:37 1998
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:00:30 2004
Subject: SAX: New Idea for Entity Resolution
In-Reply-To: <199804171143.HAA00485@unready.microstar.com>
References: <3536AFCD.54B7E396@jclark.com> <199804151621.MAA02638@unready.microstar.com> <3536AFCD.54B7E396@jclark.com>
Message-ID: <3.0.1.32.19980417140007.0069fa90@ifi.uio.no>

* David Megginson
>
>Here are what seem to me to be the costs and benefits of supporting
>character streams, byte streams, or both:
>
> [pros and cons deleted]

What about using Tyler Baker's proposal with a SAXStreamFactory?

From peter at ursus.demon.co.uk Fri Apr 17 14:40:28 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:30 2004
Subject: Mixed content
In-Reply-To: <3529EEBE.7CE8245@era-t.ericsson.se>
Message-ID: <3.0.1.16.19980417123823.26cf36b8@pop3.demon.co.uk>

At 11:15 07/04/98 +0200, Leif Jonsson wrote:
>Hello!

Hello Leif - great to hear from you.

>
>My name is Leif Jonsson and I am a computer scientist student from
>Sweden.
>
>I have two basic questions, and I apologize if they are very obvious,

They are not 'obvious' - they get asked frequently and answered
frequently :-)

>but I haven't
>found a clear answer to it in any FAQ so I ask in this forum.

Since no-one else has answered, I'll try (I've been away for a few
days ...)

>
>1. Why are the restrictions in MIXED CONTENT as they are, that is, no
>order
> between the elements are allowed and you can not specify the number
>of times
> they are to appear?

This is a deliberate decision of the XML-WG - in their balance between
simplicity and power. SGML allows much more varied and powerful content
models but they are harder to use. One particular combination of mixed
content is called 'pernicious mixed content' and bites almost all
newcomers to SGML. The designers took care that pernicious mixed content
couldn't occur in XML, and that would prevent millions of new, excited,
XML users getting switched off in the first few days.
Other decisions were that simple-minded hackers (like me) cannot write
validating parsers for some of the really complex content models (that is
why the & connector is disallowed - (A & B) means exactly one A and
exactly one B, but the order doesn't matter).

>
>2. With the question above in mind, I assume there are some theoretical
>motivation
> to these restrictions, but how are you then supposed to achieve the
>above desired
> effects without adding "wrapper elements"? Is there a restructuring
>you can do or?

This is part of the trade-off. The only mixed content construct with
#PCDATA is:

    (#PCDATA|A|B)*

This means that if you want to restrict the order or the cardinality you
either have to create new elements (wrappers) or require the validation be
outside the DTD - e.g. in the application. Since some of the validation is
outside the parser anyway (see XLL specification, for example), we think
that we can live with this.

>
>Again I apologize if these are stupid questions, but at least then they
>should have short
>answers. =)

The SGML community welcomes ignorant questions (ignorance is not a crime).
They treated me extremely gently and kindly when I asked these sort of
things 3 years ago.

	P.
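
As a rough illustration of doing the validation "in the application": the
Java sketch below checks an order and cardinality rule that (#PCDATA|A|B)*
cannot express. The rule itself (exactly one A, and it must come before any
B) is a made-up example, as are the class and method names; the list of
child element names is assumed to have been collected already, for instance
from parser events.

```java
import java.util.Arrays;
import java.util.List;

class MixedContentCheck {
    // Hypothetical application rule layered on top of (#PCDATA|A|B)*:
    // exactly one A child, and it must precede every B child.
    // Character data between the children is ignored here.
    static boolean oneAThenBs(List<String> childElementNames) {
        int countA = 0;
        boolean seenB = false;
        for (String name : childElementNames) {
            if (name.equals("A")) {
                if (seenB) {
                    return false;      // an A after a B breaks the order rule
                }
                countA++;
            } else if (name.equals("B")) {
                seenB = true;
            } else {
                return false;          // the DTD would already reject this
            }
        }
        return countA == 1;            // cardinality rule for A
    }

    public static void main(String[] args) {
        System.out.println(oneAThenBs(Arrays.asList("A", "B", "B")));
        System.out.println(oneAThenBs(Arrays.asList("B", "A")));
    }
}
```

The DTD still guarantees that only A and B elements appear; the extra rule
lives entirely in application code, which is the trade-off described above.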

From peter at ursus.demon.co.uk Fri Apr 17 14:42:00 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:30 2004
Subject: JUMBO2 as a learning resource
In-Reply-To: <3.0.1.32.19980415155018.007db100@gestalt-sys.com>
Message-ID: <3.0.1.16.19980417120855.26cf54bc@pop3.demon.co.uk>

[crossposted to XML-DEV and XML-L]

At 15:50 15/04/98 -0400, Jeff Zwick (on XML-L) wrote:
>I design Internet training courseware and my company is thinking about
>creating a training course on XML. My question to the list is what books
>you recommend on XML. I'm going to use the books as a basis for the
>course. I have experience with HTML, but not XML, which is also a
>description of the types of students who would attend this type of
>training. They will have fairly substantial skills with HTML and want to
>go to the next level. So, I'm looking for books that give a good overview
>of XML (structure of XML documents, good uses for XML on the Internet,
>etc.) and relate it to HTML.

Though not strictly a 'book' I am putting the final touches to an
interactive CDROM/WWW-based tutorial package on XML, based on the current
standards. This is a 'learning by doing' approach, accompanied by some
textual material to add background. I shall make a full announcement
shortly (on this list and XML-DEV) but am waiting until the final
revisions have been made to SAX by David Megginson.

The general contents will be:
	- the current specs in HTML (and XML where available)
	- the current SAX distribution
	- AElfred and Lark parsers (SAX-compliant). I would be delighted to
	  add other parsers from other authors if they wish.
	- JUMBO2 (a completely rewritten version of JUMBO, using the
	  SwingSet (JFC) from SUN/Javasoft)
	- Shakespeare in XML
	- graded examples of XML documents covering a wide range of
	  applications.
	- self-paced tutorial material

The resources will be useful for non-programmers in that there are a
number of simple tools for exploring XML documents (searching, etc.) and
simple authoring facilities. The emphasis will be on understanding how XML
works (not how to use a particular tool). Because XSL is not finalised
there is deliberately no stylesheet implementation, and simple ideas of
style will be done under manual control. Although XLink is not finalised,
aspects of xml:link="simple" will be supported and a subset of XPointers.

In addition this will be supported by a virtual learning environment from
the Globewide Network Academy ('The virtual University of the Internet')
at http://www.gnacademy.org [1]. This will be hypermail (and possibly MOOs
if there is demand) and will include online help (and bug fixes :-).

The philosophy is:

All JUMBO2 material including source will be fully freely available on the
WWW under appropriate public license. JUMBO2 is being offered to the XML
community for communal exploration of the specs and for building prototype
applications (especially, but not exclusively in science/technology and
education).

A key feature of JUMBO2 is that I always attempt to track the specs
precisely at appropriate times. [JUMBO was developed primarily to provide
feedback to the XML-WG on possible implementation concerns, and JUMBO2
will continue in this tradition, e.g. on XLink].

The JUMBO/CDROM provides a working one-stop package of current core XML
implementations. I am extremely grateful to those who have made the
material available to me. We hope to distribute it through e-outlets such
as online bookstores.
I hope that this package will be a useful counterpart to traditional
cellulose/carbon technology and also could be a useful package for
training courses such as the one that Jeff is suggesting. Key features
will be simplicity, flexibility (JUMBO2 has several ways of displaying XML
content, both element- and mixed) and adherence to the specs.

Details are not final but we would expect to offer this in the same sort
of way as material that supports other public license products, e.g.
Linux, tcl, LaTeX, all of which are freely available but where the
convenience of CDROM-based material can be valuable in many instances.
We'd be particularly interested in anyone who would like general
e-training resources in this area.

We'd be interested in feedback.

	P + GNA.

[1] the Globewide Network Academy is a non-profit volunteer-based global
organisation devoted to developing the use of the Internet for education.
I am personally extremely grateful to many virtual colleagues in the GNA.
In particular they have given me the vision to get the VSMS off the
ground.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Fri Apr 17 14:42:07 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:30 2004
Subject: Handling unknown elements?
In-Reply-To: <352BFE1D.32BF4F27@infinet.com>
Message-ID: <3.0.1.16.19980417123721.27af549c@pop3.demon.co.uk>

At 18:45 08/04/98 -0400, Tyler Baker wrote:
>One dilemma I have been trying to figure out with XML is the problem of
>handling unknown element types and what to do with their children.
[...]
>
>Anyone here got any better ideas on this?

Well I have some ideas ... :-)

The problem I address (in JUMBO2) is "what do I do when someone sends me
an XML document without any/enough accompanying material telling me what
to do with it?" If this is similar to your problem, read on :-)

(1) If the DTD is present it can tell you if the document is valid. There
is no agreed mechanism whereby a DTD can carry additional semantics. So
your DTD could tell you if a B element can contain mixed content including
an I element - it can't tell you what they mean.

(2) There is no universal generic mechanism for adding semantics to an XML
document.

(3) If the main purpose of the document is to be rendered for humans, then
stylesheets should be used. If the author creates their own tagset and
doesn't provide a stylesheet, many XML-aficionados will give up at this
stage. i.e. a document: This is a bold italic phrase is as valid as B and
I, but the reader has to do some detective work. They'd probably give up
on most.

(4) If the main purpose of the document is for a machine to act upon it
(and not everyone realises the enormous potential of XML here), then
another way of communicating semantics has to be provided. The method I
use is to map Java classes onto elements. This can use a wide degree of
context-dependence and can be very powerful. Example: ... will draw a
chemical line drawing. ... will draw a rotatable 3-D molecule. The
JUMBO-MOL software is (obviously) application-specific and uses XPointers
extensively to decide on context.

(5) To help with the first three problems JUMBO2 now has the following
*generic* facilities which help with 'unstyled' random XML documents:
	- search the document for all elements, attributes, attribute
	  values, and PCDATA content and uniquify them
	- display this as a tree showing unique markup components. This is
	  linked to the original document (tree). Thus, I may find that
	  occurs in rec.xml. What does it mean? I can use JUMBO2 to find
	  all the occurrences of in the doc and highlight them all (almost
	  instantaneous, now :-)
	- find all 'whitespace' elements and delete them. This aids tree
	  navigation in some cases
	- display the content of any node (whether mixed or element) in
	  several different styles. These include:
		raw XML
		untagged event stream (e.g. similar to removal of unknown
		tags)
		prettyprinted XML (indented)
		whitespace specifically highlighted
		'default' styling.

The default styling applies simple heuristics to display elements. Thus
MACBETH is displayed as:

    SPEAKER: MACBETH

where the markup term is in a different font. This is useful for many
generic XML documents.

In addition JUMBO will allow you to add your own style to individual
elements. Thus in rec.xml would appear to be a list, so the user can
interactively add list-formatting to it. In your case you could arrange
that was made bold and was made italic.

[I am not prepared to 'guess' the meaning of common tags - e.g. - and the
reader has to take the responsibility for this. I would hope that the
world might converge towards common semantics for common terms, and
XML-DEV is here if anyone wishes. But if you want to use for a chemical
term rather than a paragraph, you're perfectly welcome to - XML doesn't
care :-)].

	P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg
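
The first generic facility in (5), collecting and uniquifying the element
names in an arbitrary document, might be sketched in Java as below. This is
not JUMBO2 code, which is not shown in this message: the class name
UniqueMarkup is made up, and the sketch assumes the SAX interfaces as they
later shipped with the JDK (org.xml.sax and javax.xml.parsers) rather than
the draft under discussion on this list.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class UniqueMarkup {
    // Returns the distinct element type names used in a document, sorted.
    static Set<String> elementNames(String xml) throws Exception {
        final Set<String> names = new TreeSet<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);   // a set keeps each name only once
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String doc = "<PLAY><SPEECH><SPEAKER>MACBETH</SPEAKER>"
                   + "<LINE>a</LINE><LINE>b</LINE></SPEECH></PLAY>";
        System.out.println(elementNames(doc));
    }
}
```

Attribute names and values could be gathered the same way from the
Attributes argument; no DTD or stylesheet is needed for any of this.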

From ak117 at freenet.carleton.ca Fri Apr 17 15:41:22 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:31 2004
Subject: XML Semantics (was Re: Handling unknown elements?)
In-Reply-To: <3.0.1.16.19980417123721.27af549c@pop3.demon.co.uk>
References: <352BFE1D.32BF4F27@infinet.com> <3.0.1.16.19980417123721.27af549c@pop3.demon.co.uk>
Message-ID: <199804171339.JAA00394@unready.microstar.com>

Peter Murray-Rust writes:

 > (2) There is no universal generic mechanism for adding semantics to
 > an XML document.

Peter is correct; however, there do exist two mechanisms for associating
parts of an XML document whose semantics are not known with things whose
semantics are (or might be) known.

The namespaces proposal from the W3C XML working group allows
globalisation of element type names and attribute names: that means that
it is possible to know that an element named "p" in one document type and
an element named "p" in another document type share the same name
intentionally. Any further implications are currently undefined.

Architectural forms, which are an ISO standard, take this idea even
further: with architectural forms, an application can determine that an
element type named "para" in one document type is related to (or "derived
from") an element type named "p" in another document type, and may make
processing assumptions based on the 'a-kind-of' relationship.


All the best,


David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/
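
The idea of two "p" elements sharing a name intentionally can be sketched
in Java with javax.xml.namespace.QName. This is a sketch under assumptions:
it uses the namespace-name plus local-name model of the later Namespaces
Recommendation rather than the syntax of the draft discussed here, and the
example.org URIs and the class name SharedNames are made up.

```java
import javax.xml.namespace.QName;

class SharedNames {
    // Two names are "the same name" only if both the namespace name and
    // the local part match; the prefix chosen in each document is ignored.
    static boolean sameName(QName a, QName b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Hypothetical namespace names, for illustration only.
        QName pInReport  = new QName("http://example.org/ns/text", "p", "doc");
        QName pInArticle = new QName("http://example.org/ns/text", "p", "t");
        QName pInChem    = new QName("http://example.org/ns/chem", "p");

        System.out.println(sameName(pInReport, pInArticle)); // shared on purpose
        System.out.println(sameName(pInReport, pInChem));    // same spelling only
    }
}
```

As the message says, equality of names is all this establishes; what an
application should then do with a shared "p" is left undefined.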

From kendal at interlog.com Fri Apr 17 16:25:06 1998
From: kendal at interlog.com (Rolande Kendal)
Date: Mon Jun 7 17:00:31 2004
Subject: XML toolkit recommendations please
Message-ID: <3.0.32.19980417102307.013b07b0@interlog.com>

Hi,

I am sure many of you keep abreast of XML tools. I have developed a Java
Web forum environment, and I am seeking the tools to integrate XML into
it. Freedom for commercial deployment is a must. Any recommendations would
be of interest.

Thanks,
Rolande Kendal
http://www.interlog.com/~kendal

From eliot at isogen.com Fri Apr 17 17:21:41 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:31 2004
Subject: Inheritance in XML (was Re: Problems parsing XML)
Message-ID: <3.0.32.19980417101552.0071fd04@postoffice.swbell.net>

At 12:12 PM 4/17/98 +0200, Matthew Gertner wrote:

>1) HyTime provides an extremely valuable and rich basis for this work, just
>as it has for XML-Link. However, the relevant aspects need to be extracted
>and presented in a more easily digestible form. Also, HyTime attempts to
>implement inheritance (of element content) without extending the DTD syntax.
>This decision should at least be reevaluated in the context of XML.

I appreciate the vote of confidence for architectures and hesitate to make
the next comment. However, there appears to be a general misconception
about architectures that I feel I must attempt to correct, to wit, that
architectures have ANYTHING to do with inheritance.

Matthew says "HyTime attempts to implement inheritance (of element
content) without extending the DTD syntax". This is a false statement
because HyTime DOES NOT ATTEMPT to define any form of inheritance as I
understand that word. Therefore, it is not a failing of the AFDR that it
did not extend DTD syntax (which was never a realistic option at the time
it was designed). The decision that was made was the only possible
decision at the time.

This is not to say that I object to the idea of true inheritance in SGML.
I do not. It would almost certainly be a useful facility, making the use
of architectures at least easier, if not more powerful as well. So I
appreciate the depth of thought that is being and will be put into this
issue. I simply object to the suggestion that there is anything wrong with
architectures as they stand because they fail to provide a proper or
useful inheritance mechanism. Architectures cannot fail at something they
explicitly don't try to do. I don't want people to think that they
shouldn't use architectures because they don't do inheritance.
Architectures aren't about inheritance--they are orthogonal but
synergistic concepts.

The *processing effect* of using architectures may *appear* to be
inheritance, but that is a side effect of the type of processing that
architectures enable, not a direct intent of the architectural mechanism.
Or, said another way, architectures were designed to enable
object-oriented *processing* but not object-oriented construction of
instance DTDs for enabling parsing and validation. The latter simply isn't
a requirement for the former and is orders of magnitude harder to invent,
specify, and implement.

Remember: DTDs exist for exactly two reasons:

1. To enable *syntactic* validation of instances
2. To enable the use of markup minimization features

For all other types of processing DTDs are *irrelevant*. Thus, you do not
need to think about DTDs at all in order to enable object-oriented
*processing*, which is one of the things architectures do. Architectures
also enable the syntactic validation of documents against the
architectural syntax rules (the architectural DTD), but they do not need
to provide a "DTD inheritance" mechanism in order to do that--they simply
need to enable the automatic generation of new instances that conform to
the architectural DTD. This is a pretty trivial thing to define and
implement (modulo the optional automapping facility, which, like any
markup minimization feature, complicates things a bit).

It might help to understand why architectures are designed the way they
are. Architectures are designed to give you a way to define a set of
general rules for processing documents for some specific purpose (e.g.,
hyperlinking, defining metadata, etc.). Document instances use these rules
by reference by asserting derivation from the architecture and conformance
to its rules. Because SGML can only talk about syntactic rules and because
the architecture mechanism uses SGML syntax as the base definition of its
rules, these sets of rules provide an ability to define syntactic
constraints in a way that is similar or identical to those provided by a
document's private DOCTYPE declaration. At the same time, these rules do
not impose any requirements on the names used in instances, because
avoidance of name-space incursion is a basic principle of SGML and its
related standards. Thus, a general set of rules define a set of types that
instances assert conformance to, rather than defining the instance types
directly.
Note that architectures presume additional definitions beyond the
architectural DTD but cannot, of course, define how these rules might be
specified (because there are an infinite number of useful ways to do so).

Note that the direction of pointing is from instances to types to
establish an is-a or kind-of relationship. This is merely an *assertion*
made by element *instances* (not types). This means that there is no, I
repeat, no connection between element type declarations and architectural
types ("forms"), except that the markup minimization feature of fixed
attributes lets you fix the mapping for instances as part of an element
type declaration. But it is not meaningful to say that an element type
conforms to an architectural form--only instances can conform. This
further suggests that what architectures do is not inheritance because
instances do not inherit properties from other types, they are simply
instances of types. Architectures do not define any notion of types being
derived from types. [The derivation of one architecture from another is
really the derivation of architectural *instances* from another
architecture, not derivation of the architecture. This truth is obscured
by the fact that architectural instances are normally only transient
objects used by processors and not literally instantiated as SGML
documents.]

In addition, the rules defined by an architecture need not cover the
entirety of an instance. The HyTime architecture, for example, only covers
those parts of documents involved with linking and addressing. Therefore,
the mechanism must be flexible enough to allow both different elements of
different types in the same document to be derived from different
architectures and a single element to be derived from different
architectures at once.
Because each architecture defines a distinct "processing context", there
is no problem in having a single element derived from multiple
architectures because the processing for each architecture is independent
of the processing for any other architecture. There is no "multiple
inheritance" problem because it's not inheritance. It's no different from
me saying that I conform to the rules for both male humans and licensed
drivers. These are distinct rule domains and as long as the rules for
conformance to both do not result in a conflict such that I can't satisfy
both at once, there are no problems. [For example, I could also say that I
can conform to the rules for licensed drivers and medical cadavers but I
obviously can't do both at the same time, because being a cadaver includes
a requirement that makes it impossible for me to conform to the rules for
licensed drivers.]

Note that the assertion made by elements that they conform to a given form
is NOT saying "instance element X inherits the *syntactic* properties of
architectural form Y". It is saying "instance element X *conforms to* the
syntax and semantics of architectural form Y". It is an assertion of
conformance or derivation that does not have any implications about the
content model of the instance except that it must *allow* (but not
necessarily require) instances that conform to the architectural content
rules. The only constraints architectural content models impose on
instances is the requirement for *potential* conformance. But instances
are free to allow content that would not conform, because not all
instances will be processed or validated with respect to a given
architecture.

[There may, however, be a definite processing result that looks or in fact
is inheritance, but that's inheritance of processing, which is different
from inheritance of local syntactic rules.
Object-oriented techniques are a natural way to implement processing
because you can reflect the *taxonomic* hierarchy represented by an
architecture with programmatic objects.]

For example, say I define an architecture for sections in technical
manuals with the following architectural content model: In a document, I
might have this element type, instances of which can be derived from the
Section form: ]> ... .. ...

This document is valid with respect to its own rules. It should be clear
from inspection that it allows instances that conform to the Section
architecture. It also allows instances that do not conform. It should also
be clear that the instance does not conform to the Section architecture
(even though it asserts conformance by asserting derivation from the
Section form). Thus, given an architectural element type, there is no way
to predict the content models of conforming instances except to say "it
will probably *allow* conforming instances".

Note that given an architectural element type, it is probably easy to
*generate* instance content models that will ensure conformance (e.g.,
just copy the architectural declarations into the instance and change the
names, if desired), but combining two or more forms from different
architectures into a single element type probably cannot be done
programmatically in any satisfactory way because too many arbitrary
decisions will have to be made, possibly based on variables that can only
be understood or provided by humans (such as when are instances expected
to be validated against a particular architectural derivation).

It should be clear that any notion of true inheritance of content models
from architectures to instances is problematic at best, provably
impossible at worst. In addition, it would require that the instance
parser have access to all architectural DTDs and be able to synthesize
them according to some set of combinatorial heuristics.
To my mind, this is a level of processing overhead that is unacceptably
high if all conforming parsers must support it. In particular, it seems to
be directly at odds with at least one of XML's basic principles (actually,
I can think of at least three: enabling small parsers, no options,
simplicity of specification).

By contrast, you only need to access and use an architectural DTD when you
are *validating* with respect to that architecture, which is always an
option. Validation is not a requirement for doing architecture-aware
processing. A processor for any given architecture presumably has built-in
knowledge of the forms in that architecture. In any case, DTDs only enable
validation and parsing, not processing, so they are largely irrelevant to
the issue of enabling *processing*, which is the primary purpose of
architectures. Thus, the use of architectures imposes *no requirements* on
instance parsers to do anything more than they have to do today.
Validating with respect to an architecture is a choice that users of
documents get to make.

But, doing such combination in some non-SGML schema syntax is perfectly
reasonable to contemplate because at that point you've gone outside the
minimum requirements of SGML parsing and by definition there is no
requirement that any conforming instance parser do any processing with
respect to non-SGML-syntax schemas.

Cheers,

Eliot
--
W. Eliot Kimber, Senior Consulting SGML Engineer
Highland Consulting, a division of ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004
www.isogen.com
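
The point that one element instance can assert conformance to forms from
several architectures, with each architecture's processing independent of
the others, might be modelled in Java roughly as follows. This is an
illustrative sketch only, not the AFDR mechanism itself: the class names,
the architecture names "docarch" and "linkarch", and the element name
"chapter" are all made up, and a plain map stands in for the attributes
that carry the assertions in a real document.

```java
import java.util.HashMap;
import java.util.Map;

class ArchitecturalView {
    // An element instance: its own name plus, per architecture, the form
    // it asserts conformance to. Nothing is inherited from the form.
    static final class Element {
        final String name;
        final Map<String, String> forms = new HashMap<>();

        Element(String name) {
            this.name = name;
        }

        Element derivedFrom(String architecture, String form) {
            forms.put(architecture, form);
            return this;
        }
    }

    // What a processor for ONE architecture sees: the asserted form name,
    // or null if the element makes no assertion for that architecture.
    static String formFor(Element element, String architecture) {
        return element.forms.get(architecture);
    }

    public static void main(String[] args) {
        Element chapter = new Element("chapter")
                .derivedFrom("docarch", "Section")   // hypothetical names
                .derivedFrom("linkarch", "clink");

        System.out.println(formFor(chapter, "docarch"));
        System.out.println(formFor(chapter, "linkarch"));
        System.out.println(formFor(chapter, "otherarch"));
    }
}
```

Each lookup consults only its own architecture's entry, so the two
assertions never interact; whether the instance actually satisfies either
form's content rules is a separate validation step that this sketch does
not attempt.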

From fmanola at objs.com Fri Apr 17 17:46:20 1998
From: fmanola at objs.com (Frank Manola)
Date: Mon Jun 7 17:00:31 2004
Subject: Inheritance in XML (was Re: Problems parsing XML)
In-Reply-To: <3.0.32.19980416182640.007448a4@pop.intergate.bc.ca>
Message-ID:

At 8:55 PM -0700 4/16/98, Tim Bray wrote:
>At 10:35 PM 14/04/98 -0500, len bullard wrote:
>>> [Chris Maden :]
>>> > One fundamental flaw in _XML Complete_ is Holzner's apparent belief
>>> > that you must write Java code in order to do anything useful with
>>> > XML.
>
>>Markup doesn't care. That's the beauty of it. :-
>
>Yes! What he said. As a result of having been a programmer since
>A.D. 1979, my faith in interoperable APIs is torn and shredded.
>But I think that interoperable syntax is usefully achievable.
>Hence, XML. -T.
>

and Matthew Gertner wrote:
>Eliot Kimber indicated some scepticism as to whether OO techniques have
>really lived up to their hype. In terms of a controlled environment, they
>have. Any C programmer who has moved onto C++ will attest that OO features
>make it far easier to write extensible and maintainable code. On the other
>hand, the promise that this would lead to interchangeable components that
>could be used anywhere has clearly been a flop. Why? For exactly the reason
>Tim mentioned in his mail: interoperable APIs never work. You can't
>interface with code and expect this interface to apply to any environment
>other than the one it was specifically designed for. This is the case
>whatever technology you are using (DLLs, Java, JavaBeans, Smalltalk, COM,
>CORBA, etc.). Hence XML.
These observations about the (at least so far) lack of success with truly interoperable APIs are certainly true, and the potential of interoperable syntax "feels" right, but I wonder to what extent we may be comparing apples and oranges here. Specifically, what do we mean by "interoperable"? Interoperable APIs are hard at least in part because an incredible amount of semantics are (implicitly) built into a typical API (as is suggested by Matthew's comment). Moreover, interoperable APIs are held to a "strict accountability": the programs interacting through them must work without either syntactic or semantic errors (and, with programs, these are typically all bundled up). However, if programs must agree on the precise meanings of tagged data in order to guarantee proper operation when exchanging data (and what else does a fair understanding of "interoperable" mean in this context?), won't the semantics that must be mutually understood be (approximately) just as complex? And don't we then need to consider the mechanism(s) for achieving *that* in our comparisons? After all, it's not enough that the programs be "interoperable" in the sense that they can each "operate" (e.g., read, parse, or even approximately get the meaing) on the other's data; the operation must also be "correct" in a fairly constrained sense. I have in mind all the problems large companies are having merging data from different databases into data warehouses due to sometimes subtle differences in semantics (e.g,, of what a "customer" is), even when the data item names (corresponding to markup) are the same (or, at least, fairly regular). I'm not, here, arguing *against* the idea of interoperable syntax, but I am questioning how easy it will really be to get the degree of "interoperability" we seem to be implicitly expecting. --Frank ----------------------------------------------------------------------- Frank Manola www: http://www.objs.com Object Services and Consulting, Inc. 
email: fmanola@objs.com 151 Tremont Street #22R voice: 617 426 9287 Boston, MA 02111 From lisarein at finetuning.com Fri Apr 17 18:10:33 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:00:31 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) References: Message-ID: <35378444.87F0FC5C@finetuning.com> I agree that, in the future, hopefully, these will all become semantic issues (said with tear welling in eye). lisa rein Frank Manola wrote: > > At 8:55 PM -0700 4/16/98, Tim Bray wrote: > >At 10:35 PM 14/04/98 -0500, len bullard wrote: > >>> [Chris Maden :] > >>> > One fundamental flaw in _XML Complete_ is Holzner's apparent belief > >>> > that you must write Java code in order to do anything useful with > >>> > XML. > > > >>Markup doesn't care. That's the beauty of it. :- > > > >Yes! What he said. As a result of having been a programmer since > >A.D. 1979, my faith in interoperable APIs is torn and shredded. > >But I think that interoperable syntax is usefully achievable. > >Hence, XML. -T. > > > > and Matthew Gertner wrote: > >Eliot Kimber indicated some scepticism as to whether OO techniques have > >really lived up to their hype. In terms of a controlled environment, they > >have. Any C programmer who has moved onto C++ will attest that OO features > >make it far easier to write extensible and maintainable code. On the other > >hand, the promise that this would lead to interchangeable components that > >could be used anywhere has clearly been a flop. Why? For exactly the reason > >Tim mentioned in his mail: interoperable APIs never work. 
You can't > >interface with code and expect this interface to apply to any environment > >other than the one it was specifically designed for. This is the case > >whatever technology you are using (DLLs, Java, JavaBeans, Smalltalk, COM, > >CORBA, etc.). Hence XML. > > These observations about the (at least so far) lack of success with truly > interoperable APIs are certainly true, and the potential of interoperable > syntax "feels" right, but I wonder to what extent we may be comparing > apples and oranges here. Specifically, what do we mean by "interoperable"? > Interoperable APIs are hard at least in part because an incredible amount > of semantics are (implicitly) built into a typical API (as is suggested by > Matthew's comment). Moreover, interoperable APIs are held to a "strict > accountability": the programs interacting through them must work without > either syntactic or semantic errors (and, with programs, these are > typically all bundled up). However, if programs must agree on the precise > meanings of tagged data in order to guarantee proper operation when > exchanging data (and what else does a fair understanding of "interoperable" > mean in this context?), won't the semantics that must be mutually > understood be (approximately) just as complex? And don't we then need to > consider the mechanism(s) for achieving *that* in our comparisons? After > all, it's not enough that the programs be "interoperable" in the sense that > they can each "operate" (e.g., read, parse, or even approximately get the > meaning) on the other's data; the operation must also be "correct" in a > fairly constrained sense. I have in mind all the problems large companies > are having merging data from different databases into data warehouses due > to sometimes subtle differences in semantics (e.g., of what a "customer" > is), even when the data item names (corresponding to markup) are the same > (or, at least, fairly regular). 
I'm not, here, arguing *against* the idea > of interoperable syntax, but I am questioning how easy it will really be to > get the degree of "interoperability" we seem to be implicitly expecting. > > --Frank > > ----------------------------------------------------------------------- > Frank Manola www: http://www.objs.com > Object Services and Consulting, Inc. email: fmanola@objs.com > 151 Tremont Street #22R voice: 617 426 9287 > Boston, MA 02111 From M.H.Kay at eng.icl.co.uk Fri Apr 17 18:16:09 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:31 2004 Subject: Character streams vs. Byte Streams Message-ID: <000201bd6a1c$0a7dff60$1e09e391@mhklaptop.bra01.icl.co.uk> James Clark: >This is fine except that it should use byte streams not character >streams. What you get if you are reading from the net or from an >archive or a database or whatever is bytes not characters... I have enormous respect for James's arguments as always but on this one I beg to disagree. The reason I have asked for support for character streams is so that the parser can process not only stuff stored on disc but *the output of another program*. 
For example, I have an application where the XML document is constructed as the result of an SQL query that pulls together fragments of XML stored in different places in a database. The SQL query, like most other programs I use and write, prefers to output characters rather than bytes. That, after all, is the reason XML was designed to be human-readable. And I have to say that in my experience so far, the parsers are so lightning fast compared with the application that generates the XML or consumes it, that an argument based on saving microseconds will not sway me much. I don't think there is a real problem with the XML spec. This defines the syntax of XML in terms of characters. It requires the parser to accept certain encodings of the character stream as a byte stream, but it permits the parser to accept other encodings and therefore by implication to delegate the decoding of the byte stream to another object in the system. In fact it explicitly recognises that an "external transport protocol" might have a say in the matter, and that is a term we could interpret very widely. Regards, Mike Kay, ICL From tyler at infinet.com Fri Apr 17 18:18:34 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:00:31 2004 Subject: SAX: Character Stream vs. Byte Stream proposal... 
References: <3536EE79.9CCD408D@infinet.com> <199804171123.HAA00395@unready.microstar.com> Message-ID: <35371A49.67BFF33B@infinet.com> David Megginson wrote: > Tyler Baker writes: > > > Why not simply have a standard factory that takes any type of > > InputStream (UTF-16, UTF-8, etc) similar to how the parse method > > works and it returns a type (say CharacterStream) which can then be > > passed to either the parser or the application. In this case the > > implementations for doing all of this low level character reading > > from bytes could be standardized for each platform. > > The problem is that SAX is an API, not an architecture -- that is, it > attempts to impose the fewest possible constraints on implementations. > There are several good reasons for this approach: > > 1. SAX is one of (possibly) many APIs that an XML parser will > implement, and other APIs may make conflicting demands. In this case it usually makes sense to have a separate parser for each API set rather than having code like this:

    if (parserAPICode == SAX) {
        // Do SAX parsing
    } else if (parserAPICode == Foo) {
        // Do foo parsing
    }

Conditionals like this will greatly degrade the speed of your code if every method is littered with them. Better to just write a new parser for every new API. Nevertheless, having a standard way for each parser to get at the low level stuff makes sense from a code-reuse as well as consistency standpoint. > 2. XML parsers need to compete on speed, memory usage, etc., and to do > so, they need to be free to take different approaches. I was suggesting that you would still have an interface, but a default implementation for byte to character encoding in the SAX package I feel is perfectly reasonable. I may get flames for this, but I think most parsers will compete on how they solve an application's XML handling problems (the design) not on whether one parser is 1% faster than another. 
In this case, a default solid implementation for character encoding would allow parser writers to concentrate on coming up with new and interesting ways to allow applications to model XML content, instead of having to worry about bit shifting all over the place. Typically, low-level stuff such as this I feel should be implemented once and then reused over and over again. There are only so many ways to write character encoders / decoders and I would wager that most parsers out there pretty much have very similar implementations for reading from byte streams. XML's beauty is not in the fact the spec defines support for about 6 or so different character encoding formats, it is in everything else. If another character encoding format comes out, then every SAX parser will have to possibly do a rewrite. If people could agree upon one good efficient dependable implementation, then no one (other than the people doing the 600 or so lines of character encoding implementation code) will have to do a thing. Of course, people could plug in their own character encoder / decoder implementations if they so choose, but at least they would have the choice. I really think it would have made a hell of a lot more sense for XML to have one standard encoding format, say UTF-16 or UTF-8 instead of actually defining in the spec the actual legal encoding formats. It would make much more sense I feel to just convert everything to a UTF-8 or UTF-16 format if documents were indeed in a different format, rather than to force parser writers to handle just about every major character encoding format known to man. One example would be databases which may store XML content in a proprietary character format. An XML parser for the database will need to do this translating anyways from the native character format to something defined in the XML spec (unless you want to deviate from it). Anyways, just some suggestions... Tyler 
From peter at ursus.demon.co.uk Fri Apr 17 18:40:20 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:31 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) In-Reply-To: References: <3.0.32.19980416182640.007448a4@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19980417163544.20dfff1a@pop3.demon.co.uk> At 11:52 17/04/98 -0400, Frank Manola wrote: >At 8:55 PM -0700 4/16/98, Tim Bray wrote: >>>Markup doesn't care. That's the beauty of it. :- >> >>Yes! What he said. As a result of having been a programmer since >>A.D. 1979, my faith in interoperable APIs is torn and shredded. >>But I think that interoperable syntax is usefully achievable. >>Hence, XML. -T. I support this - in a sense I have found that XML acts as a buffer between different 'APIs'. It can hold the structure of the information and, if necessary, help support transformations. It also can relieve the programmer of having to manage some of the structure. [As an example, I hold my menu information in XML. When I wish to create a Java menu I can translate that to the AWT/Swing commands whilst for some other system I can translate it independently. OK - not everything matches precisely but it's pretty good. Of course one can do it without XML - use an abstract tree-based structure - but XML simply makes it natural to think this way. 
Some of my XML 'code' never actually sees an angle bracket, but it's XML :-) > >These observations about the (at least so far) lack of success with truly >interoperable APIs are certainly true, and the potential of interoperable >syntax "feels" right, but I wonder to what extent we may be comparing >apples and oranges here. Specifically, what do we mean by "interoperable"? >Interoperable APIs are hard at least in part because an incredible amount >of semantics are (implicitly) built into a typical API (as is suggested by >Matthew's comment). Moreover, interoperable APIs are held to a "strict >accountability": the programs interacting through them must work without >either syntactic or semantic errors (and, with programs, these are >typically all bundled up). However, if programs must agree on the precise [.. and more useful stuff...] I think the success of SAX (with complete credit to DavidM, of course) is - in part - that the XML community has spent 10**4 email messages discussing the semantics of what SAX operates on. So we all agree what Attributes, Entities, etc are. [Imagine that someone just invented another language with a different idea of 'entities' in and tried to interface it to XML :-)]. This makes it possible to create an API where we are reasonably happy about resolving semantics. Of course we'll need to make sure that the implementations work harmoniously - there are still possible tweaks where different implementers may take different directions (wh*t*sp*c*, etc.). This will get harder with XLL and even tougher with RDF, etc. That's why I think it's so important to work out these communal APIs and related approaches. I shall revisit the DTD for DTDs in the near future because I think this is a useful place where we all agree (I got that feedback from the list and privately - I'll hack it next time I'm in the pub.). I'd also very much like to see implementation for some of the validity constraints (e.g. 
I'd like to see Name validation which is straightforward but very easy to get wrong or out-of-date.) Other suggestions for XML-DEV-based implementations/APIs where we all 'agree' on the semantics would be welcome. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From tyler at infinet.com Fri Apr 17 18:50:28 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:00:31 2004 Subject: Character streams vs. Byte Streams References: <000201bd6a1c$0a7dff60$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <3537220E.E913E17F@infinet.com> Michael Kay wrote: > James Clark: > >This is fine except that it should use byte streams not character > >streams. What you get if you are reading from the net or from an > >archive or a database or whatever is bytes not characters... > > I have enormous respect for James's arguments as always but on this one I > beg to disagree. The reason I have asked for support for character streams > is so that the parser can process not only stuff stored on disc but *the > output of another program*. For example, I have an application where the > XML document is constructed as the result of an SQL query that pulls > together fragments of XML stored in different places in a database. The SQL > query, like most other programs I use and write, prefers to output > characters rather than bytes. That, after all, is the reason XML was > designed to be human-readable. 
> > And I have to say that in my experience so far, the parsers are so lightning > fast compared with the application that generates the XML or consumes it, > that an argument based on saving microseconds will not sway me much. This is what I found out personally and why I decided to write my own parser for my application. When using SAX I found that my DocumentHandler implementation was taking up about 75% of the processing time while the parser was only taking 25%. The main reason for this I found is that using the String.equals() method is quite expensive and is really the only good way in SAX for recognizing elements. When I switched to the Object framework I designed the parsing times for my actual parser implementation were lower, but more importantly, the time spent in the application handling the XML content was reduced to less than the time spent in the parser which was a big surprise. > I don't think there is a real problem with the XML spec. This defines the > syntax of XML in terms of characters. It requires the parser to accept > certain encodings of the character stream as a byte stream, but it permits > the parser to accept other encodings and therefore by implication to > delegate the decoding of the byte stream to another object in the system. > In fact it explicitly recognises that an "external transport protocol" might > have a say in the matter, and that is a term we could interpret very widely. Another reason why a CharacterStreamFactory I feel is a good idea. It separates the low-level encoding aspect of characters from the rest of the parser which I feel should only really need to use one type of encoding format in the first place. If there was a default CharacterStreamFactory implementation the following I feel are important issues... 
- The default implementation should support all of the character encoding formats defined in the XML 1.0 spec.
- The default implementation should have a way to add in support for custom character encoding formats (like with DB's).
- The default implementation should have a mechanism to replace implementations for various encoding streams if the parser writer chooses to do so, either for optimization purposes he/she feels are necessary or for some other reason.
The alternative I feel is never ending code bloat like in the case with current major word processors where they all have endless amounts of kludgy code for reading each other's proprietary document formats and in the end just bloat the application's resource consumption significantly. Tyler From papresco at technologist.com Fri Apr 17 19:23:20 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:31 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) References: <004f01bd69ee$1b69e6a0$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <35379018.8E9F1EA3@technologist.com> Matthew Gertner: >Nevertheless, inheritance of some sort is absolutely vital if XML is to >fulfill its promise. If we can't produce standard DTDs which can be >extended, *without* modifying the base DTD, then many of the advantages of >XML go out the window. Michael Kay wrote: > > I agree that this is central. Let's leave identity out of the discussion, as > that > does, I think, fall into the XML Linking domain, and concentrate on what I > prefer to call subtyping. You act as if this is just a terminological difference, but it isn't. 
He is talking about one thing and you are talking about another. What he describes, "producing standard DTDs which can be extended *without* modifying the base DTD", is inheritance. It can be implemented right now through parameter entity hacks and is not subtyping. You on the other hand seem to be talking about subtyping: > I know some people will disagree, but the way I use XML, a DTD is a > schema, an element definition in a DTD is a class, a document is a > database, and an element within a document is an instance of a class. > What is missing is that we can't define one class (element type) as a > subtype of another. The only reason that the concepts *even intersect* is because a) subtyping without inheritance is often painful and leads to code duplication. I claim that architectural forms and Java "interfaces" are often painful for exactly this reason. Of course in [SG|X]ML, inheritance can be hacked with parameter entities, which is something HyTime does for its architectures. (also HyTime can only be thought of as subtyping if you use it in a restricted form...) b) inheritance without subtyping is only occasionally useful. I can't remember the last time I used "private inheritance" in C++ and I don't even remember right now if Java supports it. But the fact that the two concepts work well together does not make them synonyms. They are not. > The main thing that's tricky is that you can get the "is-a" the wrong way > round. If a PREFACE is-a-kind-of CHAPTER, that means you can find > anything (elements, attributes) in a PREFACE that you can find in a chapter, > and more besides. No it doesn't. If PREFACE is-a-kind-of CHAPTER then source code designed to handle chapters should work with prefaces. That means that PREFACE must either directly describe a *subset* of the language described by CHAPTER (i.e. have a constrained content model) or PREFACE must provide "some mechanism" for transforming its content into a language understandable by CHAPTERs. 
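The "subset of the language" reading of is-a-kind-of can be made concrete with a small sketch. It assumes, purely for illustration, that content models are written as regular expressions over space-terminated child element names; the CHAPTER and PREFACE models and the class name ContentModelSketch are hypothetical and not taken from any real DTD.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only: content models as regular expressions
// over child element names, each name followed by a single space.
class ContentModelSketch {
    // CHAPTER: a title, then one or more paragraphs or lists.
    static final Pattern CHAPTER = Pattern.compile("title (para |list )+");
    // PREFACE: a title, then one or more paragraphs (a restriction of CHAPTER).
    static final Pattern PREFACE = Pattern.compile("title (para )+");

    // Flatten a sequence of child element names into the form the patterns expect.
    static String flatten(List<String> children) {
        StringBuilder sb = new StringBuilder();
        for (String name : children) {
            sb.append(name).append(' ');
        }
        return sb.toString();
    }

    // True if the whole sequence of children is allowed by the content model.
    static boolean accepts(Pattern model, List<String> children) {
        return model.matcher(flatten(children)).matches();
    }

    public static void main(String[] args) {
        List<String> preface = Arrays.asList("title", "para", "para");
        List<String> extended = Arrays.asList("title", "para", "dedication");
        // A valid PREFACE is also acceptable wherever a CHAPTER is expected.
        System.out.println(accepts(PREFACE, preface) && accepts(CHAPTER, preface));
        // An extension with a new element falls outside CHAPTER's language.
        System.out.println(accepts(CHAPTER, extended));
    }
}
```

Code written against CHAPTER keeps working on the restricted PREFACE, whereas the extended variant is rejected until something maps its content back into CHAPTER's vocabulary.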
In real world documents, we often want to be able to have subtypes that are also extensions, which means that we need to define some transformational system (as archforms do). This transformational question is exactly what makes subtyping with extension very tricky. Subtyping without extension is trivial. This is why I have stepped back from the question of subtyping with extension and am investigating transformation languages. In particular I am right now looking at Forest Automata theory and a transformation language designed by Makoto Murata. > It also means you can reduce a PREFACE to a CHAPTER > by removing these extra bits. I'm not entirely sure what "removing the extra > bits" means: for example should it remove elements that cannot occur > in a CHAPTER, or should it just remove the tags that surround those > elements? This tends to show up the lack of semantics in the object > model underlying XML. That's exactly right. Your confusion is my confusion. The only way out is through transformation languages -- either simple, relatively weak ones like those provided by architectural forms, or more powerful (and more complicated? I don't know yet?) ones like those described by Murata-san in his various Principles of Documentation papers. They are at: http://www.geocities.com/ResearchTriangle/Lab/6259/ Unless you are much smarter than me, you will probably not find these light reading, but my hope is that the concepts can be simply expressed in a nice syntax in much the same way that regular expressions hide the nastiness of DFAs. There is in fact such a thing as a regular tree expression that is quite analogous to a regular expression. I don't yet know if these can be hooked up to an easy to use (non-programmable!) transformation language. Sorry for the brain dump. I'm late for a meeting. Paul Prescod - http://itrc.uwaterloo.ca/~papresco [Woody Allen on Hollywood in "Annie Hall"] Annie: "It's so clean down here." 
Woody: "That's because they don't throw their garbage away. They make it into television shows." From ak117 at freenet.carleton.ca Fri Apr 17 19:38:53 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:31 2004 Subject: SAX: Character Stream vs. Byte Stream proposal... In-Reply-To: <35371A49.67BFF33B@infinet.com> References: <3536EE79.9CCD408D@infinet.com> <199804171123.HAA00395@unready.microstar.com> <35371A49.67BFF33B@infinet.com> Message-ID: <199804171737.NAA01283@unready.microstar.com> Tyler Baker writes: > Typically, low-level stuff such as this I feel should be > implemented once and then reused over and over again. There are > only so many ways to write character encoders / decoders and I > would wager that most parsers out there pretty much have very > similar implementations for reading from byte streams. XML's > beauty is not in the fact the spec defines support for about 6 or > so different character encoding formats, it is in everything else. > If another character encoding format comes out, then every SAX > parser will have to possibly do a rewrite. If people could agree > upon one good efficient dependable implementation, then no one > (other than the people doing the 600 or so lines of character > encoding implementation code) will have to do a thing. Of course, > people could plug in their own character encoder / decoder > implementations if they so choose, but at least they would have the > choice. The nice thing here is that all of this can be built on top of SAX instead of inside it. 
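As a rough sketch of what "on top of SAX" could mean, the decoding step can live in a small helper outside the parser, which then only ever sees characters. The sketch below assumes the SAX classes that ship with a current JDK (org.xml.sax and javax.xml.parsers), not the 1998 draft under discussion; the class CharacterStreamSketch and its methods are hypothetical names invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper layered on top of SAX; not part of any SAX release.
class CharacterStreamSketch {
    // Byte-to-character decoding happens here, outside the parser.
    static Reader decode(InputStream bytes, Charset encoding) {
        return new InputStreamReader(bytes, encoding);
    }

    // Feed an already-decoded character stream to a SAX parser and
    // count the start tags it reports.
    static int countElements(Reader chars) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(chars), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // The document arrives as UTF-16BE bytes; the parser never needs to know.
        byte[] raw = "<doc><item/><item/></doc>".getBytes(StandardCharsets.UTF_16BE);
        Reader chars = decode(new ByteArrayInputStream(raw), StandardCharsets.UTF_16BE);
        System.out.println(countElements(chars));
    }
}
```

A replaceable decoder of the kind Tyler asks for would slot in where decode is, without any change to the parser or to the handler interfaces.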
Some implementors are already complaining -- quite understandably -- that SAX has grown far too large. However, there is a great opportunity here for someone (or a group of people) to write a separate SAX toolkit that includes what you suggest and much more (such as classes implementing AttributeList and Locator, with copy constructors). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ From jeff at texcel.no Fri Apr 17 21:01:37 1998 From: jeff at texcel.no (Jeff Larson) Date: Mon Jun 7 17:00:31 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) In-Reply-To: References: <3.0.32.19980416182640.007448a4@pop.intergate.bc.ca> Message-ID: <3.0.3.32.19980417140316.00a7a100@alterdial.uu.net> At 11:52 AM 4/17/98 -0400, Frank Manola wrote: >These observations about the (at least so far) lack of success with truly >interoperable APIs are certainly true, and the potential of interoperable >syntax "feels" right, but I wonder to what extent we may be comparing >apples and oranges here. Specifically, what do we mean by "interoperable"? >Interoperable APIs are hard at least in part because an incredible amount >of semantics are (implicitly) built into a typical API (as is suggested by >Matthew's comment). This nicely points out something that's been bothering me about all the fervor surrounding XML as a format for data representation. Having your data expressed in XML does not necessarily make it any less proprietary than having it in a nasty old binary file. 
I contend that the utility of XML diminishes as the complexity of the data model increases. Take for example an Excel spreadsheet. This is essentially the binary serialization of a complex data structure. Assume that magically this becomes an XML document, what now? You can't reason about this in any meaningful way without understanding the semantics of every element. Assuming the semantics are documented, you might be able to extract information from it (which makes it worth having), but it is doubtful that you can modify it reliably. In theory we're now "free" to implement our own spreadsheet editor on top of this "open" data model, but is that really going to happen? No, we'll continue to use the API provided by the vendor because it's the implementation of the semantics that's the hard part. Certainly there is a lot that can and should be done with XML, but I'm not convinced it's any universal data representation panacea. Jeff From robin at ACADCOMP.SIL.ORG Fri Apr 17 21:03:39 1998 From: robin at ACADCOMP.SIL.ORG (Robin Cover) Date: Mon Jun 7 17:00:31 2004 Subject: Inheritance in XML Message-ID: <199804171910.OAA23947@ACADCOMP.SIL.ORG> Apropos of the thread on inheritance and/versus subtyping, and DTD-as-schema, etc. (good posts by Prescod, Kimber, Newcomb, Gertner, Manola, Kay, etc.) 1. I have collected some of these posts in: http://www.sil.org/sgml/sgmlnew.html#inheritance980417 2. The article of Francois Chahuneau ("Beyond the SGML DTD") may be of general interest. 
See: http://www.sil.org/sgml/chahuneauXML.html -rcc ------------------------------------------------------------------------- Robin Cover Email: robin@acadcomp.sil.org 6634 Sarah Drive Dallas, TX 75236 USA >>> The SGML/XML Web Page <<< Tel: +1 (972) 296-1783 (h) http://www.sil.org/sgml/sgml.html Tel: +1 (972) 708-7346 (w) FAX: +1 (972) 708-7380 ========================================================================= From papresco at technologist.com Fri Apr 17 21:56:12 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:32 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) References: <3.0.32.19980416182640.007448a4@pop.intergate.bc.ca> <3.0.3.32.19980417140316.00a7a100@alterdial.uu.net> Message-ID: <3537B3EB.BEA7D347@technologist.com> Jeff Larson wrote: > > Certainly there is a lot that can and should be done with XML, but I'm > not convinced it's any universal data representation panacea. I don't think that anyone ever claimed it was (well, perhaps the press). But that doesn't mean that the fervour is misplaced. When is the last time you came across a technology that was an "xxx panacea" where "xxx" can be anything? If we only got excited about panaceas, we would live in a pretty boring industry. XML makes the job of defining new easy-to-read, easy-to-parse, easy-to-understand languages much, much easier. You can go out of your way to undermine those features if you want. You can also ignore them. But on average, we will be better off, which is something to get excited about. > Take for example an Excel spreadsheet. 
This is essentially the binary > serialization of a complex data structure. Assume that magically this > becomes an XML document, what now? You can't reason about this in any > meaningful way without understanding the semantics of every element. > Assuming the semantics are documented, you might be able to extract > information from it (which makes it worth having), but it is doubtful > that you can modify it reliably. You can modify it reliably if the DTD/schema is complete. If not, you guess, just as with a partially documented API. > In theory we're now "free" to > implement our own spreadsheet editor on top of this "open" data model, > but is that really going to happen? No, we'll continue to use > the API provided by the vendor because its the implementation of the > semantics that's the hard part. Not in my experience. There are dozens of tools that I can download that work with RTF, Frame MIF or PDF, and a small handful that talk to the Word, FrameMaker or Adobe Acrobat APIs. Furthermore, the formats described above have multiple independent implementations. The APIs do not. Paul Prescod - http://itrc.uwaterloo.ca/~papresco [Woody Allen on Hollywood in "Annie Hall"] Annie: "It's so clean down here." Woody: "That's because they don't throw their garbage away. They make it into television shows." xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dgulbran at vervet.com Fri Apr 17 22:06:31 1998 From: dgulbran at vervet.com (David Gulbransen) Date: Mon Jun 7 17:00:32 2004 Subject: New XML Pro Beta Available Message-ID: Vervet Logic has released the second public beta for XML Pro, our featured XML editor. 
The beta is available to download from www.vervet.com. We welcome all feedback, and look forward to releasing a product that will help all levels of users harness the power of XML. Questions regarding Vervet Logic or XML Pro should be addressed to david@vervet.com. XML Pro Features: o Full Validation with Document Type Definition Support o Creation of Well Formed documents with or without DTDs o The Element Wizard for easy element management o The Attribute Wizard for easy attribute management o Support for Entities, CDATA, and Comments o XML "view source" preview o Printing support o Easy to use, graphical user interface New Features in XML Pro Beta2: o Attribute Wizard The Attribute Wizard adds the ability to easily create and remove attributes for existing elements within your document. This eliminates the need to add attributes by hand to every instance of an element. Once attributes are added, only attributes that are assigned values are saved to the XML file. o View PCDATA in Document Tree The "View Text in PCDATA" feature allows you to display the content of PCDATA in the context of the document tree. With this option off, PCDATA will be displayed in the tree as an icon. This gives you the flexibility to see data in context, or to disable the data display when editing large documents. o CDATA Support XML Pro now supports the addition of CDATA to documents. o Comment Support XML Pro now supports the addition of Comments to XML Documents. o View XML The "View XML" feature allows you to preview the XML code that will be written to your XML file. o List Entities The List Entities feature will allow you to see a list of any entities that have been defined in a DTD. o Expand Beneath The Expand Beneath feature allows you to expand the tree beneath the selected element. o Printing Printing of the XML file has been enabled for Beta 2. 
David Gulbransen                    v e r v e t  l o g i c
President and CEO                   -----------------------
812.856.5270 | fax 812.855.4506     http://www.vervet.com

From Patrice.Bonhomme at loria.fr Fri Apr 17 23:09:16 1998
From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme)
Date: Mon Jun 7 17:00:32 2004
Subject: Announcement: JavaCC parser for the XML XPointer Language
Message-ID: <199804172108.XAA08862@chimay.loria.fr>

I wrote a small JavaCC grammar to parse the XML XPointer Language. I
included two test files (a bad one and a good one) containing the
Working Draft examples to test the parser itself. It does nothing but
parse.

I am currently working on a full implementation of the XML Link
language based on the MSXML parser. I will make a full package
available as soon as possible.

All comments are welcome...

Requirement

  Java Parser Generator, JavaCC 0.7.1
    - http://www.sun.com/suntest/JavaCC/index.html
  JDK 1.1.x

Compile

  javacc XPointerParser.jj
  javac -depend XPointerParser.java

Using

  Reading from standard input:
    java XPointerParser

  Reading from a file:
    java XPointerParser bad.xll
    java XPointerParser good.xll

The test file bad.xll was taken from the XPointer Working Draft. I
modified this file to make it conformant to the specification (test
file good.xll).

Availability

  http://www.loria.fr/~bonhomme/XPointer/

Pat.
-- ============================================================== bonhomme@loria.fr | Office : B.228 http://www.loria.fr/~bonhomme | Phone : 03 83 59 30 52 -------------------------------------------------------------- * Serveur Silfide : http://www.loria.fr/projets/Silfide * Projet Aquarelle : http://aqua.inria.fr ============================================================== xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jeff at texcel.no Sat Apr 18 00:12:01 1998 From: jeff at texcel.no (Jeff Larson) Date: Mon Jun 7 17:00:32 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) In-Reply-To: <3537B3EB.BEA7D347@technologist.com> References: <3.0.32.19980416182640.007448a4@pop.intergate.bc.ca> <3.0.3.32.19980417140316.00a7a100@alterdial.uu.net> Message-ID: <3.0.3.32.19980417171321.00a799e0@alterdial.uu.net> At 03:56 PM 4/17/98 -0400, Paul Prescod wrote: >> Take for example an Excel spreadsheet. > >You can modify it reliably if the DTD/schema is complete. If not, you >guess, just as with a partially documented API. Excel was perhaps a bad example, but in general I disagree because the DTD isn't enough to capture all of the various integrity constraints that may exist between elements representing components of a data structure. It may be possible to represent some of these constraints through content models, but certainly not all of them. If the schema was all you needed, then there wouldn't be much point to the OO concept of the "setter" method. You would just make all of your data members public, and tell everyone not to break the rules, however complicated those may be. 
In practice, this doesn't work, and there is a lot of value to be had
in using methods (an API) to ensure that the constraints are not
violated.

>Not in my experience. There are dozens of tools that I can download that
>work with RTF, Frame MIF or PDF, and a small handful that talk to the
>Word, FrameMaker or Adobe Acrobat APIs. Furthermore, the formats
>described above have multiple independent implementations. The APIs do
>not.

Certainly there will be a few important and widely used data models
around, especially for the representation of documents, and XML is
perfect for this. The semantics of the data model will be well
understood, enabling anyone with enough time on their hands to write
dozens of tools that operate reliably upon the data. To me, though,
this is still an API: my application isn't parsing the file, it's
poking at it through the "tool", which is where the semantics have been
encapsulated. If XML makes it easier for these eager tool hackers to do
their thing, then great, I'm all for it.

However, let's say I'm the vendor of some relatively esoteric thing,
and I need to design a file format to capture the state of my
application. Do I use XML? Sure, why not, but do I really gain anything
from this? The hackers with time on their hands are busy writing tools
to edit RTF and DOOM levels; they don't care about my nuclear power
plant application. Even if someone did decide to write a nifty utility
that operates on my files, if they get it wrong, then Cleveland starts
glowing, so I probably don't want their help anyway.

I'm not against XML. I think it's a great thing, and we should
encourage the vendors of major applications to support it, along with
the copious DTD documentation that will be necessary to do anything
useful with it. However, I think the notion that just storing my
application data in XML will automatically make it more useful to the
world is a bit presumptuous.

Jeff

xml-dev: A list for W3C XML Developers.
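[Editorial sketch, not part of the original message.] The "setter"
argument in the message above can be made concrete with a small Python
example. Everything in it is an assumption made for illustration: the
Range class, its two fields, and the rule that start must never exceed
end are invented here and come from neither Excel nor the thread. The
point is only that a DTD can declare that both values exist, but cannot
express a numeric relationship between them, whereas a setter can
refuse an update that would break it.

```python
class Range:
    """Hypothetical cell range whose end must never precede its start."""

    def __init__(self, start: int, end: int) -> None:
        if start > end:
            raise ValueError("start must not exceed end")
        self._start = start
        self._end = end

    @property
    def start(self) -> int:
        return self._start

    @start.setter
    def start(self, value: int) -> None:
        # The setter is the one place where the invariant is checked.
        if value > self._end:
            raise ValueError("start must not exceed end")
        self._start = value

    @property
    def end(self) -> int:
        return self._end

    @end.setter
    def end(self, value: int) -> None:
        if value < self._start:
            raise ValueError("end must not precede start")
        self._end = value


r = Range(1, 10)
r.end = 20            # accepted: 1 <= 20
try:
    r.start = 30      # rejected: would leave start > end
except ValueError as exc:
    print("rejected:", exc)
```

With public data members the second assignment would simply succeed,
and the broken state would only be noticed later, if at all.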
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sat Apr 18 01:49:50 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:32 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) In-Reply-To: <3.0.3.32.19980417171321.00a799e0@alterdial.uu.net> References: <3537B3EB.BEA7D347@technologist.com> <3.0.32.19980416182640.007448a4@pop.intergate.bc.ca> <3.0.3.32.19980417140316.00a7a100@alterdial.uu.net> Message-ID: <3.0.1.16.19980417234610.3dd77646@pop3.demon.co.uk> At 17:13 17/04/98 -0500, Jeff Larson wrote: [...] > >However, lets say I'm the vendor of some relatively esoteric thing, and I >need to design a file format to capture the state of my application. Do >I use XML? Sure why not, but do I really gain anything from this? The >hackers with time on their hands are busy writing tools to edit RTF and >DOOM levels, they don't care about my nuclear power plant application. >Even if someone did decide to write a nifty utility that operates on my >files, if they get it wrong, then Cleveland starts glowing, so I probably >don't want their help anyway. > This is actually a good example of where I think XML has a lot to offer. In designing complex systems it is a good idea to re-use well-tested components where possible. If, for example, your power station relies on mathematics, physics, chemistry, safety protocols, etc. then it will make sense to re-use those developed in a community-wide fashion. In an XML document it is straightforward to detect the namespaces used and to separate the components. I find it much easier to extract the separate information components from an XML file than (say) an RTF document. 
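[Editorial sketch, not part of the original message.] The claim that it
is straightforward to detect the namespaces used and to separate the
components can be illustrated roughly as follows. Note the assumptions:
the example uses the xmlns attribute syntax of the later Namespaces in
XML Recommendation rather than the draft syntax under discussion on the
list at the time, and the document, element names, and example.org URIs
are all made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document mixing three vocabularies.
DOC = """\
<plant xmlns="http://example.org/plant"
       xmlns:chem="http://example.org/chemistry"
       xmlns:safety="http://example.org/safety">
  <reactor id="r1">
    <chem:coolant>water</chem:coolant>
    <safety:protocol>scram-on-overheat</safety:protocol>
  </reactor>
  <chem:fuel>uranium oxide</chem:fuel>
</plant>
"""


def elements_by_namespace(xml_text):
    """Group element local names by the namespace URI they belong to."""
    groups = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        # ElementTree spells a qualified name as "{uri}local".
        if elem.tag.startswith("{"):
            uri, local = elem.tag[1:].split("}", 1)
        else:
            uri, local = "", elem.tag
        groups[uri].append(local)
    return dict(groups)


for uri, names in elements_by_namespace(DOC).items():
    print(uri, names)
```

Each vocabulary can then be handed to whatever component understands
it, without that component needing to know about the others.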
I don't usually contribute to general discussions on XML-DEV but I'd like to urge members to think in terms of re-usable components wherever possible. To what extent we can develop them here I don't know, but IMO it represents a major advance over the proprietary approach. XML documents are open in a way that many others aren't and are often much easier to dissect than objects and relational data. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sat Apr 18 01:55:56 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:32 2004 Subject: Inheritance in XML (was Re: Problems parsing XML) References: <3.0.32.19980416182640.007448a4@pop.intergate.bc.ca> <3.0.3.32.19980417140316.00a7a100@alterdial.uu.net> <3.0.3.32.19980417171321.00a799e0@alterdial.uu.net> Message-ID: <3537EC0F.D604C071@technologist.com> Jeff Larson wrote: > > Excel was perhaps a bad example, but in general I disagree because the DTD > isn't enough to capture all of the various integrity constraints that may > exist between elements representing components of a data structure. It > may be possible to represent some of these constraints through content > models, but certainly not all of them. That's right. That's why you also need more application-specific schemata. That's also why you need documentation, just as you do for an API. > If the schema was all you needed, then there wouldn't be much point > to the OO concept of the "setter" method. 
You would just make all of > your data members public, and tell everyone not to break the rules, however > complicated those may be. The issues are not analogous. First, Java and C++ do not have a concept of schema. Second, setter methods must control not just what a data object's state is before and after a transaction, but also the steps allowed to get from here to there. Schemata can only constrain documents, not processes. Setter methods can say that you can only set foo after you've set bar and so forth. > Certainly there will be a few important and widely used data models around, No, there will be *many* important and widely used data models around. That is why XML is so exciting. Look at http://www.w3.org/TR . There are new ones every week. > especially for the representation of documents, and XML is perfect for this. > The semantics of the data model will be well understood, enabling anyone > with enough time on their hands to write dozens of tools that operate reliably > upon the data. To me though, this is still an API, my application isn't > parsing the file, its poking at it through the "tool" which is where the > semantics have been encapsulated. I don't understand this point. If I use Python's "print" statement to directly write a CDF file or FrameMaker document, how am I going through an API? > However, lets say I'm the vendor of some relatively esoteric thing, and I > need to design a file format to capture the state of my application. Do > I use XML? Sure why not, but do I really gain anything from this? The > hackers with time on their hands are busy writing tools to edit RTF and > DOOM levels, they don't care about my nuclear power plant application. > Even if someone did decide to write a nifty utility that operates on my > files, if they get it wrong, then Cleveland starts glowing, so I probably > don't want their help anyway. 
Most of us don't write nuclear plant software, and I don't think that
nuclear plant software is designed to be "third-party extensible"
either through APIs or data formats. If we're talking about replacing
data formats, then we should talk about the most popular data formats:

 * Word docs/Excel spreadsheets (binary, hard to work with, hard to
   parse),
 * HTML (underpowered, inflexible),
 * configuration files (different on every platform, too often ad hoc),
 * page description languages (hard to validate, poorly specified)

And even formats that cannot realistically be replaced by XML can be
enhanced by it:

 * source code (literate structured programming)
 * zip files (XML manifests, directories, etc.)

> I'm not against XML, I think its a great thing, and we should encourage
> the vendors of major applications to support it, along with the copious
> DTD documentation that will be necessary to do anything useful with it.
> However, I think the notion that just storing my application data in
> XML will automatically make it more useful to the world is a bit presumptuous.

I think you are attacking a straw person. Certainly nobody
knowledgeable would claim that XML automatically improves anything.
Like every technology it must be applied to the right set of problems.
On average, though, storing data in XML rather than whatever you would
have invented ad hoc will make that data easier to work with. It's
analogous to the situation of Unicode/UTF-8 giving people a character
encoding to build on rather than having them invent their own.
Sometimes it will still make sense to invent your own character or data
encoding. But more often, it will be easier for everybody to just use
the standard.

 Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Journalism is good if you follow the rules. Don't allow the human
rights groups to spoil your profession"
	- Col. Godwin Ugbo of the Nigerian military dictatorship

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From kent at trl.ibm.co.jp Sat Apr 18 02:11:43 1998 From: kent at trl.ibm.co.jp (TAMURA Kent) Date: Mon Jun 7 17:00:32 2004 Subject: Update IBM XML for Java Message-ID: <199804180010.JAA23633@ns.trl.ibm.com> XML for Java, XML processor in Java, has been updated. http://www.alphaworks.ibm.com/formula/xml Changes: 06-Feb-1998 to 16-Apr-1998 Updated DOM spec. Updated Namespace spec. Document#print() adds no extra white-spaces. Added new class: com.ibm.xml.parser.Format Added TXElement#getNthElementByTagName() Renamed StartTagHandler to TagHandler and added handleEndTag() method. Added StreamProducer#closeInputStream() Element Digest with MD5 Added new package: com.ibm.xml.xpointer Added new class: com.ibm.xml.parser.StylesheetPI And fixed many bugs -- TAMURA, Kent @ IBM Japan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Apr 18 03:00:12 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:32 2004 Subject: SAX: Byte Streams and Character Streams Message-ID: <199804180058.UAA00420@unready.microstar.com> James Clark has recently raised the issue of byte streams and character streams, and I think that we need to give this a thorough discussion before my self-imposed deadline of next Tuesday. 
SAX is based on two general principles, sometimes violated:

1) SAX shall provide the minimum required information in the simplest
   form possible.

2) SAX shall impose as few constraints as possible on the architecture
   of XML parser implementations.

One possible violation of both principles is my recent suggestion
(implemented in yesterday's pre-release) that SAX support stream-based
parsing only from character streams and not from byte streams -- the
problem is that James's XP parser works directly from undecoded byte
streams for the sake of speed, and this decision requires XP to
reencode the character stream before parsing it.

There are three possible parsing situations:

a) the application provides the parser with a URL pointing to an XML
   entity;

b) the application has access to characters (perhaps in a buffer, or
   from a database), and it provides them to the parser in a character
   stream; and

c) the application has access to raw bytes (perhaps from a file or a
   URL connection), and it decodes them and provides them to the parser
   in a character stream.

In the first two situations, the absence of a byte stream creates no
inefficiencies -- in (a), the XML parser can read a raw byte stream
itself, and in (b), either the application or XP will have to encode
the characters into bytes. The only inefficiency is in (c), where the
application will decode the byte stream only to have it re-encoded into
bytes by XP, and this is an inefficiency only if the SAX parser happens
to work directly from raw bytes without decoding them first.

I'd like to hear from XML application writers: which of the above
situations do you find most typical? Is (c) a common situation for
you? Is it common enough that it is worth enlarging SAX and
complicating the EntityResolver interface?

Thanks, and all the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
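[Editorial sketch, not part of the original message.] Situation (c) in
the message above can be pictured as a decode/re-encode round trip. The
ByteOrientedParser class below is a hypothetical stand-in that merely
counts "<" bytes; it is not XP and does not reflect any real parser's
API. The choice of UTF-8 as the internal encoding is likewise an
assumption. The sketch only shows where the extra work appears when a
byte-oriented parser is handed characters.

```python
import io


class ByteOrientedParser:
    """Hypothetical parser that scans raw bytes (counts '<' only)."""

    def __init__(self):
        self.reencoded_chars = 0  # characters that had to be re-encoded

    def parse_bytes(self, stream):
        # Works directly on undecoded bytes.
        return stream.read().count(b"<")

    def parse_chars(self, stream):
        # A character stream must be turned back into bytes first.
        text = stream.read()
        self.reencoded_chars += len(text)
        return self.parse_bytes(io.BytesIO(text.encode("utf-8")))


raw = "<doc><p>caf\u00e9</p></doc>".encode("utf-8")

# Byte stream handed over directly: nothing is decoded or re-encoded.
direct = ByteOrientedParser()
direct_tags = direct.parse_bytes(io.BytesIO(raw))

# Situation (c): the application decodes, then the parser re-encodes.
roundtrip = ByteOrientedParser()
roundtrip_tags = roundtrip.parse_chars(io.StringIO(raw.decode("utf-8")))

print(direct_tags == roundtrip_tags)
print(direct.reencoded_chars, roundtrip.reencoded_chars)
```

Both paths see the same markup; the second simply pays for one decode
in the application and one encode in the parser along the way.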
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Apr 18 03:02:48 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:32 2004 Subject: SAX: Implied Attributes Message-ID: <199804180100.VAA00424@unready.microstar.com> Should SAX level 1 allow parsers to report implied attributes by providing null for AttributeList.getValue() (it does not currently do so)? The only application I can think of right now is architectural forms, where an implied attribute can prevent automatic derivation. Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Apr 18 03:15:05 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:32 2004 Subject: Finishing SAX Message-ID: <199804180113.VAA00482@unready.microstar.com> [I don't think that my first two attempts made it through, so here's a third.] 
It's time to finish SAX level 1: many people (both parser and application writers) have been waiting patiently, and I think that we're probably well past the 80 part of the 80/20 rule: no matter what we decide, SAX will be less well suited for some applications and parsers than for others, and there will certainly be smug comments in the future about how we got obvious things wrong (the kind of comments that I, in moments of weakness, have been heard to make about other people's APIs). I had originally planned SAX as two tiny interfaces occupying 1 or 2 kilobytes, with extremely limited functionality. What we've ended up with is the collective design of the XML membership, which is considerably larger and more complex than I had originally planned (though sax.jar file is still only 8,174 bytes long), but also much more functional and elegant -- it's not what I wanted, but I have to confess that I like it quite a bit. I'd like to suggest that we allow a few more days for discussion, then simply stop at the end of the day next Tuesday (23 April) and give me the rest of the week (and possibly the weekend) to put together the final, official SAX level-one release. If you have issues, speak now, or forever hold your peace (at least when I'm in the room). In a separate message, I'll revisit the issue of byte and character streams. As soon as we have this out of the way we can start talking about SAX level 2, which can support non-structural document events (like comments and CDATA sections), together with much more DTD information -- I already have some draft interfaces sketched out. Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mamster at webeasy.com Sat Apr 18 04:30:50 1998 From: mamster at webeasy.com (Michael Amster) Date: Mon Jun 7 17:00:32 2004 Subject: Nesting XML based languages and scripting languages Message-ID: Hi: I am new to the world of XML and have been following the development of SAX, DOM and some of the interfaces with great interest. We are looking to change the language of our server side application to an XML based language from a proprietary language today. The application is similar to ColdFusion in that it intersperses commands, queries and control flow with HTML for output to a client. We have looked at XSL and think it is too limited to HTML as an output - we'd like to work with any XML based language for output on the server side. Our lack of separation between code and presentation is not elegant, but it is easy to use and is widely accepted (i.e. ColdFusion) My questions are: 1. Can we use a DTD to intersperse our language (WEASEL) with any arbitrary XML based language in PCDATA sections. We feel that having a DTD for the language would really help to allow authoring with an XML authoring environment, but because we wish to work with any arbitrary XML language, we are not sure of how to create a DTD that allows this. For example: i=0 i < 10 i = i + 1 This is loop #

2. How is HTML 4.0 following the well formed constraint for Java/ECMAScript when < and & are not allowed in Attribute values (currently the onmouseover, onclick and other events are allowed in the attribute value. Furthermore, how will this be handled in the element? Same problem by my understanding of the DTD? Any advice or direction on how these problems are handled would be appreciated. The application in question is (http://www.webeasy.com/products/weasel.htm). -MA ~-~-~-~-~-~-~-~-~-~-~-~-~-~-WEBEASY-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ Michael Amster mamster@webeasy.com 4676 Admiralty Way, Suite 300 Tel: 310.576.0770 Marina Del Rey, CA 90292 Fax: 310.576.2011 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sat Apr 18 04:50:03 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:00:32 2004 Subject: SAX: Byte Streams and Character Streams References: <199804180058.UAA00420@unready.microstar.com> Message-ID: <35380FDA.C0B1CEB1@jclark.com> Let's not forget other languages, in particular C and C++. In C terms a character stream would be a stream of wchar_t's, and a byte stream would be stream of char's. It's very common to pass information around internally in char's (ie UTF-8 encoded) rather than in a stream of wchar_t's (ie UTF-16 encoded): for example, expat which is being used both in Netscape 5 and in Perl passes data to the application in UTF-8 as a sequence of bytes not as a sequence of wchar_t's. Supporting byte streams only in the C/C++ world causes no inefficiency: if you have the data as an array of wchar_t's, you can simply cast your wchar_t* to a char* and you get an array of UTF-16 encoded bytes. 
Byte streams give you all you need in the C/C++ world.

James

From jjc at jclark.com Sat Apr 18 04:50:51 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:00:32 2004
Subject: SAX: New Idea for Entity Resolution
References: <199804151621.MAA02638@unready.microstar.com> <3536AFCD.54B7E396@jclark.com> <199804171143.HAA00485@unready.microstar.com>
Message-ID: <353811A1.C4169A40@jclark.com>

David Megginson wrote:

> * Both Byte and Character streams
>
>   Pro: - keeps everyone happy
>
>   Con: - requires more interfaces
>        - requires another method in the Parser interface
>        - requires a new SAX class encapsulating a ByteStream and its
>          recommended encoding (or perhaps the ByteStream interface
>          will have a getEncoding() method)
>        - will greatly complicate the EntityResolver mechanism (the
>          application will need to be able to return a byte stream _or_
>          a character stream -- how could I handle this?)

You could just have a class that encapsulates a structure with three
members:

- a CharacterStream
- a ByteStream
- a String

At least one of the CharacterStream and ByteStream must be non-null.
If the ByteStream is non-null the String can specify the encoding.

James

xml-dev: A list for W3C XML Developers.
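[Editorial sketch, not part of the original message.] The three-member
structure suggested above might look roughly like this in Python. The
class name EntitySource, the field names, the read_text() helper, and
the UTF-8 fallback are all assumptions added for illustration; the
message itself specifies only the three members and the two rules about
them.

```python
import io
from dataclasses import dataclass
from typing import BinaryIO, Optional, TextIO


@dataclass
class EntitySource:
    """Hypothetical holder for the three members suggested above."""

    character_stream: Optional[TextIO] = None
    byte_stream: Optional[BinaryIO] = None
    encoding: Optional[str] = None  # only meaningful with a byte stream

    def __post_init__(self):
        # At least one of the two streams must be supplied.
        if self.character_stream is None and self.byte_stream is None:
            raise ValueError("need a character stream or a byte stream")

    def read_text(self, default_encoding="utf-8"):
        """Return the entity as characters, decoding bytes if needed.

        The UTF-8 default is an assumption of this sketch.
        """
        if self.character_stream is not None:
            return self.character_stream.read()
        data = self.byte_stream.read()
        return data.decode(self.encoding or default_encoding)


from_chars = EntitySource(character_stream=io.StringIO("<a/>"))
from_bytes = EntitySource(byte_stream=io.BytesIO("<a/>".encode("utf-16")),
                          encoding="utf-16")
print(from_chars.read_text() == from_bytes.read_text())
```

A resolver can then return one object type regardless of which kind of
stream the application has, and a byte-oriented parser can take the
byte stream directly whenever one is present.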
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From mtbryan at sgml.u-net.com Sat Apr 18 09:51:11 1998
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun 7 17:00:32 2004
Subject: Inheritance in XML (was Re: Problems parsing XML)
Message-ID: <01bd6a9e$92ae10a0$2b8577c2@sgml.u-net.com>

Michael Kay wrote:

>I know some people will disagree, but the way I use XML, a DTD is a
>schema, an element definition in a DTD is a class, a document is a
>database, and an element within a document is an instance of a class.
>What is missing is that we can't define one class (element type) as a
>subtype of another.

In SGML you can use exclusions to make an element a true subclass of
another: providing a, b and c are optional components within the model
for Y.

Unfortunately XML dropped this useful option from the set of SGML
facilities it inherited.

Martin Bryan

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From peter at ursus.demon.co.uk Sat Apr 18 12:35:45 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:33 2004
Subject: Finishing SAX
In-Reply-To: <199804180113.VAA00482@unready.microstar.com>
Message-ID: <3.0.1.16.19980418072956.209f5dd2@pop3.demon.co.uk>

At 21:13 17/04/98 -0400, David Megginson wrote:

Firstly we owe an enormous debt to David for his effort which was *far*
more than I imagined (and I'm sure more than he imagined as well). Past
history had shown that good intentions on XML-DEV didn't always lead to
finished robustness, and we'd also been down this road before. I think
that his self-imposed deadlines have been extremely useful, as has the
discipline of the process.

[...]

>It's time to finish SAX level 1: many people (both parser and
>application writers) have been waiting patiently, and I think that
>we're probably well past the 80 part of the 80/20 rule: no matter what
>we decide, SAX will be less well suited for some applications and

I think this is absolutely right. It's right to finish now. One very
important message is that *to get the interoperability that we want in
XML* we have to work very hard at the basics. Some of the 'simpler'
issues have turned out to be quite complex. However it's also clear
that it is of enormous benefit - without SAX I would have wasted more
time than the effort I have put into helping David and the process. And
this is surely true also for other developers.
>parsers than for others, and there will certainly be smug comments in >the future about how we got obvious things wrong (the kind of comments >that I, in moments of weakness, have been heard to make about other >people's APIs). I don't think there will be smug comments, especially since the process has been open and the community 'owns' the result, any more than there are smug comments about how 'XML got it wrong'. The balance is between technical issues and people's ability to use the result effectively. It could be valuable to present the *process* in the final version since I believe it's as good as can be achieved by this - and perhaps any - process. > >I had originally planned SAX as two tiny interfaces occupying 1 or 2 >kilobytes, with extremely limited functionality. What we've ended up >with is the collective design of the XML membership, which is >considerably larger and more complex than I had originally planned >(though sax.jar file is still only 8,174 bytes long), but also much >more functional and elegant -- it's not what I wanted, but I have to >confess that I like it quite a bit. I'd agree with this analysis. From an application programmer's point of view the overall interface has a lot of functionality and to understand it all involves a number of distinct issues, the latest being character management. This is part of the learning and investment process - the good news is that by learning SAX an implementer will learn a great deal about XML systems in general - exceptions, streams, components of an XML document, etc. A key resource - which David has provided, but which may be worth commenting on further - is the provision of 'default' or introductory implementations. An analogy with the Java SwingSet may be useful. This has *zillions* of packages, with relatively little documentation and examples for some of them. However there are special classes for 'beginners', such as DefaultMutableTreeNode. 
[This is one of the main classes that I use to build an interface to SAX] This provides 'almost everything' that the newcomer needs to get off the ground very fast. In a similar way, SAXDemo (if it's called that still) is *the* place to start until you need special functionality. So documentation and guidance for newcomers are critical - and I hope to address some of these in JUMBO2 when I get the final release of SAX.

>
>I'd like to suggest that we allow a few more days for discussion, then
>simply stop at the end of the day next Tuesday (23 April) and give me
>the rest of the week (and possibly the weekend) to put together the
>final, official SAX level-one release. If you have issues, speak now,
>or forever hold your peace (at least when I'm in the room). In a
>separate message, I'll revisit the issue of byte and character
>streams.

I have kept quiet on issues such as character streams and error handling, trusting in the communal judgment of XML-DEV to get this 'right'. It will be important to give a road map of the interface and - where possible - to identify those components which can be re-used outside SAX. I assume that the current discussion will have been useful to those considering the DOM API and how it can be implemented.

There is also a role for library routines at this level. For example 'makeAbsoluteURL()' is useful elsewhere and could reasonably be highlighted in the SAX distribution. [This is not strictly an API matter, but would bring benefits.] In the same way generic tools for parser implementers such as Name validation would be useful, and it might be useful to compile a list of sax.Util in the distribution.

>
>As soon as we have this out of the way we can start talking about SAX
>level 2, which can support non-structural document events (like
>comments and CDATA sections), together with much more DTD information
>-- I already have some draft interfaces sketched out.

Wow!

P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sblackbu at erols.com Sat Apr 18 15:15:11 1998 From: sblackbu at erols.com (Samuel R. Blackburn) Date: Mon Jun 7 17:00:33 2004 Subject: Attribute question Message-ID: <002401bd6acc$037dca50$fc82accf@sammy> I'm getting ready to release the next version of my freeware class library (which includes some XML capability) and have reached a stumbling block. What does ^ mean (in rule [10])? Is this a valid attribute: Should the value of the "Name" attribute be "Sam's Bagel-O-Rama"? TIA, Sam Blackburn http://ourworld.compuserve.com/homepages/sam_blackburn/wfc.htm xml-dev: A list for W3C XML Developers. 
From ak117 at freenet.carleton.ca Sat Apr 18 16:19:42 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:33 2004
Subject: SAX: New Idea for Entity Resolution
In-Reply-To: <353811A1.C4169A40@jclark.com>
References: <199804151621.MAA02638@unready.microstar.com> <3536AFCD.54B7E396@jclark.com> <199804171143.HAA00485@unready.microstar.com> <353811A1.C4169A40@jclark.com>
Message-ID: <199804181417.KAA00374@unready.microstar.com>

James Clark writes:

 > You could just have a class that encapsulates a structure with three
 > members:
 >
 > - a CharacterStream
 > - a ByteStream
 > - a String
 >
 > At least one of the CharacterStream and ByteStream must be non-null. If
 > the ByteStream is non-null the String can specify the encoding.

[Read on to the bottom for a large-ish design change.]

This implies, then, the following three interfaces:

  public interface ByteStream {
    public abstract int read ()
      throws SAXException;
    public abstract int read (byte b[], int start, int count)
      throws SAXException;
  }

  public interface CharacterStream {
    public abstract int read ()
      throws SAXException;
    public abstract int read (char ch[], int start, int count)
      throws SAXException;
  }

  public class InputSource {
    // For each variable, imagine a get/set pair instead...
    public ByteStream byteStream;
    public CharacterStream characterStream;
    public String encoding;
  }

The nice thing here is that all of these can live on separate systems in a distributed environment: the InputSource can be a C-program on a VAX, the CharacterStream can come from a Python program running under alpha Linux, and the parser can be running in Java on a Windows box.
There is no dependency on language- or system-specific features (except for java.lang.String, which should be able to map predictably to other languages).

Now, why not take this a step further?

  public class InputSource {
    // For each variable, imagine a get/set pair instead...
    public String publicId;
    public String systemId;
    public ByteStream byteStream;
    public CharacterStream characterStream;
    public String encoding;
  }

We'd have to define rules of precedence:

1) if there is a character stream, use it;

2) if there is no character stream but there is a byte stream, use the byte stream;

3) if there is neither a character stream nor a byte stream but there is a system identifier, open a connection to the system identifier;

4) if there is no character stream, byte stream, or system identifier, throw an exception (or invoke the ErrorHandler).

Now, we can get away with only one parse() method in org.xml.sax.Parser:

  public abstract void parse (InputSource source)
    throws Exception;

It might still be useful to keep two separate methods in EntityResolver, though:

  public interface EntityResolver {
    public String resolveSystemId (String publicId, String systemId)
      throws SAXException;
    public InputSource openEntity (String systemId)
      throws Exception;
  }

Comments?

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
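As a rough illustration of the four precedence rules proposed above, a minimal Java sketch might look like the following. It is not the SAX code itself: the class name SketchInputSource, the Choice enum and the choose() method are hypothetical names invented for this illustration, and java.io.Reader and java.io.InputStream merely stand in for the proposed CharacterStream and ByteStream interfaces.

```java
import java.io.InputStream;
import java.io.Reader;

// Hypothetical stand-in for the proposed InputSource; not the SAX class itself.
final class SketchInputSource {
    String publicId;
    String systemId;
    InputStream byteStream;   // stands in for the proposed ByteStream
    Reader characterStream;   // stands in for the proposed CharacterStream
    String encoding;

    // The source a parser would read from under the four rules.
    enum Choice { CHARACTER_STREAM, BYTE_STREAM, SYSTEM_ID }

    Choice choose() {
        if (characterStream != null) {
            return Choice.CHARACTER_STREAM;   // rule 1
        }
        if (byteStream != null) {
            return Choice.BYTE_STREAM;        // rule 2
        }
        if (systemId != null) {
            return Choice.SYSTEM_ID;          // rule 3
        }
        // rule 4: nothing usable was supplied
        throw new IllegalStateException(
            "no character stream, byte stream, or system identifier");
    }
}
```

With this shape, a caller that only has a URL sets systemId and leaves the streams null, while a caller that has already decoded the input sets characterStream and the other fields are ignored.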
From papresco at technologist.com Sat Apr 18 16:31:27 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:33 2004
Subject: Inheritance in XML (was Re: Problems parsing XML)
References: <01bd6a9e$92ae10a0$2b8577c2@sgml.u-net.com>
Message-ID: <3538B954.EA655ED7@technologist.com>

Martin Bryan wrote:
>
>
> In SGML you can use exclusions to make an element a true subclass of
> another:
>
>
>
> providing a, b and c are optional components within the model for Y.

Element X is not a true subclass or subtype. Given a content model:

You cannot use an X. What you've done above is make an element whose content model is more restrictive than some other content model. You can also do that without exclusions. I don't think I've ever used exclusions in that way. One big problem is that the exclusion doesn't just change X's own content model, but the content model of all of X's children. You don't want that if all you need is content model subsetting.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Journalism is good if you follow the rules. Don't allow the human rights groups to spoil your profession" - Col. Godwin Ugbo of the Nigerian military dictatorship

xml-dev: A list for W3C XML Developers.
From papresco at technologist.com Sat Apr 18 16:37:16 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:33 2004
Subject: Attribute question
References: <002401bd6acc$037dca50$fc82accf@sammy>
Message-ID: <3538BAA4.157613E8@technologist.com>

The ^ symbol means exclude characters in this set. So [^"] means anything other than a ".

> Is this a valid attribute:
>
>

No. The ^ is not an escape character. It has no special meaning in XML documents *at all*. It is just used in the notation of the XML specification itself. Production 10 says that you can't include single quote chars in attribute values delimited by single quote characters unless you encode them with entities.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Journalism is good if you follow the rules. Don't allow the human rights groups to spoil your profession" - Col. Godwin Ugbo of the Nigerian military dictatorship

xml-dev: A list for W3C XML Developers.
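To make the quoting rule concrete, here is a small Java sketch of how a writer might serialise an attribute value under either delimiter. The class AttrQuote and its two methods are hypothetical helpers invented for this illustration, not part of any XML library. The value "Sam's Bagel-O-Rama" is taken from the original question, and &apos; and &quot; are the predefined XML entities for the two quote characters.

```java
// Hypothetical helper illustrating the quoting rule; not part of any XML library.
final class AttrQuote {
    // Single-quoted form: a literal ' inside the value must be written as &apos;.
    static String singleQuoted(String name, String value) {
        String v = value.replace("&", "&amp;")
                        .replace("<", "&lt;")
                        .replace("'", "&apos;");
        return name + "='" + v + "'";
    }

    // Double-quoted form: a literal " inside the value must be written as &quot;.
    static String doubleQuoted(String name, String value) {
        String v = value.replace("&", "&amp;")
                        .replace("<", "&lt;")
                        .replace("\"", "&quot;");
        return name + "=\"" + v + "\"";
    }
}
```

In the double-quoted form the apostrophe needs no escaping at all, which is usually the simpler choice for a value such as this one.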
From robin at ACADCOMP.SIL.ORG Sat Apr 18 19:11:53 1998
From: robin at ACADCOMP.SIL.ORG (Robin Cover)
Date: Mon Jun 7 17:00:33 2004
Subject: Inheritance in XML
Message-ID: <199804181718.MAA24752@ACADCOMP.SIL.ORG>

> Re: Subject: Re: Inheritance in XML (was Re: Problems parsing XML)
> Date: Sat, 18 Apr 1998 08:49:07 +0100
> Reply-To: "Martin Bryan"

>>What is missing is that we can't define one class (element type) as a
>>subtype of another.

> In SGML you can use exclusions to make an element a true subclass of
> another:
>
>
>
> providing a, b and c are optional components within the model for Y.
> Unfortunately XML dropped this useful option from the set of SGML facilities
> it in inherited
>
> Martin Bryan

Martin,

I wish I could believe this were true and useful. It seems that we confront here one of the several troublesome mismatches between OO database modeling and SGML/XML markup, with respect to the simple analogy:

  OODB            SGML/XML Markup
  class defn      element declaration
  class name      element type
  object          element
  attribute       attribute

If we accept this crude analogy, and accept SGML's notion of an "attribute" as a name-value pair, then the hope of creating subclasses through SGML/XML element declarations appears slim. Appears "to me" I should say: I would welcome comments from the experts.

For starters, subclassing normally would mean further specialization by the addition (possibly 'plus subtraction') of properties, viz., of attributes. Formally, then, an SGML element declaration can't do the work: it would need to be an ATTLIST declaration.
But then we face the problem that you can't model a complex attribute with the SGML 'attribute' anyway (if you want any validation): the "value" in '(name-)value' is a flat/string in SGML, at least in the literal sense. Of course, one can (and we all do) model "real" attributes using SGML elements -- since we have no realistic alternative -- but that creates other problems for the notion of using SGML element decls as a subclassing mechanism. One such problem is that (real) attributes are unordered. The straightforward way to model an object/element with (some optional, some required) attributes a, b, c, d, e, and f would seem to be: (a* & b? & c? & d & e?), but SGML/XML notions of prescribing order in the serialization are fairly strong, and XML won't even allow the use of the AND connector to indicate what I plainly mean in this sample assertion. (Perhaps Steph Tryphonas has written a program by now to convert all content models using AND to use only OR, without sacrificing any integrity constraints on occurrence and sequence). In any case, the impulse toward serialization in SGML -- at least in practice, given tools that force end users to reckon with (arbitrary non-intuitive) "order" based upon sequence rules in content models -- tends to work against the easy use of SGML elements to model attributes. Even apart from these mismatches between "object" modelling and SGML/XML encoding, I question whether " " creates a useful "true subclass." Why would one want to create a subclass based upon the subtraction of optional "attributes" (subelements)? I think that would make it a superclass in many OO systems. In this connection, one might be inclined to argue that the treatment of "content" as a special attribute is unfortunate, at least from the perspective of data modelling, where "part-whole" has no quintessential role vis-a-vis "is-a" or "has-a" or "kind-of" or "points-to"... 
At which point, others would quickly point out that they think it's specious to be talking about object modeling in terms of SGML-based markup languages anyway, since "these languages can neither formally express nor enforce semantic integrity constraints which are so critical to good object modelling..." I think this all leads me in the direction of favoring the efforts at defining other schema languages (beyond SGML/XML DTD syntax), granting that the validation of instances against their schemas, if/when critical, will need to be done outside the framework of the SGML/XML "parser/processor" as defined. I have little doubt that someone as brilliant as Eliot can show how the desired objectives might be met through architecture processing by an appropriate architecture engine; I don't know whether this is the "best" path in all cases, or whether SGML/XML users will want to deal with all the layers of indirection that architectures seem to want. I hope that experts with some years of experience in OO systems will contribute their insights to the new "schema" projects. -rcc ------------------------------------------------------------------------- Robin Cover Email: robin@acadcomp.sil.org 6634 Sarah Drive Dallas, TX 75236 USA >>> The SGML/XML Web Page <<< Tel: +1 (972) 296-1783 (h) http://www.sil.org/sgml/sgml.html Tel: +1 (972) 708-7346 (w) FAX: +1 (972) 708-7380 ========================================================================= xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sat Apr 18 21:36:20 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:33 2004 Subject: SAX: String Internalisation and a CORBA/DCOM Question Message-ID: <199804181934.PAA00309@unready.microstar.com> Here's another last-minute SAX question: should org.xml.sax.Parser expose a method for internalising strings? public abstract String intern (String s); Most Java-based parsers, at least, already use some type of internalisation (but not, usually, the inefficient java.lang.String.intern() method) for names -- the SAX driver could expose this functionality if support is already there, or do its own internalising if support is absent. As someone has already pointed out, internalised strings will make a dramatic difference for the speed of applications, since applications can use a simple '==' operator (or the local equivalent) to test for equality rather than a slow subroutine like java.lang.String.equals(). My only concern has to do with distributed environments: is it possible to use internalisation with CORBA or DCOM? In other words, is there a way to guarantee that an object broker returns what turns out to be the same object/pointer during different calls? Help or advice will be gratefully accepted. 
By the way, here's the minimum list of what should be internalised in the callbacks from the SAX parser:

- element type names in DocumentHandler.startElement and DocumentHandler.endElement
- attribute names in AttributeList.getName()
- attribute types in AttributeList.getType() (both variants)

There are other candidates, such as tokenised attribute values, PI targets, and notation and entity names. How large should the list be?

Thanks, and all the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ak117 at freenet.carleton.ca Sat Apr 18 21:38:27 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:33 2004
Subject: SAX: XML Declaration
Message-ID: <199804181936.PAA00323@unready.microstar.com>

Here's another last-minute SAX question.

I have avoided providing information about the XML declaration, considering it strictly a lexical matter; however, since we are allowing applications to do their own entity resolution, should they have access to the value of the 'standalone' pseudo-attribute (if specified)?

I'll look forward to hearing opinions.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
From ak117 at freenet.carleton.ca Sun Apr 19 04:34:25 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:33 2004
Subject: SAX: 1998-04-18 pre-release
Message-ID: <199804190232.WAA04880@unready.microstar.com>

One of the last pre-releases of the Java reference version of SAX is available from the following URLs:

Core Distribution (including ROADMAP.txt):
  http://www.microstar.com/XML/SAX/New/saxjava-19980418.zip

Drivers for Lark and MSXML:
  http://www.microstar.com/XML/SAX/New/saxdrivers-19980418.zip

Simple Demos:
  http://www.microstar.com/XML/SAX/New/saxdemos-19980418.zip

Changes from 1998-04-16 to 1998-04-18:

- added InputSource class to hold a public id, system id, byte stream, and/or character stream
- added a ByteStream interface, similar to CharacterStream
- modified methods in Parser and EntityResolver to support the new InputSource
- added Java-specific helpers ByteStreamAdapter and InputStreamAdapter
- modified sample Lark and MSXML drivers so that they can support byte streams and character streams (though not very efficiently)

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sun Apr 19 07:45:27 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:00:33 2004 Subject: SAX: XML Declaration References: <199804181936.PAA00323@unready.microstar.com> Message-ID: <35398221.352A5117@jclark.com> David Megginson wrote: > > Here's another last-minute SAX question. > > I have avoided providing information about the XML declaration, > considering it strictly a lexical matter; however, since we are > allowing applications to do their own entity resolution, should they > have access to the value of the 'standalone' pseudo-attribute (if > specified)? Definitely not. In my view the whole idea of having resolveEntity return null to prevent inclusion of entities is a bad one. There needs to be better control over entity inclusion in SAX, but this is not an effective way to provide it. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sun Apr 19 07:45:52 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:00:33 2004 Subject: SAX: String Internalisation and a CORBA/DCOM Question References: <199804181934.PAA00309@unready.microstar.com> Message-ID: <35398B7C.D84EC87D@jclark.com> David Megginson wrote: > > Here's another last-minute SAX question: should org.xml.sax.Parser > expose a method for internalising strings? 
>
>   public abstract String intern (String s);

Absolutely not.

> Most Java-based parsers, at least, already use some type of
> internalisation (but not, usually, the inefficient
> java.lang.String.intern() method) for names -- the SAX driver could
> expose this functionality if support is already there, or do its own
> internalising if support is absent.

That would be a significant performance hit on SAX use with parsers that don't do internalisation. XP does not do this sort of internalisation because it would make it slower.

> As someone has already pointed out, internalised strings will make a
> dramatic difference for the speed of applications, since applications
> can use a simple '==' operator (or the local equivalent) to test for
> equality rather than a slow subroutine like java.lang.String.equals().

Doing lots of comparisons on the type of each element, whether using equals or ==, is not a good way to write an efficient application. It's typically better to have a hash-table that associates each element type with either an integer (which you can then use in a switch statement) or an object (which you then make a method call on).

This could be done a little more efficiently with help from the parser. For example, you could have a method on SAXParser

  setElementTypeUserData(String elementType, Object userData);

Then startElement() and endElement() in SAXDocumentHandler could have an additional Object userData argument. This would allow apps to do something like:

  void startElement(String name, Object userData, SAXAttributeList atts) {
    switch (((Integer)userData).intValue()) {
    ...
    }
  }

or

  void startElement(String name, Object userData, SAXAttributeList atts) {
    ((ElementHandler)userData).start();
  }

I don't think it's worth the complexity.

> By the way, here's the minimum list of what should be internalised in
> the callbacks from the SAX parser:

SAX should not require the internalization of anything.

James

xml-dev: A list for W3C XML Developers.
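The hash-table approach described above can be done entirely on the application side, without any change to the parser. The following is a minimal Java sketch under that assumption. The ElementHandler name is borrowed from the message, but its shape here, the DispatchingHandler class and the element names "title" and "para" are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-element callback; the name is borrowed from the message above.
interface ElementHandler {
    void start();
}

// Hypothetical application-side dispatcher; it is not a SAX interface.
final class DispatchingHandler {
    private final Map<String, ElementHandler> handlers = new HashMap<>();
    final StringBuilder log = new StringBuilder();

    DispatchingHandler() {
        // Register one handler object per element type of interest.
        handlers.put("title", () -> log.append("[title]"));
        handlers.put("para", () -> log.append("[para]"));
    }

    // Would be called from the application's startElement callback.
    void startElement(String name) {
        // One hash lookup replaces a chain of equals() or == tests.
        ElementHandler h = handlers.get(name);
        if (h != null) {
            h.start();
        }
    }
}
```

Element types with no registered handler are simply ignored, so the cost per element stays at one lookup however many element types the application knows about.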
From jjc at jclark.com Sun Apr 19 07:48:13 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:00:33 2004
Subject: SAX: New Idea for Entity Resolution
References: <199804151621.MAA02638@unready.microstar.com> <3536AFCD.54B7E396@jclark.com> <199804171143.HAA00485@unready.microstar.com> <353811A1.C4169A40@jclark.com> <199804181417.KAA00374@unready.microstar.com>
Message-ID: <35398CE4.6183B8A6@jclark.com>

David Megginson wrote:
>
> James Clark writes:
>
> > You could just have a class that encapsulates a structure with three
> > members:
> >
> > - a CharacterStream
> > - a ByteStream
> > - a String
> >
> > At least one of the CharacterStream and ByteStream must be non-null. If
> > the ByteStream is non-null the String can specify the encoding.
>
> [Read on to the bottom for a large-ish design change.]
>
> This implies, then, the following three interfaces:
>
>   public interface ByteStream {
>     public abstract int read ()
>       throws SAXException;
>     public abstract int read (byte b[], int start, int count)
>       throws SAXException;
>   }
>
>   public interface CharacterStream {
>     public abstract int read ()
>       throws SAXException;
>     public abstract int read (char ch[], int start, int count)
>       throws SAXException;
>   }

Why are the single character read calls there? They unnecessarily complicate the interface.

>   public class InputSource {
>     // For each variable, imagine a get/set pair instead...
> public ByteStream byteStream; > public CharacterStream characterStream; > public String encoding; > } > > The nice thing here is that all of these can live on separate systems > in a distributed environment: the InputSource can be a C-program on a > VAX, the CharacterStream can come a Python program running under alpha > Linux, and the parser can be running in Java on a Windows box. There > is no dependency on language- or system-specific features (except for > java.lang.String, which should be able to map predictably to other > languages). > > Now, why not take this a step further? > > public class InputSource { > // For each variable, imagine a get/set pair instead... > public String publicId; > public String systemId; > public ByteStream byteStream; > public CharacterStream characterStream; > public String encoding; > } > > We'd have to define rules of precedence: > > 1) if there is a character stream, use it; > > 2) if there is no character stream but there is a byte stream, use the > byte stream; > > 3) if there is neither a character stream nor a byte stream but there > is a system identifier, open a connection to the system identifier; > > 4) if there is no character stream, byte stream, or system identifier, > throw an exception (or invoke the ErrorHandler). > > Now, we can get away with only one parse() method in > org.xml.sax.Parser: > > public abstract void parse (InputSource source) > throws Exception; I don't think this is a good idea: it makes SAX harder to use in the simple case of reading from a URL. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sun Apr 19 09:19:25 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:00:33 2004 Subject: SAX: 1998-04-18 pre-release References: <199804190232.WAA04880@unready.microstar.com> Message-ID: <3539A3E8.538AC81B@jclark.com> Making the read methods on ByteStream and CharacterStream throw SAXException seems wrong to me. In a Java environment I need to be able to throw an IOException. So they should be declared as throwing either Exception or IOException. The approach in InputStreamAdapter of just passing through the message from IOExceptions is not acceptable. An application may need to deal with different classes of IOExceptions differently, so it needs to be possible to propagate the IOException up to Parser.parse. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Apr 19 11:15:46 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: 1998-04-18 pre-release In-Reply-To: <199804190232.WAA04880@unready.microstar.com> Message-ID: <3.0.1.16.19980419090045.0ccf0018@pop3.demon.co.uk> At 22:32 18/04/98 -0400, David Megginson wrote: >One of the last pre-releases of the Java reference version of SAX is >available from the following URLs: David is doing a fantastic job! 
> >Core Distribution (including ROADMAP.txt): > http://www.microstar.com/XML/SAX/New/saxjava-19980418.zip I didn't find ROADMAP.txt either in the *.zip, sax.jar or under the URL. > >Drivers for Lark and MSXML: > http://www.microstar.com/XML/SAX/New/saxdrivers-19980418.zip This may not be high-priority, but is there any check to make sure that the driver is accessing the right version of Lark or MSXML? It's quite easy to get fooled here as one could download a new version of SAX which catered for a new version of FOO using FOODriver that one wasn't aware of. Tedious. If the new version used different classes then one could test with the ClassLoader. > >Simple Demos: > http://www.microstar.com/XML/SAX/New/saxdemos-19980418.zip > Demos are always very gratefully received. For a simple-minded person like me it would be nice to have a two-line demo for CharacterStream input (yes, I haven't got to grips with java.io.Reader yet). Of the form: String input = "Hello world!"; ... P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
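For the kind of short demo requested above (parsing XML held in a String), a minimal Java sketch could look like the following. Note the assumptions: it uses the org.xml.sax and javax.xml.parsers classes as bundled with current JDKs, where character input is a java.io.Reader, and not the CharacterStream interface of the 1998-04-18 pre-release. The class StringInputDemo and the method textOf are hypothetical names, and the plain "Hello world!" from the message is wrapped in a made-up <greeting> element so that it is a well-formed document.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; uses the SAX API shipped with current JDKs.
final class StringInputDemo {
    // Parses XML held in a String and returns its concatenated character data.
    static String textOf(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            // The two lines that matter: wrap the String in a Reader, then parse it.
            InputSource source = new InputSource(new StringReader(xml));
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (Exception e) {
            // Wrapped only to keep the demo's signature free of checked exceptions.
            throw new IllegalStateException(e);
        }
        return text.toString();
    }
}
```

A call such as StringInputDemo.textOf("<greeting>Hello world!</greeting>") hands the parser a character stream directly, so no byte-level encoding detection is involved.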
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Apr 19 11:15:54 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: 1998-04-18 pre-release In-Reply-To: <3539A3E8.538AC81B@jclark.com> References: <199804190232.WAA04880@unready.microstar.com> Message-ID: <3.0.1.16.19980419091407.3707c91c@pop3.demon.co.uk> At 14:12 19/04/98 +0700, James Clark wrote: >Making the read methods on ByteStream and CharacterStream throw >SAXException seems wrong to me. In a Java environment I need to be able >to throw an IOException. So they should be declared as throwing either >Exception or IOException. The approach in InputStreamAdapter of just >passing through the message from IOExceptions is not acceptable. An >application may need to deal with different classes of IOExceptions >differently, so it needs to be possible to propagate the IOException up >to Parser.parse. I agree with James. By passing the message alone you lose the information such as where the exception originally occurred. I have struggled with this a lot - if a program reduces everything to Exception it can be difficult to document it but if all exceptions are passed up then it gets very messy. [In JUMBO I use a JumboException - rather like SAXException, but use it to *contain* other exceptions rather than lose their information. This is probably an interim solution until the hierarchy gets sorted out.] P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
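A minimal sketch of the "contain rather than lose" approach Peter describes, assuming a wrapper class of our own. The names WrappingException, getWrapped and WrappingDemo are hypothetical and are not taken from JUMBO or from SAX. The idea is that the single declared exception type carries the original exception, so a caller can still tell one class of IOException from another.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical wrapper in the spirit of the JumboException mentioned above;
// the class and method names are illustrative only.
class WrappingException extends Exception {
    private final Exception wrapped;

    WrappingException(String message, Exception wrapped) {
        super(message);
        this.wrapped = wrapped;
    }

    // Returns the original exception with its class and stack trace intact.
    Exception getWrapped() {
        return wrapped;
    }
}

class WrappingDemo {
    // Stand-in for a stream read that fails with a specific IOException subclass.
    static void read() throws IOException {
        throw new FileNotFoundException("no such entity");
    }

    // Converts any IOException into the one declared type without discarding it.
    static void parse() throws WrappingException {
        try {
            read();
        } catch (IOException e) {
            throw new WrappingException("I/O failure while parsing", e);
        }
    }

    public static void main(String[] args) {
        try {
            parse();
        } catch (WrappingException e) {
            // The caller can still distinguish different classes of IOException.
            if (e.getWrapped() instanceof FileNotFoundException) {
                System.out.println("missing: " + e.getWrapped().getMessage());
            }
        }
    }
}
```

Passing only the message string, by contrast, would leave the caller with no way to make the instanceof test above.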
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Apr 19 13:29:06 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: XML Declaration In-Reply-To: <35398221.352A5117@jclark.com> References: <199804181936.PAA00323@unready.microstar.com> <35398221.352A5117@jclark.com> Message-ID: <199804191127.HAA00273@unready.microstar.com> James Clark writes: > > I have avoided providing information about the XML declaration, > > considering it strictly a lexical matter; however, since we are > > allowing applications to do their own entity resolution, should they > > have access to the value of the 'standalone' pseudo-attribute (if > > specified)? > > Definitely not. In my view the whole idea of having resolveEntity > return null to prevent inclusion of entities is a bad one. There needs > to be better control over entity inclusion in SAX, but this is not an > effective way to provide it. I agree entirely with your last point -- in fact, in the two most recent pre-releases, a return value of null means 'let the parser perform the default action' rather than 'skip the entity' (it is still possible for an application to skip an entity by returning an empty CharacterStream or ByteStream, but that is no longer an officially documented part of SAX). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Apr 19 13:40:59 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: New Idea for Entity Resolution In-Reply-To: <35398CE4.6183B8A6@jclark.com> References: <199804151621.MAA02638@unready.microstar.com> <3536AFCD.54B7E396@jclark.com> <199804171143.HAA00485@unready.microstar.com> <353811A1.C4169A40@jclark.com> <199804181417.KAA00374@unready.microstar.com> <35398CE4.6183B8A6@jclark.com> Message-ID: <199804191139.HAA00318@unready.microstar.com> James Clark writes: [on org.xml.sax.CharacterStream and org.xml.sax.ByteStream] > Why are the single character read calls there? They unnecessarily > complicate the interface. For others in the discussion, here's what I have right now: public interface CharacterStream { int read () throws SAXException; int read (char ch[], int start, int length) throws SAXException; } public interface ByteStream { int read () throws SAXException; int read (byte b[], int start, int length) throws SAXException; } I included the single character/byte reads because I did not want to assume that all SAX parsers do their own buffering (of course, the buffering could be handled in the SAX driver layer if necessary). It also seems strange to me to have a streaming interface that does not allow single character/byte reads, though I know that these would be horribly inefficient in a distributed environment where the Parser and the CharacterStream or ByteStream are on different systems. What do the other parser writers think? Does anyone want or need single-character or single-byte reads? 
I'm very happy to prune SAX wherever I can, before we get to the final release. [on the new InputSource class] > > Now, we can get away with only one parse() method in > > org.xml.sax.Parser: > > > > public abstract void parse (InputSource source) > > throws Exception; > > I don't think this is a good idea: it makes SAX harder to use in the > simple case of reading from a URL. In that case, then, it would probably be best to have two: public abstract void parse (String systemId) throws Exception; public abstract void parse (InputSource source) throws Exception; The first would be the exact equivalent of public void parse (String systemId) throws Exception { parse(new InputSource(systemId)); } Does this seem reasonable? Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Apr 19 13:55:18 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: 1998-04-18 pre-release In-Reply-To: <3539A3E8.538AC81B@jclark.com> References: <199804190232.WAA04880@unready.microstar.com> <3539A3E8.538AC81B@jclark.com> Message-ID: <199804191153.HAA00645@unready.microstar.com> James Clark writes: > Making the read methods on ByteStream and CharacterStream throw > SAXException seems wrong to me. In a Java environment I need to be able > to throw an IOException. So they should be declared as throwing either > Exception or IOException. 
The approach in InputStreamAdapter of just > passing through the message from IOExceptions is not acceptable. An > application may need to deal with different classes of IOExceptions > differently, so it needs to be possible to propagate the IOException up > to Parser.parse. It would be easiest to have them throw Exception, since IOException describes a constraint that will not make sense for a ByteStream or CharacterStream implemented in another language. I have just made this change now, and the only difficulty comes in the optional CharacterStreamAdapter and ByteStreamAdapter classes, where I have to wrap the message from non-IOExceptions in an IOException. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Apr 19 14:02:51 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: 1998-04-18 pre-release In-Reply-To: <3.0.1.16.19980419090045.0ccf0018@pop3.demon.co.uk> References: <199804190232.WAA04880@unready.microstar.com> <3.0.1.16.19980419090045.0ccf0018@pop3.demon.co.uk> Message-ID: <199804191201.IAA00678@unready.microstar.com> Peter Murray-Rust writes: > I didn't find ROADMAP.txt either in the *.zip, sax.jar or under the URL. Sorry about that -- I'll make certain it ends up in the next one, and can mail it to you privately if you'd like. 
> >Drivers for Lark and MSXML: > > http://www.microstar.com/XML/SAX/New/saxdrivers-19980418.zip > > This may not be high-priority, but is there any check to make sure that the > driver is accessing the right version of Lark or MSXML? It's quite easy to > get fooled here as one could download a new version of SAX which catered > for a new version of FOO using FOODriver that one wasn't aware of. Tedious. > If the new version used different classes then one could test with the > ClassLoader. I'm hoping that the Lark driver, at least, will end up as part of the main distribution. For MSXML, I can take a look at this issue once SAX proper is out. > >Simple Demos: > > http://www.microstar.com/XML/SAX/New/saxdemos-19980418.zip > > > Demos are always very gratefully received. For a simple-minded person like > me it would be nice to have a two-line demo for CharacterStream input (yes, > I haven't got to grips with java.io.Reader yet). Of the form: > > String input = "Hello world!"; I have one of my own, which isn't much more complicated than this. Here's the interesting part (excluding the handlers, etc.): /** * Main entry point for an application. */ public static void main (String args[]) throws Exception { InputSource source; Parser parser; String doc = "\nHello\nHello, world!\n"; StringReader reader = new StringReader(doc); source = new InputSource(new ReaderAdapter(reader)); if (args.length != 0) { System.err.println("Usage: java CharTest"); System.exit(2); } parser = ParserFactory.makeParser(); parser.setDocumentHandler(new CharTest()); parser.parse(source); } All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From David.Brownell at Eng.Sun.COM Sun Apr 19 16:55:03 1998 From: David.Brownell at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: 1998-04-18 pre-release (I/O) Message-ID: <199804191453.HAA03807@argon.eng.sun.com> This makes it an easy call -- "throws IOException". APIs should never declare "throws Exception" except maybe in the earliest stage of coming up with the exception model. And in this case, "IOException" is how any code doing I/O will already throw exceptions; no need for more. By the way, the names "ByteStream" and "CharacterStream" imply they're good for writing too. Far preferable to say "InputStream" and "Reader". I/O in languages other than Java should obey those languages' rules, as (and when) the SAX models are translated to them. - Dave > Making the read methods on ByteStream and CharacterStream throw > SAXException seems wrong to me. In a Java environment I need to be able > to throw an IOException. So they should be declared as throwing either > Exception or IOException. The approach in InputStreamAdapter of just > passing through the message from IOExceptions is not acceptable. An > application may need to deal with different classes of IOExceptions > differently, so it needs to be possible to propagate the IOException up > to Parser.parse. > > James xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Apr 19 18:53:02 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: Implied Attributes Message-ID: <3.0.32.19980419093806.00b1c2b0@pop.intergate.bc.ca> At 09:00 PM 4/17/98 -0400, David Megginson wrote: >Should SAX level 1 allow parsers to report implied attributes by >providing null for AttributeList.getValue() (it does not currently do >so)? The only application I can think of right now is architectural >forms, where an implied attribute can prevent automatic derivation. Nothing in the XML spec suggests that anything should be done about #IMPLIED-but-absent attributes... there was such language but we deliberately took it out. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Apr 19 21:22:51 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: 1998-04-18 pre-release (I/O) In-Reply-To: <199804191453.HAA03807@argon.eng.sun.com> References: <199804191453.HAA03807@argon.eng.sun.com> Message-ID: <199804191921.PAA00277@unready.microstar.com> David Brownell writes: > This makes it an easy call -- "throws IOException". APIs should > never declare "throws Exception" except maybe in the earliest stage > of coming up with the exception model. 
And in this case, > "IOException" is how any code doing I/O will already throw > exceptions; no need for more. If I did so, I would need to define the semantics of an IOException within SAX and then require other languages to implement it exactly the same way as Java (so that, say, a Python or C++ implementation could throw an IOException that a Java implementation could catch). It is not acceptable that a SAX implementation on one platform would have to know the programming language of a SAX implementation on another. > By the way, the names "ByteStream" and "CharacterStream" imply > they're good for writing too. Far preferable to say "InputStream" > and "Reader". I/O in languages other than Java should obey those > languages' rules, as (and when) the SAX models are translated to > them. This is a good point, but I don't like the lack of symmetry and transparency in "InputStream" and "Reader". We could use something like "ByteReader" and "CharacterReader", or "ByteInputStream" and "CharacterInputStream" -- there are still a couple of days for suggestions. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Apr 19 21:26:29 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:34 2004 Subject: SAX: Implied Attributes In-Reply-To: <3.0.32.19980419093806.00b1c2b0@pop.intergate.bc.ca> References: <3.0.32.19980419093806.00b1c2b0@pop.intergate.bc.ca> Message-ID: <199804191924.PAA00281@unready.microstar.com> Tim Bray writes: > At 09:00 PM 4/17/98 -0400, David Megginson wrote: > >Should SAX level 1 allow parsers to report implied attributes by > >providing null for AttributeList.getValue() (it does not currently do > >so)? The only application I can think of right now is architectural > >forms, where an implied attribute can prevent automatic derivation. > > Nothing in the XML spec suggests that anything should be done about > #IMPLIED-but-absent attributes... there was such language but we > deliberately took it out. -Tim The XML 1.0 spec does not define most of the information model provided by a parser, so it is not safe to assume that omission means prohibition; however, I agree with you and with James (from a posting a few months ago) that it makes sense to pass over unspecified #IMPLIED attributes in SAX level 1. Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From stever at orbital.co.uk Sun Apr 19 21:41:17 1998 From: stever at orbital.co.uk (Steve Robertson) Date: Mon Jun 7 17:00:35 2004 Subject: name case-folding Message-ID: <01bd6bcb$578fb1d0$379559c3@platypus.orbital.co.uk> If I remember correctly, the XML specification states that the processor should fold names to uppercase characters for the purpose of attribute and element name comparisons. Do I have this right, or is the case of name characters significant? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Apr 19 23:13:22 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:35 2004 Subject: SAX: Byte Streams and Character Streams Message-ID: <3.0.32.19980419140626.00b07820@pop.intergate.bc.ca> At 08:58 PM 4/17/98 -0400, David Megginson wrote: >James Clark has recently raised the issue of byte streams and >character streams, and I think that we need to give this a thorough >discussion before my self-imposed deadline of next Tuesday. Hmmm, for what it's worth, Lark, both in its current form and after the big performance update coming Real Soon Now, works at approximately equal speed off byte and character streams... the overhead of pouring a buffer's worth of bytes into the internal character buffer that Lark will be reading from is hardly detectable. 
The next Lark will not be as fast as XP but it won't be that much slower. I'd be interested in what the other parser builders have done in this area. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Apr 19 23:13:27 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:35 2004 Subject: SAX: XML Declaration Message-ID: <3.0.32.19980419141041.00b05440@pop.intergate.bc.ca> At 03:36 PM 4/18/98 -0400, David Megginson wrote: >I have avoided providing information about the XML declaration, >considering it strictly a lexical matter; however, since we are >allowing applications to do their own entity resolution, should they >have access to the value of the 'standalone' pseudo-attribute (if >specified)? I think not, since standalone= is only meaningful in the context of validation. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Apr 19 23:16:48 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:35 2004 Subject: Attribute question Message-ID: <3.0.32.19980419140911.00aff8c0@pop.intergate.bc.ca> At 09:15 AM 4/18/98 -0400, Samuel R. Blackburn wrote: >I'm getting ready to release the next version of >my freeware class library (which includes some >XML capability) and have reached a stumbling >block. 
> >What does ^ mean (in rule [10])? RTFS. In this case, the notation section. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Apr 19 23:23:54 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:35 2004 Subject: name case-folding Message-ID: <3.0.32.19980419142228.00b10590@pop.intergate.bc.ca> At 08:42 PM 4/19/98 +0100, Steve Robertson wrote: >If I remember correctly, the XML specification states that the processor >should fold names to uppercase characters for the purpose of attribute and >element name comparisons. > >Do I have this right, or is the case of name characters significant? You do not. With a few rare exceptions (e.g. language tags) all names in XML are case-sensitive. My annotated spec (http://www.xml.com/axml/axml.html) has some verbiage as to why this is so, for those around here fortunate enough not to have lived through that debate. -T. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Apr 19 23:23:56 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:35 2004 Subject: SAX: String Internalisation and a CORBA/DCOM Question Message-ID: <3.0.32.19980419141818.00b10aa0@pop.intergate.bc.ca> At 12:28 PM 4/19/98 +0700, James Clark wrote: >> Here's another last-minute SAX question: should org.xml.sax.Parser >> expose a method for internalising strings? >> >> public abstract String intern (String s); > >Absolutely not. I'm with James that the complexity is past the SAX 80/20 point for now. >Doing lots of comparisions on the type of each element whether using >equals or == is not a good way to write an efficient application. For what it's worth, while I agree that this shouldn't go in, I am not convinced by James' argument here; I think the technique of having effectively interned all the element/attribute names allows for an elegant and minimal design in all sorts of applications, particularly the lightweight ones that would want to use a stream interface. >It's >typically better to have a hash-table that associates each element type >with either an integer (which you can then use in a switch statement) or >an object (which you then make a method call on). I think that if you're getting into an application of the class where this type of machinery starts to pay for itself, there's a good chance that you're going to be happier with a DOM/Tree API rather than SAX anyhow. -Tim xml-dev: A list for W3C XML Developers. 
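A minimal sketch of the hash-table technique James suggests in the quoted text: each element type name is looked up once and mapped to an integer, which then drives a switch statement. The element names, the constants, and the class ElementDispatch are hypothetical and chosen only for illustration; nothing here is part of SAX. Because XML names are case-sensitive, the lookup deliberately does no case folding.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical dispatcher; the element names and codes are made up for the example.
class ElementDispatch {
    static final int UNKNOWN = 0;
    static final int TITLE = 1;
    static final int PARA = 2;

    private static final Map<String, Integer> CODES = new HashMap<>();
    static {
        CODES.put("title", TITLE);
        CODES.put("para", PARA);
    }

    // One hash lookup per element instead of a chain of string comparisons.
    static int codeFor(String name) {
        Integer code = CODES.get(name);
        return code == null ? UNKNOWN : code;
    }

    // Stand-in for what a startElement callback might do with the name.
    static String handle(String name) {
        switch (codeFor(name)) {
            case TITLE:
                return "start of title";
            case PARA:
                return "start of paragraph";
            default:
                return "ignored";
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("title"));
        System.out.println(handle("TITLE")); // different name: no case folding
    }
}
```

Whether this machinery pays for itself in a small stream-based application is exactly the point on which Tim and James differ above.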
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Sun Apr 19 23:47:16 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:35 2004 Subject: SAX: Byte Streams and Character Streams In-Reply-To: <3.0.32.19980419140626.00b07820@pop.intergate.bc.ca> References: <3.0.32.19980419140626.00b07820@pop.intergate.bc.ca> Message-ID: <199804192145.RAA02679@unready.microstar.com> Tim Bray writes: > Hmmm, for what it's worth, Lark, both in its current form and > after the big performance update coming Real Soon Now, works > at approximately equal speed off byte and character streams... > the overhead of pouring a buffer's worth of bytes into the > internal character buffer that Lark will be reading from > is hardly detectable. The next Lark will not be as fast as XP but > it won't be that much slower. I'd be interested in what the other > parser builders have done in this area. -T. AElfred reads a big buffer (up to 32K) of bytes, then translates it into a big buffer of characters using whatever the current encoding scheme is. Profiling shows that the overhead of doing this is surprisingly low, since it all happens in a tight loop. AElfred can now also read directly from a Reader, bypassing the conversion altogether. The Lark driver in the current pre-release of SAX feeds a character stream to Lark as an InputStream of UTF-8 bytes, using a surprisingly inefficient algorithm that I can fix when I have time. Will the next version of Lark support character streams? All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. 
dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Apr 20 00:16:32 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:35 2004 Subject: SAX: Byte Streams and Character Streams Message-ID: <3.0.32.19980419151515.00b11760@pop.intergate.bc.ca> At 05:45 PM 4/19/98 -0400, David Megginson wrote: >The Lark driver in the current pre-release of SAX feeds a character >stream to Lark as an InputStream of UTF-8 bytes, using a surprisingly >inefficient algorithm that I can fix when I have time. Will the next >version of Lark support character streams? Well, the current version of Lark doesn't really support anything *but* character streams... that and synchronization, if my measurements are correct, amount to >50% of the difference between XP and Lark. It is clear and (sigh) not surprising that method-dispatch-per-char is, well, less than optimal. Thus my plan had been to move to a three-arg read() call. As a result of this, I'm a bit conflicted about James' suggestion that we lose the int read() methods. While they are a surefire way to run slow, I spent enough years in Unix that doing things via getc() feels natural and I appreciate its advantages; assuming of course that getc() is a macro with buffering, which of course a Java method dispatch, uh, isn't. Nice thing about stdio is it made it easy for the programmer to pretend to do character streams without having to really do serious per-char work. -T. xml-dev: A list for W3C XML Developers.
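A minimal sketch of the getc()-style trade-off Tim describes: a small buffering layer turns a three-argument block read into cheap single-character reads, so the underlying source is called once per buffer rather than once per character. The interface BlockSource and the classes BufferedChars and StringSource are hypothetical stand-ins, not the SAX CharacterStream interface itself, and the sketch assumes that the block read returns -1 at end of input, as java.io.Reader does.

```java
// Hypothetical block-read interface, modelled loosely on the three-argument
// read discussed in this thread. Assumption: -1 signals end of input.
interface BlockSource {
    int read(char[] buf, int start, int length) throws Exception;
}

// Supplies single-character reads from an internal buffer.
class BufferedChars {
    private final BlockSource source;
    private final char[] buffer = new char[8192];
    private int pos = 0;
    private int limit = 0;

    BufferedChars(BlockSource source) {
        this.source = source;
    }

    // Returns the next character, or -1 once the source is exhausted.
    int read() throws Exception {
        if (pos >= limit) {
            // Refill: the only place the underlying source is touched.
            limit = source.read(buffer, 0, buffer.length);
            pos = 0;
            if (limit <= 0) {
                limit = 0;
                return -1;
            }
        }
        return buffer[pos++];
    }
}

// In-memory source that counts how often the block read is invoked.
class StringSource implements BlockSource {
    private final String text;
    private int offset = 0;
    int calls = 0;

    StringSource(String text) {
        this.text = text;
    }

    public int read(char[] buf, int start, int length) {
        calls++;
        if (offset >= text.length()) {
            return -1;
        }
        int n = Math.min(length, text.length() - offset);
        text.getChars(offset, offset + n, buf, start);
        offset += n;
        return n;
    }
}
```

The per-character call is still a method dispatch in Java, which is Tim's caveat, but the expensive call into the source happens only on a refill.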
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Mon Apr 20 03:26:04 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:00:35 2004 Subject: SAX: New Idea for Entity Resolution References: <199804151621.MAA02638@unready.microstar.com> <3536AFCD.54B7E396@jclark.com> <199804171143.HAA00485@unready.microstar.com> <353811A1.C4169A40@jclark.com> <199804181417.KAA00374@unready.microstar.com> Message-ID: <353AA53F.87A55A43@infinet.com> David Megginson wrote: > James Clark writes: > > > You could just have a class that encapsulates a structure with three > > members: > > > > - a CharacterStream > > - a ByteStream > > - a String > > > > At least one of the CharacterStream and ByteStream must be non-null. If > > the ByteStream is non-null the String can specify the encoding. > > [Read on to the bottom for a large-ish design change.] > > This implies, then, the following three interfaces: > > public interface ByteStream { > public abstract int read () > throws SAXException; > public abstract int read (byte b[], int start, int count) > throws SAXException; > } > > public interface CharacterStream { > public abstract int read () > throws SAXException; > public abstract int read (char ch[], int start, int count) > throws SAXException; > } > > public class InputSource { > // For each variable, imagine a get/set pair instead... 
> public ByteStream byteStream; > public CharacterStream characterStream; > public String encoding; > } > > The nice thing here is that all of these can live on separate systems > in a distributed environment: the InputSource can be a C-program on a > VAX, the CharacterStream can come a Python program running under alpha > Linux, and the parser can be running in Java on a Windows box. There > is no dependency on language- or system-specific features (except for > java.lang.String, which should be able to map predictably to other > languages). > > Now, why not take this a step further? > > public class InputSource { > // For each variable, imagine a get/set pair instead... > public String publicId; > public String systemId; > public ByteStream byteStream; > public CharacterStream characterStream; > public String encoding; > } > > We'd have to define rules of precedence: > > 1) if there is a character stream, use it; > > 2) if there is no character stream but there is a byte stream, use the > byte stream; > > 3) if there is neither a character stream nor a byte stream but there > is a system identifier, open a connection to the system identifier; > > 4) if there is no character stream, byte stream, or system identifier, > throw an exception (or invoke the ErrorHandler). > > Now, we can get away with only one parse() method in > org.xml.sax.Parser: > > public abstract void parse (InputSource source) > throws Exception; > > It might still be useful to keep two separate methods in > EntityResolver, though: > > public interface EntityResolver > { > public String resolveSystemId (String publicId, String systemId) > throws SAXException; > public InputSource openEntity (String systemId) > throws Exception; > } > > Comments? > > All the best, > > David This sounds like a great idea, however I think that InputSource should be immutable in general. Instead of : public class InputSource { // For each variable, imagine a get/set pair instead... 
    public String publicId;
    public String systemId;
    public ByteStream byteStream;
    public CharacterStream characterStream;
    public String encoding;
}

use:

public interface InputSource {
    String getPublicId();
    String getSystemId();
    ByteStream getByteStream();
    CharacterStream getCharacterStream();
    String getEncoding();
}

In general, an input source should probably be immutable, since it is the application that actually fills in the blanks as to how the input source should be retrieved. In this sense, the system ID may not help the parser at all if the URL points to a location that the parser cannot read on its own (some sort of encryption of the underlying stream may be present). In that case, your rules of precedence:

1) if there is a character stream, use it;

2) if there is no character stream but there is a byte stream, use the byte stream;

3) if there is neither a character stream nor a byte stream but there is a system identifier, open a connection to the system identifier;

4) if there is no character stream, byte stream, or system identifier, throw an exception (or invoke the ErrorHandler).

should be changed to something like:

1) if there is no character stream but there is a byte stream, use the byte stream;

2) if there is no byte stream but there is a character stream, use the character stream;

3) if both a character stream and a byte stream are available, the parser may use the byte stream or the character stream, but not both at the same time (whichever suits the parser best);

4) if there is neither a character stream nor a byte stream, throw an exception.

I don't believe the parser should attempt to open a connection using the system identifier, as the parser has no idea what steps to take in order to retrieve the data as a stream, let alone secure authorization to it in the first place.
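The revised rules above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: ReadOnlyInputSource and StreamChooser are hypothetical names, java.io.InputStream and java.io.Reader stand in for the proposed ByteStream and CharacterStream interfaces, and the "parserPrefersBytes" flag is just one way a parser might express which stream suits it best.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical read-only input source: getters only, no setters.
// InputStream and Reader are stand-ins for ByteStream and CharacterStream.
final class ReadOnlyInputSource {
    private final String systemId;
    private final InputStream byteStream;
    private final Reader characterStream;
    private final String encoding;

    ReadOnlyInputSource(String systemId, InputStream byteStream,
                        Reader characterStream, String encoding) {
        this.systemId = systemId;
        this.byteStream = byteStream;
        this.characterStream = characterStream;
        this.encoding = encoding;
    }

    String getSystemId() { return systemId; }
    InputStream getByteStream() { return byteStream; }
    Reader getCharacterStream() { return characterStream; }
    String getEncoding() { return encoding; }
}

final class StreamChooser {
    enum Choice { BYTE_STREAM, CHARACTER_STREAM }

    // Applies the four revised rules; the system identifier is never opened.
    static Choice choose(ReadOnlyInputSource source, boolean parserPrefersBytes) {
        boolean hasBytes = source.getByteStream() != null;
        boolean hasChars = source.getCharacterStream() != null;
        if (hasBytes && !hasChars) {
            return Choice.BYTE_STREAM;                 // rule 1
        }
        if (hasChars && !hasBytes) {
            return Choice.CHARACTER_STREAM;            // rule 2
        }
        if (hasBytes) {                                // rule 3: both present
            return parserPrefersBytes ? Choice.BYTE_STREAM : Choice.CHARACTER_STREAM;
        }
        // rule 4: the application supplied no stream at all
        throw new IllegalStateException("no stream supplied for " + source.getSystemId());
    }

    public static void main(String[] args) {
        ReadOnlyInputSource both = new ReadOnlyInputSource(
            "urn:example:doc", new ByteArrayInputStream(new byte[0]),
            new StringReader(""), "UTF-8");
        System.out.println(choose(both, true));
    }
}
```

The system identifier is kept only as a label for error messages here, which matches the point that the application, not the parser, is responsible for producing the stream.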
In Java you have URLs and URL stream handlers, where the URL's protocol prefix is used to look up the corresponding handler. Although it is programmatically convenient to just call URL.openStream(), there is no good way to control how a specific URL's content is actually retrieved, other than by setting the system properties that the standard URL handlers use for things like proxies, or by creating your own URLStreamHandlerFactory. That content may need to be piped through a variety of filters before it is in its raw form again. I think it would be a mistake for SAX to inherit this flaw, which assumes the parser has access to the specified system identifier in any environment. Force the application to provide a suitable ByteStream and/or CharacterStream for each InputSource provided.

Tyler

From matthew at praxis.cz Mon Apr 20 11:19:13 1998
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:00:35 2004
Subject: Inheritance in XML
Message-ID: <01bd6c3a$c50f2910$020b0ac0@xerius>

Robin,

You really hit the nail on the head with this post! These are exactly the kinds of issues that I was having some trouble expressing in my previous mail. I have read this thread with great interest, and it seems to me that if we synthesize the discussion we are getting close to the heart of the matter. Here is my attempt:

* Terminology *

I personally don't agree that there are carved-in-stone, well-understood definitions for terms like "inheritance" and "subtyping" in XML. While there surely are in certain, specific contexts, we are talking about something new, i.e.
inheritance in XML, and what we really need to do is choose a term and define it precisely. Does HyTime model inheritance? It does if my definition of inheritance in XML corresponds to what HyTime does (it doesn't: see below). Is "subtyping" a better term? No, because it doesn't have the same resonance as the word "inheritance" among non-programmer types.

I'll make a first attempt:

"Inheritance in XML refers to the process of creating new element types that duplicate the content model and attribute list of existing element types (in the same or a separate "base" DTD), while extending these to include additional attributes and/or content. As such, instances of the new element types can be used wherever the base element type can be used, and can be processed polymorphically by any external processor which knows about the base element type."

* HyTime *

I read through Eliot's post and understood some of it. :-) I never meant to question any design decisions made in the specification of HyTime. They are all well-justified in the context which prevailed at the time. Despite the fact that HyTime models derivation (I'll stay away from the i-word in light of the definition given above) of instances and not of schemata, it remains one of the few attempts that have been made at deriving document types and as such is an extremely valuable basis for thinking about a true inheritance mechanism for XML. To meet the definition I proposed above, this mechanism would have to extend the DTD syntax or create a new one (see below). The goals and uses of HyTime derivation are and will continue to be somewhat different from this; I was only trying to point out that we can benefit greatly from the experience gained from HyTime in thinking about XML inheritance.

* Semantics and XML *

In last month's Wired, XML made it into the "hype list" with the comment that we crazy XML types are kidding ourselves because XML will never fly without well-defined semantics.
These sentiments were echoed by several posts on this list. I agree 100%, but as several people pointed out, there are already a lot of semantics associated with XML, to the extent that there are semantics associated with the idea of a hierarchy and with the HAS-A relationship. XML-Link and XSL introduce a very valuable additional set of semantic relationships.

We are all so excited about XML, as opposed to Excel files, Postscript or what have you, because there are tools like XML parsers, editors and browsers which have value across the whole range of XML applications. I can write an XML file, and to the extent that existing semantics are sufficient, I can do useful work with this file. I can, for example, display it as a hierarchy. I can't do anything at all with an Excel file unless I have Excel.

This doesn't eliminate the need to define the specific semantics of a given schema. This can only be done with clear documentation, as Paul pointed out. What we can do is capture the semantics expressed in this documentation and use them as the basis for new schemata. Sure, a lot of this can be done using "parameter-entity hacks", or by writing content models out by hand, but this isn't going to be an effective way to bring XML to the masses.

The whole discussion about XML semantics is very apt in this context precisely because inheritance is so important for making XML really useful. Let me give an example implied by Peter (in reference to the agglutination of DTDs for nuclear power plant software). Let's say that I am developing an advanced medical diagnosis system based on chemical analysis of blood samples. Part of the application is a hardware device which looks for specific molecules in the sample and displays them on a monitor in 3D. I decide to use CML to model these molecules, but I need to add additional attributes and content to the molecule description which are specific to my application.
With the kind of inheritance mechanism I am talking about, I could download a CML viewer and use it "out of the box" to display the molecules, while still passing the entire XML structure (with my additional information) to the application which attempts to create a diagnosis. Without XML inheritance, I will probably "break" the viewer, so I find myself wading through and adapting a lot of Java code. At this point I start wondering why I decided to use XML in the first place...

* DTDs and schemata *

Francois Chahuneau's article makes a very effective argument for why we need to extend or replace DTD syntax (thanks Robin). XML-Data is a reasonable attempt to do so, but it is understandably controversial because it is such a radical departure from the existing syntax. I quite like the idea of an alternate, XML-based schema syntax, but the real lesson of XML-Data is that creating an effective inheritance mechanism isn't rocket science. All that is really needed is a keyword that says "this element type is derived from that element type". Something like:

This would simply mean that the breed element precedes the content of the base element type, which is then followed optionally by some flea elements. This approach is probably sufficient, since other modifications to the base content model could be taken into account in the design phase of the base schema (i.e. by breaking up monolithic elements, if necessary).

* What now? *

More tricky than any of these technical issues is the question of what, if anything, could be done to promote a mechanism of this sort. Obviously this would require a change to the XML spec as well as modification to all existing tools which process DTDs, so it's a pretty big deal. I wonder if anyone besides me thinks that a simple mechanism like this would make sense. If so, is there any room in the XML standards process to discuss a change of this type at some point in the future (certainly not for XML 1.0)?
Cheers, Matthew -----Original Message----- From: Robin Cover To: xml-dev@ic.ac.uk Date: Saturday, April 18, 1998 7:37 PM Subject: Re: Inheritance in XML >> Re: Subject: Re: Inheritance in XML (was Re: Problems parsing XML) >> Date: Sat, 18 Apr 1998 08:49:07 +0100 >> Reply-To: "Martin Bryan" > >>>What is missing is that we can't define one class (element type) as a >>>subtype of another. > >> In SGML you can use exclusions to make an element a true subclass of >> another: >> >> >> >> providing a, b and c are optional components within the model for Y. >> Unfortunately XML dropped this useful option from the set of SGML facilities >> it in inherited >> >> Martin Bryan > >Martin, I wish I could believe this were true and useful. It seems >that we confront here one of the several troublesome mismatches >between OO database modeling and SGML/XML markup, with respect to >the simple analogy: > >OODB SGML/XML Markup > >class defn element declaration >class name element type >object element >attribute attribute > >If we accept this crude analogy, and accept SGML's notion of an >"attribute" as a name-value pair, then the hope of creating subclasses >through SGML/XML element declarations appears slim. Appears "to me" I >should say: I would welcome comments from the experts. > >For starters, subclassing normally would mean further specialization >by the addition (possibly 'plus subtraction') of properties, viz., of >attributes. Formally, then, an SGML element declaration can't do the >work: it would need to be an ATTLIST declaration. But then we face >the problem that you can't model a complex attribute with the SGML >'attribute' anyway (if you want any validation): the "value" in >'(name-)value' is a flat/string in SGML, at least in the literal sense. 
> >Of course, one can (and we all do) model "real" attributes using SGML >elements -- since we have no realistic alternative -- but that creates >other problems for the notion of using SGML element decls as a >subclassing mechanism. One such problem is that (real) attributes are >unordered. The straightforward way to model an object/element with >(some optional, some required) attributes a, b, c, d, e, and f would >seem to be: (a* & b? & c? & d & e?), but SGML/XML notions of >prescribing order in the serialization are fairly strong, and XML >won't even allow the use of the AND connector to indicate what I >plainly mean in this sample assertion. (Perhaps Steph Tryphonas has >written a program by now to convert all content models using AND to >use only OR, without sacrificing any integrity constraints on >occurrence and sequence). In any case, the impulse toward >serialization in SGML -- at least in practice, given tools that force >end users to reckon with (arbitrary non-intuitive) "order" based upon >sequence rules in content models -- tends to work against the easy use >of SGML elements to model attributes. > >Even apart from these mismatches between "object" modelling >and SGML/XML encoding, I question whether > > " " > >creates a useful "true subclass." Why would one want to create a >subclass based upon the subtraction of optional "attributes" >(subelements)? I think that would make it a superclass in many OO >systems. In this connection, one might be inclined to argue that the >treatment of "content" as a special attribute is unfortunate, at least >from the perspective of data modelling, where "part-whole" has no >quintessential role vis-a-vis "is-a" or "has-a" or "kind-of" or >"points-to"... 
At which point, others would quickly point out that >they think it's specious to be talking about object modeling in terms >of SGML-based markup languages anyway, since "these languages can >neither formally express nor enforce semantic integrity constraints >which are so critical to good object modelling..." > >I think this all leads me in the direction of favoring the efforts >at defining other schema languages (beyond SGML/XML DTD syntax), >granting that the validation of instances against their schemas, >if/when critical, will need to be done outside the framework of >the SGML/XML "parser/processor" as defined. I have little doubt >that someone as brilliant as Eliot can show how the desired >objectives might be met through architecture processing by an >appropriate architecture engine; I don't know whether this is the >"best" path in all cases, or whether SGML/XML users will want to >deal with all the layers of indirection that architectures seem to >want. > >I hope that experts with some years of experience in OO systems >will contribute their insights to the new "schema" projects. > >-rcc > >------------------------------------------------------------------------- >Robin Cover Email: robin@acadcomp.sil.org >6634 Sarah Drive >Dallas, TX 75236 USA >>> The SGML/XML Web Page <<< >Tel: +1 (972) 296-1783 (h) http://www.sil.org/sgml/sgml.html >Tel: +1 (972) 708-7346 (w) >FAX: +1 (972) 708-7380 >=========================================================================
From h.rzepa at ic.ac.uk Mon Apr 20 12:06:51 1998
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun 7 17:00:35 2004
Subject: LISTADMIN: List stats, and archival
Message-ID:

Dear all,

In 15 months there have been 3188 postings from 899 subscribers (with 309 subscribers to the digest list).

Around October this year, we will be publishing the proceedings of an electronic conference, and there will be an opportunity to include the archive of xml-dev on the CD ROM. Alternatively, Peter Murray-Rust is intending to publish his JUMBO2 tutorials etc. on CD.

It does seem worthwhile to try to preserve in some form the history of a list such as XML-DEV. Too often, such lists seem to evaporate forever. I propose therefore to transfer the archive of this list to a published CD ROM. Lack of resources will prevent anything other than the lightest of editing (removing duplicate footers, etc.), unless someone offers to do so with any intelligent parsing tools they may have.

Help/suggestions most welcome.

Dr Henry Rzepa, Dept. Chemistry, Imperial College, LONDON SW7 2AY; mailto:rzepa@ic.ac.uk; Tel (44) 171 594 5774; Fax: (44) 171 594 5804. URL: http://www.ch.ic.ac.uk/rzepa/
From tug at wilson.co.uk Mon Apr 20 12:21:20 1998
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:00:36 2004
Subject: SAX: Parser Factory class
Message-ID: <00aa01bd6c45$e3462450$0a01d30a@bach.wilson.co.uk>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: John Wilson.vcf
Type: text/x-vcard
Size: 498 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980420/601368f0/JohnWilson.vcf

From ak117 at freenet.carleton.ca Mon Apr 20 15:04:21 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:36 2004
Subject: LISTADMIN: List stats, and archival
In-Reply-To:
References:
Message-ID: <199804201301.JAA00555@unready.microstar.com>

Rzepa, Henry writes:

> It does seem worthwhile to try to preserve in some form the history
> of a list such as XML-DEV.Too often, such lists seem to evaporate
> forever. I propose therefore to transfer the archive of this list
> to a published CD ROM. Lack of resources will prevent anything
> other than the lightest of editing(removing duplicate footers,
> etc), unless someone offers to do so with any intelligent parsing
> tools they may have.

How about using a trivial Perl script to convert all of the messages to a simple XML document type (assuming nothing about the semantics of the message body itself)? You could try something like the following:
David Megginson ak117@freenet.carleton.ca February 18, 1998 re: Some Subject
This is whatever appeared in the body of the message, only with XML characters like <, >, and & escaped, form feeds stripped out, and everything above 0x8f converted to a character reference. David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/
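The body escaping described in the sample above could be sketched as follows. The suggestion in this message is a Perl script; this is a Java illustration of the same three steps only, with a hypothetical class name (MessageBodyEscaper), and it takes the 0x8f threshold literally from the sample text rather than assuming any other cut-off.

```java
// Hypothetical helper: escapes a message body for inclusion in an XML archive.
final class MessageBodyEscaper {
    // Threshold copied literally from the sample text ("everything above 0x8f").
    static final int MAX_LITERAL = 0x8f;

    static String escape(String body) {
        StringBuilder out = new StringBuilder(body.length() + 16);
        body.codePoints().forEach(cp -> {
            if (cp == '\f') {
                return;                           // strip form feeds
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (cp > MAX_LITERAL) {
                // decimal character reference for anything above the threshold
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("if (a < b && c > d) \u00e9"));
    }
}
```

Working on code points rather than chars means a character outside the Basic Multilingual Plane becomes a single character reference instead of two surrogate halves.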
This would make it simpler to use the archive with an XML search engine later on, and would provide a nice (and very large) base of sample XML documents.

Here's the external DTD subset for the example (message.dtd):

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From papresco at technologist.com Mon Apr 20 15:05:44 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:36 2004
Subject: Inheritance in XML
References: <01bd6c3a$c50f2910$020b0ac0@xerius>
Message-ID: <353B47F9.54285490@technologist.com>

Matthew Gertner wrote:
>
> * Terminology *
>
> I personally don't agree that there are carved-in-stone, well-understood
> definitions for terms like "inheritance" and "subtyping" in XML.

I don't think that anyone claimed that there is a well-understood definition for "inheritance" in any context -- even OO. But to be consistent with English, it must have something to do with "getting something for free." In the XML context the most obvious thing would be declarations.

Subtyping is different. Subtyping comes straight from mathematics and is as old as logic (at least). A type defines a set of objects. A subtype describes a subset of those objects. Simple and precise.

> Is
> "subtyping" a better term. No, because it doesn't have the same resonance as
> the word "inheritance" among non-programmer types.

I don't know why you think that. Non-programmer types are likely to balk at either word, but at least subtyping is shorter, and can be precisely defined.
Anyhow, it is not as if the words are interchangeable. You can't pick and choose from words that already have meanings.

> I'll make a first attempt:
> "Inheritance in XML refers to the process of creating new element types that
> duplicate the content model and attribute list of existing element types (in
> the same or a seperate "base" DTD), while extending these to include
> additional attributes and/or content. As such, instances of the new element
> types can be used wherever the base element type can be used, and can be
> processed polymorphically by any external processor which knows about the
> base element type."

ACK! This definition was proven inadequate in the OO software world around a decade ago. Both C++ and Java allow subtyping without inheritance, and C++, Sather and Eiffel allow inheritance without subtyping (I suppose to get that in Java, you would have to use delegation). If we are going to borrow ideas from OO, then we should at least use the updated, modern ideas, not those that were accidentally confused in Simula 67 (and have been confused in programmers' minds ever since).

The first major problem with your definition actually has nothing to do with the inheritance/subtyping conundrum. The biggest problem is that if you "extend" a content model, you are making a more flexible language, which *cannot* be processed polymorphically by an external processor which knows nothing about the base element type:

Now imagine software that generates a TOC from titles, presuming them to be strictly textual. What does it do with images in titles?

Now let's talk about inheritance and subtyping. This is not a merely theoretical issue. It has important practical implications. The most interesting, important application of subtyping is allowing divergent evolution of compatible schemas. This is why architectural forms were invented. But for this to work, subtyping *must* be unhitched from inheritance.
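To make the distinction concrete in Java terms, here is a small sketch. All of the names (Titled, Chapter, Figure, TitleFormatter, Report) are hypothetical and chosen only for illustration; the first half shows subtyping without inheritance via an interface, and the second half shows code reuse without subtyping via delegation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Subtyping without inheritance: Chapter and Figure share no code and no
// superclass (other than Object), yet both are subtypes of Titled.
interface Titled {
    String title();
}

final class Chapter implements Titled {
    private final String heading;
    Chapter(String heading) { this.heading = heading; }
    public String title() { return heading; }
}

final class Figure implements Titled {
    private final String caption;
    Figure(String caption) { this.caption = caption; }
    public String title() { return caption; }
}

// Reuse without subtyping: Report borrows TitleFormatter's behaviour by
// delegation, but a Report cannot be used where a TitleFormatter is expected.
final class TitleFormatter {
    String format(String raw) { return raw.trim().toUpperCase(Locale.ROOT); }
}

final class Report {
    private final TitleFormatter formatter = new TitleFormatter();
    String heading(String raw) { return formatter.format(raw); }
}

final class SubtypingDemo {
    // Works for any subtype of Titled, whatever its implementation.
    static String toc(List<? extends Titled> items) {
        return items.stream().map(Titled::title).collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(toc(Arrays.asList(new Chapter("Intro"), new Figure("Wing"))));
        System.out.println(new Report().heading("  summary "));
    }
}
```

The toc method depends only on the type relationship, and Report depends only on the reuse relationship, so neither needs the other.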
Suppose that Boeing has a content model:

Bombardier has a similar model (after all, they are modelling the same thing):

How does inheritance help me to unify these models and validate that they are actually isomorphic? It doesn't. This is a job for subtyping.

I can also come up with examples where inheritance is more useful without subtyping, but you can always achieve this through other means (which is why Java does not support it). Inheritance is a code reuse mechanism, so you can always emulate it with cut and paste (or parameter entities, or, in a programming language, with delegation). Subtyping is a type system extension. It is completely different. I can inherit stuff from my dad without becoming a dad. I can choose to be a dad without inheriting anything either from my dad, or the "class dad". They are different things.

> * DTDs and schemata *
>
> Francois Chahuneau's article makes a very effective argument for why we need
> to extend or replace DTD syntax (thanks Robin). XML-Data is a reasonable
> attempt to do so, but it is understandly controversial because it is a such
> a radical departure from the existing syntax.

I think that XML-Data should be controversial because from my reading it is just a mix and match combination of interesting features that people want in schemas without a coherent theory of how they should fit together. You can't just put 10 smart people into a working group and have them throw in their good ideas and expect a coherent result. XML-Data's inheritance mechanism does not take advantage of XML's nature as a sequence-oriented language for encoding documents. In other words, it doesn't solve the fundamental problem.

> I quite like the idea of an
> alternate, XML-based schema syntax, but the real lesson of XML-Data is that
> creating an effective inheritance mechanism isn't rocket science. All that
> is really needed is a keyword that says "this element type is derived from
> that element type".
Something like:
>
> More tricky than any of these technical issues is the question of what, if
> anything, could be done to promote a mechanism of this sort. Obviously this
> would require a change to the XML spec as well as modification to all
> existing tools which process DTDs, so it's a pretty big deal. I wonder if
> anyone besides me thinks that a simple mechanism like this would make sense.
> If so, is there any room in the XML standards process to discuss a change of
> this type at some point in the future (certainly not for XML 1.0)?

Personally, I have yet to see a decent proposal for inheritance and subtyping in SGML. Coming up with one is difficult, which is why I've spent the last year thinking about it. Dan Connolly has also spent several years thinking about it. I know that there are many others in the same boat. I think that we agree that it doesn't make sense to adopt a solution that solves only 5% of the problem, which is why you will see resistance to anything like that.

We will know that we have a complete solution to the problem when HTML 6.0 can be described as a subtype of HTML 5.0, and its behaviour in a "subtype aware" HTML 5.0 browser is predictable and well-defined. Further, HTML 6.0 must not just extend HTML 5.0 in trivial ways such as new tags. It must actually have new elements, with new content models mixed in at all levels. As I said, inheritance-at-the-end solves about 5% of this problem.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Journalism is good if you follow the rules. Don't allow the human rights groups to spoil your profession" - Col. Godwin Ugbo of the Nigerian military dictatorship
From ak117 at freenet.carleton.ca Mon Apr 20 15:15:22 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:36 2004
Subject: SAX: single-character read()
Message-ID: <199804201312.JAA00642@unready.microstar.com>

Upon reflection, I am quite convinced that we do not need a single-character/byte read() method in the SAX CharacterStream and ByteStream interfaces. If any individual parser needs a single-character read, it can implement a buffering layer in its SAX driver; it does not make sense to require the application writers to implement that buffer scheme themselves every time they create a character or byte stream.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From papresco at technologist.com Mon Apr 20 15:20:43 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:36 2004
Subject: Inheritance and subtyping in OO languages
Message-ID: <353B4B8C.EDBFA0F7@technologist.com>

I've found a good reference to the 8-year-old paper that made the distinction between inheritance and subtyping most explicit.
The paper itself is not online, but this summary is quite good:

"[CCHO89] and [CoHC90] propose an approach based on explicit interfaces and interface containment. In this system of object interfaces, one type is considered a subtype of another if some subset of its interface is identical to that of the second. [...] Hence in this system class-based inheritance is strictly a reusability mechanism for sharing behaviour between objects, not to be confused with subtyping. For example two classes may be equivalent as types, though neither inherits anything from the other. So class hierarchies are not the same as type hierarchies, although they may overlap. Object interfaces [as in Java, C++, etc. - Paul] clarify this distinction between interface containment (subtyping) and class-based inheritance and give insight into limitations caused by equating the notions of type and class in many typed object-oriented programming languages [such as Simula 67 - Paul]."

http://progwww.vub.ac.be/prog/persons/kimmens/research/Introduction-to-OO.html

The paper itself is called "Inheritance is not subtyping" and is quite famous, but unfortunately predates the Web.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Journalism is good if you follow the rules. Don't allow the human rights groups to spoil your profession" - Col. Godwin Ugbo of the Nigerian military dictatorship
From ak117 at freenet.carleton.ca Mon Apr 20 16:06:48 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:36 2004
Subject: SAX: 1998-04-20 Pre-Release, with Road Map and Demos
Message-ID: <199804201402.KAA02839@unready.microstar.com>

There is a new SAX pre-release available through the following URL:

http://www.microstar.com/XML/SAX/New/

Thank you very much to everyone who has taken the time to download, test, and comment on the previous pre-releases.

The zip files referenced on the page mentioned above include a road map of the core SAX classes and interfaces, updated drivers for Lark and MSXML, and well-commented demonstrations of parsing from a system identifier, a byte stream, and a character stream.

The core changes from the 1998-04-18 pre-release are very small, leading me to hope that SAX is stabilising in time for tomorrow evening's deadline:

- added parse(String systemId) convenience method to Parser

- removed single-character read() method from CharacterStream, ReaderAdapter, and CharacterStreamAdapter

- removed single-byte read() method from ByteStream, InputStreamAdapter, and ByteStreamAdapter

I am not including JavaDoc documentation in the distribution, but it is easy to generate if you are interested.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From Kenneth.J.Meltsner at jci.com Mon Apr 20 16:08:44 1998
From: Kenneth.J.Meltsner at jci.com (Meltsner, Kenneth J)
Date: Mon Jun 7 17:00:36 2004
Subject: Composition of DTDs was Inheritance in XML
Message-ID: <862565EC.004D6DEA.00@Corpnotes.JCI.Com>

I think the value of XML in these cases is its "composability" - the ability to develop a new DTD by "adding" together previously defined DTDs. If I want to write the perfect catalog DTD, I don't have to reinvent linking or styles; I can concentrate on modeling the relationships between parts and systems and part numbers instead.

Aggregation/composition is a legitimate alternative to inheritance for many applications, but tends to get used less often in class-instance object systems. In prototype-based OO systems, like the UI toolkits Garnet and Amulet, it's the usual way of building new objects or of specializing the behavior of old ones.

The namespace spec goes a long way to making this more possible.

Ken Meltsner

-----Original Message-----
From: Peter Murray-Rust

[...] This is actually a good example of where I think XML has a lot to offer. In designing complex systems it is a good idea to re-use well-tested components where possible. If, for example, your power station relies on mathematics, physics, chemistry, safety protocols, etc. then it will make sense to re-use those developed in a community-wide fashion. In an XML document it is straightforward to detect the namespaces used and to separate the components. I find it much easier to extract the separate information components from an XML file than (say) an RTF document.
From amarshal at usc.edu Mon Apr 20 16:45:15 1998 From: amarshal at usc.edu (Andrew n marshall) Date: Mon Jun 7 17:00:36 2004 Subject: SAX: New Idea for Entity Resolution In-Reply-To: <199804191139.HAA00318@unready.microstar.com> Message-ID: <000501bd6c6b$db200090$1de37d80@philica> IBM's XML for Java has an interface called StreamProducer that is very similar to your SAXEntityResolver. If you look at the new version Kent Tamura released last Friday, you'll notice that he also added the closeStream(InputStream) method to the StreamProducer interface. While this seems strange at first, when I recompiled my code implementing the new method, I realized why this was necessary: InputStream.close() throws an exception and this is a simple way of adding a means to catch, and deal with, the error. Perhaps SAX should have something similar. Maybe I'm thinking in terms that are too Java-dependent, but the idea makes sense to me. Andrew n marshall student - artist - programmer http://www.media-electronica.com/anm-bin/anm "Everyone a mentor, Everyone a pupil" xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 20 16:48:39 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:36 2004 Subject: SAX: Distributed Implementations In-Reply-To: <353B58DE.765496EB@eng.sun.com> References: <199804191453.HAA03807@argon.eng.sun.com> <199804191921.PAA00277@unready.microstar.com> <353B58DE.765496EB@eng.sun.com> Message-ID: <199804201444.KAA03195@unready.microstar.com> David Brownell writes: > The normal approach is to say that issues like I/O streams are done > the usual way for those platforms. While I've seen some approaches > that assume all platforms do I/O the same way, they weren't widely > accepted since programmers already "know" how to do I/O and don't > really want new APIs that do the standard things in different ways. > The value of the common framework isn't for the stuff that already > exists (I/O), but for the new stuff (in this case, XML parser > callbacks). I do not understand, though, how this would allow a SAX implementation to be distributed across several platforms. Imagine this: - there is a SAX Parser object implemented in C++ on host A - there is a SAX application implemented in Java on host B If the application wants to provide a character stream to the parser (a very typical case), how can it do so if host A and host B have implemented character streams differently? 
The only solution that I can imagine is for one of the two to have special knowledge of the other's implementation language, and to provide a special adapter class to translate from standard Java I/O to standard C++ I/O; what would happen, then, if we added host C with a Python implementation, host D with a Perl implementation, and host E with an ECMAScript implementation? Would every host have to have an adapter for every other host's implementation language? There may be an obvious solution to this problem -- as I've mentioned before, I'm very new to CORBA in particular and to distributed computing in general -- so I'm very grateful for comments from people with experience in this area. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 20 17:05:58 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:36 2004 Subject: SAX: close() method for streams In-Reply-To: <000501bd6c6b$db200090$1de37d80@philica> References: <199804191139.HAA00318@unready.microstar.com> <000501bd6c6b$db200090$1de37d80@philica> Message-ID: <199804201503.LAA03359@unready.microstar.com> Andrew n marshall writes: > IBM's XML for Java has an interface called StreamProducer that is > very similar to your SAXEntityResolver. If you look at the new > version Kent Tamura released last Friday, you'll notice that he > also added the closeStream(InputStream) method to the > StreamProducer interface. 
While this seems strange at first, when > I recompiled my code implementing the new method, I realized why > this was necessary: InputStream.close() throws an exception and > this is a simple way of adding a means to catch, and deal with, the > error. > > Perhaps SAX should have something similar. Maybe I'm thinking in terms that > are too Java-dependent, but the idea makes sense to me. There is still time left to add this if we need it. I had assumed that the application would be responsible both for opening and for closing the stream -- is there any reason to provide a way for the parser to signal that it's not trying to read any more bytes or characters, when the parser will try to keep reading until EOF or an error anyway? What does everyone else think? Thanks, and all the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ From papresco at technologist.com Mon Apr 20 17:26:54 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:36 2004 Subject: Inheritance in XML References: <199804181718.MAA24752@ACADCOMP.SIL.ORG> Message-ID: <353B689F.7032A393@technologist.com> Robin Cover wrote:
>
>   OODB          SGML/XML Markup
>
>   class defn    element declaration
>   class name    element type
>   object        element
>   attribute     attribute
>
> If we accept this crude analogy, and accept SGML's notion of an > "attribute" as a name-value pair, then the hope of creating subclasses > through SGML/XML element declarations appears slim. I don't think that the problem is with SGML/XML element type declarations.
I think that it is with trying to import too literally OO features. The most important thing about an object is its set of "methods" or "slots". These define its interface. The most important thing about an XML element is its content model, or, more generally, the language it defines (content model+attributes). But languages and methods are very different. If we made XML's attributes "richer", we could have attributes that are more like properties. But the content model problem would remain unless we removed content models altogether. OOP works because they figured out a smart way of defining interfaces (sets of methods) and sub-interfaces (subsets of methods). We must do the same for languages. The problem is easy if we strictly require subtypes to define sublanguages (i.e. merely restricted content models). That would occasionally be useful: But more often we want not just a strict sublanguage, but a language that can be *transformed into* a sublanguage. For example: To me, this is much more interesting and useful, but also harder to figure out, especially when we use the full power of content models. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Journalism is good if you follow the rules. Don't allow the human rights groups to spoil your profession" - Col. Godwin Ugbo of the Nigerian military dictatorship xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Mon Apr 20 18:04:40 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:37 2004 Subject: GedML: Genealogical Data in XML Message-ID: <005101bd6c76$0ba5d500$1e09e391@mhklaptop.bra01.icl.co.uk> Announcing GedML I have put together a proposal and some simple software for handling genealogical data in XML. This takes the data model of the well-established GEDCOM standard and represents it with the encoding syntax of XML. The immediate benefit, I hope, is that it becomes much easier to write applications that process the data, because you don't have to worry about the parsing, character encoding, etc. Details on http://home.iclweb.com/icl2/mhkay/gedml.html All comments welcome, especially (from this group) on the DTD design. But bear in mind that 100% object model compatibility with GEDCOM was a key objective. "Compatibility means deliberately repeating other people's mistakes". Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthew at praxis.cz Mon Apr 20 19:42:37 1998 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:00:37 2004 Subject: Inheritance in XML Message-ID: <01bd6c80$56217990$020b0ac0@xerius> Paul, Let me try to explain at least what I am envisioning as far as the putative inheritance mechanism which I described is concerned. 
I am going to get myself into trouble by saying this, but SGML was an attempt to avoid doing 5% of what is necessary (i.e. to do everything), and this led 10 years down the line to the creation of XML. The same applies to HyTime. This doesn't mean in any way, shape or form that either SGML or HyTime is not a wonderful thing. They are absolutely amazing. But in many, many instances what is needed is a simpler language/mechanism/etc. which only does 5% (I'd rather bump this up to at least 50% personally...) but which is more approachable. Isn't this what XML is all about? A very uncharitable view (which I would certainly not endorse :-) would be to say that you are setting the bar very high, and then rejecting ideas which at least one person in this list thinks would be of great value because they don't reach the level of functionality that you are imposing. I don't get your point entirely with regard to subtyping and inheritance. Let me explain how I understand the conventional OO usage of these terms. Inheritance means that some thing gets some of its attributes/methods from a base thing. Of course this doesn't necessarily make the derived thing a subtype of the base thing: it may just inherit data members and not interfaces. Subtyping means that some thing can be treated as another thing in some cases because it shares at least one of that thing's interfaces. So what I was describing is definitely inheritance and should reasonably be considered subtyping as well (although we aren't actually talking about interfaces). It seems to me that your objection to this boils down to the fact that you also want to subtype without inheritance. I am approaching the problem from a different standpoint. I appreciate your reaction to the extent that you and many other people who know far more about these issues than I do have given years of thought to them and not come up with an entirely satisfactory solution.
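The distinction drawn above between inheritance and subtyping can be sketched in Java. This is only an illustration and not something proposed in the thread: the names Titled, Invoice, TextBuffer and Report are hypothetical. Invoice is a subtype of Titled without inheriting any code, while Report reuses TextBuffer by delegation without becoming a subtype of it.

```java
import java.util.Arrays;
import java.util.List;

public class SubtypingVsInheritance {

    /** A type is described by its interface (hypothetical name). */
    interface Titled {
        String title();
    }

    /** Subtyping without inheritance: no code is taken from a base class. */
    static final class Invoice implements Titled {
        @Override
        public String title() {
            return "Invoice";
        }
    }

    /** A reusable helper that is not Titled (hypothetical name). */
    static final class TextBuffer {
        private final StringBuilder text = new StringBuilder();

        void append(String s) {
            text.append(s);
        }

        String contents() {
            return text.toString();
        }
    }

    /** Reuse without subtyping: Report delegates to TextBuffer but is not one. */
    static final class Report implements Titled {
        private final TextBuffer body = new TextBuffer();

        void addLine(String line) {
            body.append(line);
            body.append("\n");
        }

        String body() {
            return body.contents();
        }

        @Override
        public String title() {
            return "Report";
        }
    }

    /** Works on any subtype of Titled, however that subtype was built. */
    static String tableOfContents(List<? extends Titled> items) {
        StringBuilder toc = new StringBuilder();
        for (Titled item : items) {
            toc.append(item.title()).append('\n');
        }
        return toc.toString();
    }

    public static void main(String[] args) {
        Report report = new Report();
        report.addLine("first line");
        System.out.print(tableOfContents(Arrays.asList(new Invoice(), report)));
    }
}
```

A processor written against Titled keeps working when new implementations appear, which is the property the thread is asking of a schema mechanism; how to get the same guarantee for content models is the open question being argued here.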
I started thinking about how to subtype content models (through inheritance) in an entirely flexible way and gave up almost immediately. It's a hard, hard problem. On the other hand, C++ has gotten away with providing subtyping only through inheritance (you mentioned that it can subtype without inheritance, but I can't figure out how - please enlighten), and it's still a pretty useful little language. We have the advantage now of being at a new frontier: XML. There aren't many standard XML DTDs to speak of, and certainly none that are built to exploit subtyping through inheritance. However, if such a mechanism existed (and as I say, it isn't rocket science), I truly believe that it would be quite feasible to design small "component DTDs" which could be usefully extended without needing to map element types or get into the guts of the content model. What I was implying in my example about CML was 3 things: 1) The processor has access to the base DTD. 2) The processor has access to the derived DTD. 3) The processor knows about the inheritance (which is also a subtyping :-) mechanism being used. This would enable it to get at the content of the base element type without knowing what to do with the content of the derived element type. This can't be done with cut and paste. There is some scope for ambiguity here, but I can't think of any examples that do anything really useful, so they could just be forbidden (e.g. sticking a (foo*) in front of a content model that starts with its own (foo*)). In your example, you would need to extend the processor to deal with images in titles, but at least it wouldn't break older processors, which would still display the text of the title. So we really are talking about two different things. HyTime does a great job with things like mapping element type names. It isn't going to die or go away, and companies like Boeing and Bombardier who need that kind of functionality and can afford to invest in it and climb the learning curve are going to choose to use it.
All that I'm saying is that, analogous to the way that XML tries to broaden the market for a lot of the great ideas in SGML by simplifying it, we need a simple inheritance mechanism (that implements subtyping) to be used with XML. Once again, this only makes sense if DTDs are designed to take advantage of this mechanism and if there is some central body for gathering these DTDs and their associated documentation and ensuring that overlap doesn't occur. All I want to be able to do is scoot over to the DTD repository site, check for a standard DTD for invoices, grab it, extend it with the two or three extra attributes and/or contained element types that I need and use it, while still being able to use any tools that are designed to work with the original invoice DTD. I truly believe that this is where XML will really start to fulfill its promise. But then I may be crazy... Matthew -----Original Message----- From: Paul Prescod To: xml-dev@ic.ac.uk Date: Monday, April 20, 1998 3:33 PM Subject: Re: Inheritance in XML >Matthew Gertner wrote: >> >> * Terminology * >> >> I personally don't agree that there are carved-in-stone, well-understood >> definitions for terms like "inheritance" and "subtyping" in XML. > >I don't think that anyone claimed that there is a well-understood >definition for "inheritance" in any context -- even OO. But to be >consistent with English, it must have something to do with "getting >something for free." In the XML context the most obvious thing would be >declarations. > >Subtyping is different. Subtyping comes straight from mathematics and is >as old as logic (at least). A type defines a set of objects. A subtype >describes a subset of those objects. Simple and precise. > >> Is >> "subtyping" a better term? No, because it doesn't have the same resonance as >> the word "inheritance" among non-programmer types. > >I don't know why you think that.
Non-programmer types are likely to balk >at either word, but at least subtyping is shorter, and can be precisely >defined. Anyhow, it is not at all like the words are interchangeable. You >can't pick and choose from words that already have meanings. > >> I'll make a first attempt: >> "Inheritance in XML refers to the process of creating new element types that >> duplicate the content model and attribute list of existing element types (in >> the same or a separate "base" DTD), while extending these to include >> additional attributes and/or content. As such, instances of the new element >> types can be used wherever the base element type can be used, and can be >> processed polymorphically by any external processor which knows about the >> base element type." > >ACK! This definition was proven inadequate in the OO software world >around a decade ago. Both C++ and Java allow subtyping without >inheritance, and C++, Sather and Eiffel allow inheritance without >subtyping (I suppose to get that in Java, you would have to use >delegation). If we are going to borrow ideas from OO, then we should at >least use the updated, modern ideas, not those that were accidentally >confused in Simula 67 (and have been confused in programmers' minds ever >since). > >The first major problem with your definition actually has nothing to do >with the inheritance/subtyping conundrum. The biggest problem is that if >you "extend" a content model, you are making a more flexible language, >which *cannot* be processed polymorphically by an external processor >which knows nothing about the base element type: > > > > >Now imagine software that generates a TOC from titles, presuming them to >be strictly textual. What does it do with images in titles? > >Now let's talk about inheritance and subtyping. This is not a merely >theoretical issue. It has important practical implications. The most >interesting, important application of subtyping is allowing divergent >evolution of compatible schemas.
This is why architectural forms were >invented. But for this to work, subtyping *must* be unhitched from >inheritance. > >Suppose that Boeing has a content model: > > > >Bombardier has a similar model (after all, they are modelling the same >thing): > > > >How does inheritance help me to unify these models and validate that >they are actually isomorphic? It doesn't. This is a job for subtyping. I >can also come up with examples where inheritance is more useful without >subtyping but you can always achieve this through other means (which is >why Java does not support it). > >Inheritance is a code reuse mechanism, so you can always emulate it with >cut and paste (or parameter entities, or, in a programming language, with >delegation). Subtyping is a type system extension. It is completely >different. > >I can inherit stuff from my dad without becoming a dad. I can choose to >be a dad without inheriting anything either from my dad, or the "class >dad". They are different things. > >> * DTDs and schemata * >> >> Francois Chahuneau's article makes a very effective argument for why we need >> to extend or replace DTD syntax (thanks Robin). XML-Data is a reasonable >> attempt to do so, but it is understandably controversial because it is such >> a radical departure from the existing syntax. > >I think that XML-Data should be controversial because from my reading it >is just a mix and match combination of interesting features that people >want in schemas without a coherent theory of how they should fit >together. You can't just put 10 smart people into a working group and >have them throw in their good ideas and expect a coherent result. >XML-Data's inheritance mechanism does not take advantage of XML's nature >as a sequence-oriented language for encoding documents. In other words, >it doesn't solve the fundamental problem.
> >> I quite like the idea of an >> alternate, XML-based schema syntax, but the real lesson of XML-Data is that >> creating an effective inheritance mechanism isn't rocket science. All that >> is really needed is a keyword that says "this element type is derived from >> that element type". Something like: >> >> >Sure. This isn't rocket science. But it doesn't solve the fundamental >problem at all. You haven't defined what happens to "BARK" sub-elements >in "DOG". Without that definition, any software dealing with animals >will croak on dogs. Which is exactly what subtyping was supposed to >avoid.... > >> More tricky than any of these technical issues is the question of what, if >> anything, could be done to promote a mechanism of this sort. Obviously this >> would require a change to the XML spec as well as modification to all >> existing tools which process DTDs, so it's a pretty big deal. I wonder if >> anyone besides me thinks that a simple mechanism like this would make sense. >> If so, is there any room in the XML standards process to discuss a change of >> this type at some point in the future (certainly not for XML 1.0)? > >Personally, I have yet to see a decent proposal for inheritance and >subtyping in SGML. Coming up with one is difficult, which is why I've >spent the last year thinking about it. Dan Connolly has also spent >several years thinking about it. I know that there are many others in >the same boat. I think that we agree that it doesn't make sense to adopt >a solution that solves only 5% of the problem, which is why you will see >resistance to anything like that. > >We will know that we have a complete solution to the problem when HTML >6.0 can be described as a subtype of HTML 5.0, and its behaviour in a >"subtype aware" HTML 5.0 browser is predictable and well-defined. >Further, HTML 6.0 must not just extend HTML 5.0 in trivial ways such as >new tags. It must actually have new elements, with new content >models mixed in at all levels.
As I said, inheritance-at-the-end solves >about 5% of this problem. > > Paul Prescod - http://itrc.uwaterloo.ca/~papresco > >"Journalism is good if you follow the rules. Don't allow the human >rights groups to spoil your profession" > - Col. Godwin Ugbo of the Nigerian military dictatorship xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 20 20:56:33 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:37 2004 Subject: SAX: Distributed Implementations Message-ID: <199804201849.OAA05504@unready.microstar.com> David Brownell writes: > I'd probably not split a real application in that particular way, > though. The latency penalty for lots of fine grained syntax > callbacks hurts, and distributed systems are generally designed to > ship bulk data (such as an XML message) and process it locally > (such as parse, interpret, respond to some purchase order in XML > while updating several databases). HTTP is only one of the more > visible examples of that trend. This wouldn't be too much of a problem with a remote character or byte stream, especially since we've removed the single-character and single-byte read(). AElfred, for example, slurps up 32K at a time into its read buffer. By the way, Java is simply the initial implementation for SAX, but it is not intended to be the only one. Ideally, I should have used OMG-IDL from the start, but many more people understand (and can use) Java, so I started there. Several people have offered to write OMG-IDL versions of the interfaces as soon as we're done defining them. 
There is already a Python implementation of an early draft of the SAX interface, and I might take a stab at the C++ version myself. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 20 20:57:39 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:37 2004 Subject: SAX: close() method for streams In-Reply-To: <353B8C43.12267DB@eng.sun.com> References: <199804191139.HAA00318@unready.microstar.com> <000501bd6c6b$db200090$1de37d80@philica> <199804201503.LAA03359@unready.microstar.com> <353B8C43.12267DB@eng.sun.com> Message-ID: <199804201851.OAA05515@unready.microstar.com> David Brownell writes: > My email to the list has _all_ bounced today ... > > Re is this needed, the issue is indeed whether the application can > tell that it's OK to recycle the resource. That'd ideally be a > close method on the stream (the normal pattern). The application will know that the parser is finished with the stream when it receives an endDocument() event; however, there will be cases where the CharacterStream or ByteStream implementation will not be too tightly coupled with the handlers, and there will be others where the application has not implemented the DocumentHandler interface; all-in-all, then, adding close() seems like a good suggestion. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
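The point about endDocument() and closing the stream can be sketched with the SAX API that ships with current JDKs (org.xml.sax and javax.xml.parsers). That choice is an assumption: it is a later API than the 1998 pre-release discussed in this thread, and the CharacterStream and ByteStream interfaces named above are not used. TrackingReader and EndWatcher are hypothetical helper names. In this sketch the application owns the reader and closes it itself once parse() returns, instead of relying on the parser to do so.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class CloseAfterParse {

    /** Hypothetical wrapper that records whether close() has been called. */
    static final class TrackingReader extends FilterReader {
        boolean closed = false;

        TrackingReader(Reader in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Hypothetical handler that notes when the parser reports the end of the document. */
    static final class EndWatcher extends DefaultHandler {
        boolean sawEndDocument = false;

        @Override
        public void endDocument() {
            sawEndDocument = true;
        }
    }

    /** Parses from a character stream; the application closes the stream itself. */
    static void parseAndClose(TrackingReader reader, EndWatcher handler) {
        // try-with-resources closes the reader whether or not parsing succeeds
        try (TrackingReader r = reader) {
            SAXParserFactory.newInstance().newSAXParser().parse(new InputSource(r), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        TrackingReader reader = new TrackingReader(new StringReader("<doc><a>text</a></doc>"));
        EndWatcher handler = new EndWatcher();
        parseAndClose(reader, handler);
        System.out.println("endDocument seen: " + handler.sawEndDocument);
        System.out.println("stream closed: " + reader.closed);
    }
}
```

Keeping the close in the code that opened the stream covers the loosely coupled case mentioned above, where the stream implementation never sees the handler's events.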
From Bryan_Gilbert at pml.com Mon Apr 20 21:02:21 1998 From: Bryan_Gilbert at pml.com (Bryan Gilbert) Date: Mon Jun 7 17:00:37 2004 Subject: Abbreviated end tags found in a XML file Message-ID: At this site http://www.microsoft.com/msdn/sdk/inetsdk/samples/databind/composer.xml you will find an XML file that contains content like this: Hector Berlioz 1803 1869 France Renowned French composer known for Symphonie Fantastique written in 1830. Notice that end tags do NOT contain the element name. From my reading of the XML spec this file is not well formed. But is it acceptable? Bryan Gilbert, B.Sc. M.Sc. Systems Engineer, ION Modules Team, Power Measurement Ltd 2195 Keating Cross Road, Saanichton, BC, Can, V8M 2A5 Phone: (250) 652-7100 ext 7570 Fax: (250) 652-0411 email: (mailto:bryan_gilbert@pml.com) WWW: (http://www.pml.com) From tbray at textuality.com Mon Apr 20 21:16:16 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:37 2004 Subject: Abbreviated end tags found in a XML file Message-ID: <3.0.32.19980420121459.009cdb40@pop.intergate.bc.ca> At 12:01 PM 4/20/98 -0700, Bryan Gilbert wrote: >At this site >http://www.microsoft.com/msdn/sdk/inetsdk/samples/databind/composer.xml >you will find an XML file that contains content like this: > ...
>Hector ... >Notice that end tags do NOT contain the element name. >From my reading of the XML spec this file is not well formed. You are correct >But is it acceptable? No. Microsoft will doubtless fix this. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From David.Brownell at Eng.Sun.COM Mon Apr 20 21:24:12 1998 From: David.Brownell at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:00:37 2004 Subject: SAX: Distributed Implementations Message-ID: <199804201922.MAA04377@argon.eng.sun.com> > > I'd probably not split a real application in that particular way, > > though. The latency penalty for lots of fine grained syntax > > callbacks hurts, and distributed systems are generally designed to > > ship bulk data (such as an XML message) and process it locally > > (such as parse, interpret, respond to some purchase order in XML > > while updating several databases). HTTP is only one of the more > > visible examples of that trend. > > This wouldn't be too much of a problem with a remote character or byte > stream, especially since we've removed the single-character and > single-byte read(). AElfred, for example, slurps up 32K at a time > into its read buffer. Sure -- but with each slurp of 32Kb, it can be doing thousands of syntax callbacks. That easily adds up to seconds of overhead, even assuming an idle network (no contention). A few years ago I used 200 calls per second as a standard OO RPC speed estimate; it can be faster, but it can be slower too. And "faster" is not an order of magnitude faster. Those callbacks were the worrisome part of your scenario ... 
:-) > By the way, Java is simply the initial implementation for SAX, but it > is not intended to be the only one. I understand this. But since the master spec isn't in something like IDL, you're already committing to language-specific translations and customizations ... how do you decide which things should be custom, which shouldn't be? You're not making the tradeoffs I'm used to seeing when people design systems to use in multiple languages. - Dave From SimonStL at classic.msn.com Mon Apr 20 21:37:08 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:00:37 2004 Subject: Abbreviated end tags found in a XML file Message-ID: Bryan Gilbert found these: Hector Berlioz 1803 1869 This is definitely _not_ acceptable. I suspect the option for allowing this notation still exists in MSXML, but it isn't XML. The complete discussion of the MSXML is available at http://www.lists.ic.ac.uk/hypermail/xml-dev/9711/index.html. When this topic last arose, Chris Lovett seemed to have promised the end of these malformed XML files with: >Whoops - this was an oversight. Turns out this XML is generated dynamically >via JavaScript and I forgot to update this script. This will be fixed >tonight. There must be a few still floating around. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers.
From simpson at polaris.net Mon Apr 20 21:40:47 1998 From: simpson at polaris.net (John E. Simpson) Date: Mon Jun 7 17:00:37 2004 Subject: Abbreviated end tags found in a XML file Message-ID: <3.0.32.19980420153923.0069e108@polaris.net> At 12:01 PM 4/20/98 -0700, Bryan Gilbert wrote: >At this site >http://www.microsoft.com/msdn/sdk/inetsdk/samples/databind/composer.xml >you will find a XML file that contains content like this: >Hector >Notice that end tags do NOT contain the element name. >From my reading of the XML spec this file is not well formed. > >But is it acceptable? If it's not at least well-formed then no, it's not acceptable. :) I believe those examples resulted from an abortive experiment on Microsoft's part to super-minimize the closing tags, but they have since come back to the One True path. The end tag of the tag should therefore be . The example (and others like it) are about 3-4 months out of date. xml-dev: A list for W3C XML Developers.
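Whether a fragment with abbreviated end tags is well-formed can also be checked mechanically. The following is a minimal sketch, assuming the JAXP SAX parser bundled with a current JDK rather than any parser named in this thread. The element names name and first are hypothetical stand-ins, since the markup of the original sample did not survive in this archive.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EndTagCheck {

    /** Returns true if the parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // DefaultHandler rethrows fatal errors, so a well-formedness violation lands here
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser could not be set up or input not read", e);
        }
    }

    public static void main(String[] args) {
        // End tags that repeat the element name
        System.out.println(isWellFormed("<name><first>Hector</first></name>"));
        // Abbreviated end tag with no element name
        System.out.println(isWellFormed("<name><first>Hector</></name>"));
    }
}
```

A conforming parser should accept the first document and reject the second, because an end tag must carry the name of the element it closes.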
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pelegri at Eng.Sun.COM Mon Apr 20 21:59:52 1998 From: pelegri at Eng.Sun.COM (Eduardo Pelegri-Llopart) Date: Mon Jun 7 17:00:37 2004 Subject: SAX: Distributed Implementations In-Reply-To: <199804201849.OAA05504@unready.microstar.com> References: <199804201849.OAA05504@unready.microstar.com> Message-ID: <199804201958.MAA12169@calterra.eng.sun.com> I agree with Dave Brownell about the distribution issues. The only advantage I see for having a language-neutral SAX is having a parser, say written in C, called from a client written, say in Java, all in the same machine (and probably the same process). There are not that many pairs of interest (to me, I'm a Java programmer), so I'd *much rather* put the burden on the implementor of the parser, or of the translation layer, than on the client. I.e. I'd *much rather* have a SAX/Java that mix best with Java than a language-neutral SAX. Of course, it depends on what your goals are. - eduardo xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Mon Apr 20 22:26:27 1998 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:00:38 2004 Subject: Abbreviated end tags found in a XML file Message-ID: <5BF896CAFE8DD111812400805F1991F701C9104F@red-msg-08.dns.microsoft.com> And while we are at it, can we forbid "COMPSR"? 
:-) > -----Original Message----- > From: Simon St.Laurent [SMTP:SimonStL@classic.msn.com] > Sent: Monday, April 20, 1998 12:37 PM > To: Bryan Gilbert; Xml-Dev (E-mail) > Subject: RE: Abbreviated end tags found in a XML file > > Bryan Gilbert found these: > > Hector > Berlioz > 1803 > 1869 > > This is definitely _not_ acceptable. I suspect the option for allowing > this > notation still exists in MSXML, but it isn't XML. > > The complete discussion of the MSXML is available at > http://www.lists.ic.ac.uk/hypermail/xml-dev/9711/index.html. > > When this topic last arose, Chris Lovett seemed to have promised the end > of > these misformed XML files with: > > >Woops - this as an oversight. Turns out this XML is generated dynamically > >via JavaScript and I forgot to update this script. This will be fixed > >tonight. > > There must be a few still floating around. > > Simon St.Laurent > Dynamic HTML: A Primer / XML: A Primer / Cookies > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Tue Apr 21 03:56:48 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:38 2004 Subject: Inheritance in XML References: <01bd6c80$56217990$020b0ac0@xerius> Message-ID: <353BFCE5.D501C4AB@technologist.com> Matthew Gertner wrote: > > Paul, > > Let me try to explain at least what I am envisioning as far as the putative > inheritance mechanism which I described is concerned. I am going to get > myself into trouble by saying this, but SGML was an attempt to avoid doing > 5% of what is necessary (i.e. to do everything), and this led 10 years down > the line to the creation of XML. XML is supposed to do the most common 80% of what SGML does, not 5%, fairly randomly chosen. > A very uncharitable view (which I would certainly not endorse :-) would be > to say that you are setting the bar very high, and then rejecting ideas > which at least one person in this list thinks would be of great value > because they don't reach the level of functionality that you are imposing. In my mind, I'm setting the bar at "solving a significant proportion of problems of the same type." Adding 5% solutions incrementally is *exactly* how we get back to where SGML is today. Many of SGML's less used features seemed like they would be really useful, but turned out not to be general enough to solve the problems people thought that they would solve. XML should be the resting place of features that are well understood and that everybody uses or that it is obvious that everyone *will* use, because they solve so many problems at once that they can't fail to be useful. 
> It seems to me that your objection to this boils down to the fact that you > also want to subtype without inheritance. And occasionally inherit without subtyping. > It's a hard, hard problem. On the other hand, C++ has gotten away with > providing subtyping only through inheritance (you mentioned that it can but > I can't figure out how - please enlighten), and it's still a pretty useful > little language. C++ provides subtyping without inheritance through abstract base classes and pure virtual methods. I prefer Java's syntactically distinct interfaces feature, but the C++ way works. > What I was implying in my example about CML were 3 things: 1) The processor > has access to the base DTD. 2) The processor has access to the derived DTD. > 3) The processor knows about the inheritance (which is also a subtyping :-) > mechanism being used. This would enable it to get at the content of the base > element type without knowing what to do with the content of the derived > element type. I wouldn't quite say that. The processor must know what to do with the new nodes. It must either know to ignore the "extra" content of the derived element type, or it must know to ignore the tags and process the content, or halt and catch fire, or do something else. You are arguing that it should always use the "ignore content" model, but we know from HTML that this is often not appropriate. > All I want is to be > able to do is scoot over to the DTD repository site, check for a standard > DTD for invoices, grab it, extend it with the two or three extra attributes > and/or contained element types that I need and use it, while still being > able to use any tools that are designed to work with the original invoice > DTD. I truly believe that this is where XML will really start to fulfill its > promise. My assertion is that even in a document as simple as an invoice you are going to come up against the limitations of this feature. 
The base DTD will have a content model like:

(PARTNAME,PRICE)+

and you will need to change that to:

(DATE,DESCRIPTION,HOURS,AMOUNT)+

because that is how you do invoicing at your company.

I assert that this sort of case is much more common than the simple case
where all you want to do is prefix or suffix an element and have that
content be ignored. This is especially the case when we move away from
invoices and into more flexible document types.

 Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Journalism is good if you follow the rules. Don't allow the human
rights groups to spoil your profession"
 - Col. Godwin Ugbo of the Nigerian military dictatorship

From pierlou at CAM.ORG Tue Apr 21 07:36:27 1998
From: pierlou at CAM.ORG (Pierre Morel)
Date: Mon Jun 7 17:00:38 2004
Subject: ANN: XML/DTD Editor
Message-ID: <01bd6ce6$2f303bd0$01dcdcdc@pc-010>

A Java XML/DTD editor is available for download at:
http://www.pierlou.com/visxml

Create a DTD through a user interface.
Build XML files that respect the DTD structure.
View the DTD source and the XML source while editing.
Sort notations, entities, elements or attributes.
Cut, copy, paste, delete and find items.
Internationalization support (English and French versions available).
Visual XML uses an XML file to define its own screens.
It uses IBM Java Installer 1.0 for installation.

Pierre Morel
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthew at praxis.cz Tue Apr 21 13:40:05 1998 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:00:38 2004 Subject: Inheritance in XML Message-ID: <01bd6cf9$d5a02260$020b0ac0@xerius> Paul, Thanks for your reply. I guess by now we have both thoroughly digested each other's point of view. I'll just make a couple of additional comments: >XML is supposed to do the most common 80% of what SGML does, not 5%, >fairly randomly chosen. 5% was your figure... >In my mind, I'm setting the bar at "solving a significant proportion of >problems of the same type." Adding 5% solutions incrementally is >*exactly* how we get back to where SGML is today. Many of SGML's less >used features seemed like they would be really useful, but turned out >not to be general enough to solve the problems people thought that they >would solve. XML should be the resting place of features that are well >understood and that everybody uses or that it is obvious that everyone >*will* use, because they solve so many problems at once that they can't >fail to be useful. This is fair enough. In the end it all boils down to how useful you think a given mechanism will be, since hindsight will only be available down the road. So we move from a technical to a quasi-religious type of discussion. >C++ provides subtyping without inheritance through abstract base classes >and pure virtual methods. I prefer Java's syntactically distinct >interfaces feature, but the C++ way works. ABCs can have public class members... Anyway, the Java approach is certainly better. >I wouldn't quite say that. The processor must know what to do with the >new nodes. 
It must either know to ignore the "extra" content of the >derived element type, or it must know to ignore the tags and process the >content, or halt and catch fire, or do something else. You are arguing >that it should always use the "ignore content" model, but we know from >HTML that this is often not appropriate. This is where the whole discussion started. In OO programming, you have to design your derived classes so as not to break the base class's behavior. >My assertion is that even in a document as simple as an invoice you are >going to come up against the limitations of this feature. The base DTD >will have a content model like: > >(PARTNAME,PRICE)+ > >and you will need to change that to: > >(DATE,DESCRIPTION,HOURS,AMOUNT)+ > >because that is how you do invoicing at your company. > >I assert that this sort of case is much more common than the simple case >where all you want to do is prefix or suffix an element and have that >content be ignored. This is especially the case when we move away from >invoices and into more flexible document types. Okay, I don't agree at all with this assertion, but as I said earlier, it all boils down to a judgement call. Cheers, Matthew xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From m.mower at unl.ac.uk Tue Apr 21 16:21:31 1998 From: m.mower at unl.ac.uk (Matt Mower) Date: Mon Jun 7 17:00:38 2004 Subject: Parser for DTDs Message-ID: <35419ab5.75187578@tara.unl.ac.uk> Hi. I am looking for a java parser that when fed a DTD returns a model representing all the valid documents that can be expressed using that DTD, i.e. 
for any given element type I need to be able to determine the valid
sub-elements, attributes and content. One additional constraint is
that I need to parse the DTD outside of the context of an XML document.

I would ideally like to be able to write code something like:

>DTDParser dp = new DTDParser( dtdInputStream );
>if( dp.isValid() ) {
>    Element[] childElems = dp.getRoot().getChildElements();
>    Attribute[] attrs = childElems[0].getAttributes();
>
>    if( childElems[0].isChildElement( "someelement" ) ) {
>        ...
>    }
>}

I believe Larval may have this kind of capability (I'm not sure yet);
are there others?

Best regards.

Matt/

--
Matt Mower, Information Systems Team, University of North London
T: +44-(0)171-753-3288 F: +44-(0)171-753-5120 E: m.mower@unl.ac.uk

From deke at tallent.com Tue Apr 21 16:23:29 1998
From: deke at tallent.com (Deke Smith)
Date: Mon Jun 7 17:00:38 2004
Subject: Abbreviated end tags found in a XML file
Message-ID: <1318971957-137703662@tallent.com>

Tim Bray, tbray@textuality.com said on 4/20/98 2:15 PM:

>>But is it acceptable?
>
>No.
>
>Microsoft will doubtless fix this. -Tim

Or declare it the standard...

-----------------------------------------------------------------
Deke Smith
Tallent Communications Group, Brentwood TN
deke@tallent.com, 615-661-9878
-----------------------------------------------------------------
From Joerg.Brunsmann at FernUni-Hagen.de Tue Apr 21 17:45:50 1998
From: Joerg.Brunsmann at FernUni-Hagen.de (Joerg Brunsmann)
Date: Mon Jun 7 17:00:39 2004
Subject: SAX, expat and JNI
Message-ID: <353CBF06.4839@fernuni-hagen.de>

Hi,

assuming that interpreted Java byte code is still slower than
compiled C code, it is desirable to use James' expat XML parser in
conjunction with SAX to gain maximum performance while parsing XML
documents. Therefore I suggest using the Java native method
interface (JNI) to invoke expat.

How can this be achieved? First of all, we need a shared library
(on Windows, a DLL) which contains the expat code and, in addition,
these handlers:

---------------------- start of C code ----------------------------
#include "xmlparse.h"  /* expat's public header (expat.h in later releases) */

/* Shared by initParser() and doParse(). */
static XML_Parser parser;

/* Document buffer and its length; how they get filled is not shown here. */
static const char *data;
static int size;

static void characterData(void *userData, const char *s, int len)
{
  /* callback into Java VM here for expatHandler.doCharacterData */
}

static void startElement(void *userData, const char *name,
                         const char **atts)
{
  /* callback into Java VM here for expatHandler.doStartElement */
}

static void endElement(void *userData, const char *name)
{
  /* callback into Java VM here for expatHandler.doEndElement */
}

static void processingInstruction(void *userData, const char *target,
                                  const char *data)
{
  /* callback into Java VM here for expatHandler.doProcessingInstruction */
}

void initParser()
{
  /* NULL: let expat take the encoding from the document itself */
  parser = XML_ParserCreate(NULL);
  XML_SetElementHandler(parser, startElement, endElement);
  XML_SetCharacterDataHandler(parser, characterData);
  XML_SetProcessingInstructionHandler(parser, processingInstruction);
}

void doParse()
{
  /* final argument 1: this buffer is the last (here, only) chunk */
  XML_Parse(parser, data, size, 1);
}
---------------------- end of C code -----------------------------
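
The startElement handler above receives its attributes the way expat
delivers them: a flat array of alternating names and values. On the Java
side that array has to become something keyed by name before it is of
much use. What follows is only a minimal sketch of that conversion, not
part of the original proposal: the class name AttributeArrays and the
method toMap are hypothetical, and it assumes the JNI stub has already
turned the C strings into a Java String[] without the trailing NULL that
terminates expat's array.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: turns expat-style attribute arrays into maps. */
final class AttributeArrays {

    private AttributeArrays() {
    }

    /**
     * Converts {name0, value0, name1, value1, ...} into a map that keeps
     * the attributes in the order in which they were supplied.
     * Assumes the array carries no terminating null entry.
     */
    static Map<String, String> toMap(String[] atts) {
        Map<String, String> map = new LinkedHashMap<>();
        if (atts == null) {
            return map; // element without attributes
        }
        if (atts.length % 2 != 0) {
            throw new IllegalArgumentException(
                "expected name/value pairs, got " + atts.length + " entries");
        }
        for (int i = 0; i < atts.length; i += 2) {
            map.put(atts[i], atts[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        String[] atts = {"id", "42", "lang", "en"};
        System.out.println(toMap(atts));
    }
}
```

A LinkedHashMap is used so that the attributes come back in the order
the parser reported them; a plain HashMap would do if order is of no
interest.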
We can now define the Java class which interfaces to the expat parser
contained in the shared library. The Java code looks like this:

----------------- start of expatHandler --------------------------
package com.microstar.sax;  // same package as the driver below, so it can extend this class

// Assumes the SAX handler interfaces live in org.xml.sax, next to Parser.
import org.xml.sax.DocumentHandler;

public class expatHandler {

  // Set by the driver; receives the events translated from expat.
  protected DocumentHandler documentHandler;

  void doCharacterData(String s, int len) {
    documentHandler.characters(s);
  }

  void doStartElement(String name, String[] atts) {
    // convertAttsToAttributeMap is not defined in this sketch
    documentHandler.startElement(name, convertAttsToAttributeMap(atts));
  }

  void doEndElement(String name) {
    documentHandler.endElement(name);
  }

  void doProcessingInstruction(String target, String data) {
    documentHandler.processingInstruction(target, data);
  }

  native public void doParse();
  native public void initParser();

  static {
    // loads the shared library (DLL) that contains expat and the C handlers
    System.loadLibrary("expat");
  }
}
----------------- end of expatHandler -------------------------

It is then straightforward to extend this expatHandler to declare
the SAX driver for expat:

----------------- start of expat driver ------------------------
package com.microstar.sax;

// Assumes the SAX handler interfaces live in org.xml.sax, next to Parser.
import org.xml.sax.DocumentHandler;
import org.xml.sax.EntityHandler;
import org.xml.sax.ErrorHandler;

/**
 * A SAX driver for James Clark's expat XML parser
 */
public class expatDriver extends expatHandler
  implements org.xml.sax.Parser {

  private EntityHandler entityHandler;
  private ErrorHandler errorHandler;

  public void setEntityHandler (EntityHandler handler)
  {
    this.entityHandler = handler;
  }

  public void setDocumentHandler (DocumentHandler handler)
  {
    this.documentHandler = handler;
  }

  public void setErrorHandler (ErrorHandler handler)
  {
    this.errorHandler = handler;
  }

  public void parse (String publicId, String systemId)
    throws java.lang.Exception
  {
    ...
    initParser();
    documentHandler.startDocument();
    doParse();
    ...
    documentHandler.endDocument();
    ...
  }
}
----------------- end of expat driver -----------------------

Does this make sense? Comments? Who will volunteer? ;-)

Cheers,

Joerg
From francis at redrice.com Tue Apr 21 18:05:33 1998
From: francis at redrice.com (francis)
Date: Mon Jun 7 17:00:39 2004
Subject: Parser for DTDs
References: <35419ab5.75187578@tara.unl.ac.uk>
Message-ID: <353CC284.AAC9B682@redrice.com>

Hi,

Matt Mower wrote:
>
> Hi.
>
> I am looking for a java parser that when fed a DTD returns a model
> representing all the valid documents that can be expressed using that
> DTD, i.e. for any given element type I need to be able to determine what
> are valid sub-elements/attributes/content. One additional constraint is
> that I need to parse the DTD outside of the context of an XML document.
> ...

I can recommend Aelfred from microstar.com - it gives your program
callbacks which get called for DTD events like new element and tag
definitions as well as the expected XML events. There is a nice simple
DtdDemo.java program which you can hack to slap things into a Swing
JTree.

Aelfred parses the DTD from the XML document, however - if you need to
parse the DTD directly then the IBM parser certainly mentions this
feature in its documentation; the Microsoft one also mentions it as
available via a slightly non-recommended direct access to the parser
object. I haven't used either yet...

http://www.alphaWorks.ibm.com/formula/
http://www.microsoft.com/workshop/author/xml/parser/
http://www.microstar.com/xml/

Hmmm - just cut and pasted IBM's command line DTD demo from the
documentation into a DOS box

jre -cp "xml4j.jar" GeneratingSample personal.dtd

and it worked straight off - nice...

Francis.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Apr 21 20:18:37 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:39 2004 Subject: SAX, expat and JNI In-Reply-To: <353CBF06.4839@fernuni-hagen.de> References: <353CBF06.4839@fernuni-hagen.de> Message-ID: <199804211816.OAA00258@unready.microstar.com> Joerg Brunsmann writes: > assuming that interpreted Java byte code is still slower than > compiled C code, it is desirable to use James' expat XML Parser in > conjunction with SAX to gain maximum performance while parsing XML > documents. Surprisingly, this is an assumption that will not always hold true. On a P166 NT box using Microsoft's jview, for example, AElfred can chew through about 1MB of XML each second. I have not tested XP on an NT box, but since it is about 10-20% faster than AElfred on my Linux notebook, it could be even faster on the NT box as well. In other words, unless you're dealing with very large XML documents (say, 10MB or more), you'll probably lose more time loading the DLL and converting data in the stubs than you'll gain in having a slightly faster parser. My experience in the past has been that at least 90% of your application's time will be spent processing the information delivered by the parser; in fact, you would probably do better to leave the XML parser in Java and implement the event handlers in C. That said, SAX is non-language-specific precisely so that people can do the sort of thing that you propose, and I'll be very interested in following your progress. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. 
dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Apr 21 23:02:43 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:39 2004 Subject: Parser for DTDs Message-ID: <3.0.32.19980421135833.006d8c9c@pop.intergate.bc.ca> At 02:06 PM 21/04/98 +0000, Matt Mower wrote: >I am looking for a java parser that when fed a DTD returns a model >representing all the valid documents that can be expressed using that >DTD IBM's XML for Java is designed to support something like what you describe. >I believe Larval may have this kind of capability (i'm not sure yet), >are there others? Larval doesn't have it built-in, but it parses content models into a fairly straightforward set of data structures, and you could certainly generate this kind of thing by writing code that runs around them. But it would be work. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From norbert at datachannel.com Wed Apr 22 01:11:00 1998 From: norbert at datachannel.com (Norbert Mikula) Date: Mon Jun 7 17:00:39 2004 Subject: Some DataChannel/XML news Message-ID: <06f801bd6d7a$9742ff70$bd0a1bac@norbert.datachannel.com> Hi, I would like to inform you about a few new XML pieces that DataChannel has put up/updated on its webserver. 
Here is the news in condensed form:

* DataChannel WebBroker
  Distributed Computing over the Web (XML and http based)
  http://www.datachannel.com/developers/webbroker/index.html

* DataChannel Save to the Web (TM) functionality
  http://www.datachannel.com/rio/ds_entry.html

* DataChannel XML Parser
  DOM support is finally here
  http://www.datachannel.com/products/xdk/DXP/index.html

* DataChannel DOM Builder
  A showcase for using the DOM and DXP
  http://www.datachannel.com/products/xdk/DXP/dom_builder.html

* DataChannel XML Library
  Check out our XML related whitepapers
  http://www.datachannel.com/developers/xml_lib.html

* DataChannel XML Development Kit
  Some of the above, plus much more
  http://www.datachannel.com/products/xdk/index.html

----------------------------------------------
Norbert H. Mikula
Sr. Online Information Architect
Norbert@DataChannel.com
DataChannel, 155 108th Avenue NE Ste 400, Bellevue, WA 98004
Phone: 425.462.1999 Fax: 425.637.1192
http://www.datachannel.com

From donpark at quake.net Wed Apr 22 02:42:39 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:39 2004
Subject: 1998-04-20 Pre-Release, with Road Map and Demos
Message-ID: <004401bd6d86$94d38fa0$2ee044c6@donpark>

David,

SAX, as it currently stands, does not return XML comments. This is just
fine for one-way processing of XML documents. However, I am having a
problem with SAXDOM because of this seemingly irrelevant information
loss.

I will soon be adding DOM output classes to SAXDOM, but documents edited
with SAXDOM will lose all comments. I think this is a significant
weakness of SAX.

Regards,

Don Park
http://www.docuverse.com/personal/index.html
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tony at ems.uq.edu.au Wed Apr 22 06:39:52 1998 From: tony at ems.uq.edu.au (Anthony B. Coates) Date: Mon Jun 7 17:00:40 2004 Subject: Literate Programming and XML Message-ID: <199804220433.OAA19366@bindy.ems.uq.edu.au> Recently, I was asked to put together a seminar on literate programming and XML. For those of you who may not be familiar with literate programming, it is a system that turns code documentation on its head. Instead of 90% code with 10% comments (if you are lucky), literate programming tools allow code chunks to be spread throughout a descriptive document, with the code appearing in the order that best suits the description, rather than the compiler. The literate programming tools then take this combined document and "tangle" it (create the source files) and "weave" it (create the documentation files). In the past, literate programming documents have used various custom markup syntaxes, and produced output most commonly in TeX or LaTeX (Donald Knuth invented both TeX and literate programming, hence the connection). It strikes me that XML would be a better solution, both in terms of unifying the disparate document syntaxes, and in avoiding the replication of effort in creating parsers for each different format. Equally, XSL, XLink, and XPtr all address important issues that any literate programming tool author normally has to solve in code her/himself, so there is the chance to make literate programming tools into lighter applications that sit on top of solid standards. 
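
To make the "tangle" step described above a little more concrete, here
is a minimal sketch in Java. It rests on assumptions that are not part
of the seminar or of any existing literate programming tool: a
hypothetical XML vocabulary in which code lives in <chunk name="...">
elements and one chunk pulls in another with <ref name="..."/>, and a
hypothetical class XmlTangle. Everything outside the chunks is treated
as prose and ignored. The XML parsing itself is left to the standard
DOM API, which is the replication of effort the message suggests XML
would avoid.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical, minimal "tangle" over an assumed chunk/ref vocabulary. */
final class XmlTangle {

    private XmlTangle() {
    }

    /** Returns the text of the chunk called root with all refs expanded. */
    static String tangle(String xml, String root) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse literate document", e);
        }
        // Index every <chunk> by its name attribute, wherever it appears.
        Map<String, Element> chunks = new HashMap<>();
        NodeList found = doc.getElementsByTagName("chunk");
        for (int i = 0; i < found.getLength(); i++) {
            Element chunk = (Element) found.item(i);
            chunks.put(chunk.getAttribute("name"), chunk);
        }
        StringBuilder out = new StringBuilder();
        expand(root, chunks, out);
        return out.toString();
    }

    // Copies text through and recurses into <ref> elements.
    // Note: no cycle detection; a chunk that refers to itself would recurse forever.
    private static void expand(String name, Map<String, Element> chunks,
                               StringBuilder out) {
        Element chunk = chunks.get(name);
        if (chunk == null) {
            throw new IllegalArgumentException("undefined chunk: " + name);
        }
        NodeList kids = chunk.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            short type = kid.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                out.append(kid.getNodeValue());
            } else if (type == Node.ELEMENT_NODE
                    && kid.getNodeName().equals("ref")) {
                expand(((Element) kid).getAttribute("name"), chunks, out);
            }
        }
    }

    public static void main(String[] args) {
        String xml =
              "<doc><p>Prose describing the program.</p>"
            + "<chunk name='main'>int main(void) { <ref name='body'/> }</chunk>"
            + "<p>More prose, in whatever order suits the reader.</p>"
            + "<chunk name='body'>return 0;</chunk></doc>";
        System.out.println(tangle(xml, "main"));
    }
}
```

The chunks may appear in the document in any order, since they are
indexed by name before expansion starts; that is the property that lets
the code follow the description rather than the compiler. A "weave"
step would walk the same document and format prose and chunks for
reading instead.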
If you are interested, the seminar is available via the Web at I would like to try and organise a group of literate programming and XML people to see if we can't work something out. The success of this list in creating the SAX specification makes me believe that the same process could make literate programming accessible and useful to a much broader audience. If you are interested, please e-mail me: Cheers, Tony. -- Educational Multimedia Services = reduced workloads for lecturers, teachers, and tutors = better results for students. -- Another 100% Pure Java e-mail. Is yours? -- Anthony B. Coates. Multimedia Developer (Software Design) Educational Multimedia Services TEDI, The University of Queensland. AJUG-QLD Steering Committee Member (Australian Java Users' Group ). QMUG Committee Member (Queensland Macromedia Users' Group ). E-mail: tony@ems.uq.edu.au WWW: http://www.ems.uq.edu.au/People/Tony/ UIN: 5191015 -- All opinions are mine, and may not represent those of The University of Queensland. -- xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Apr 22 11:56:22 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:40 2004 Subject: name case-folding In-Reply-To: <3.0.32.19980419142228.00b10590@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19980421220254.454f575a@pop3.demon.co.uk> At 14:22 19/04/98 -0700, Tim Bray wrote: >At 08:42 PM 4/19/98 +0100, Steve Robertson wrote: >>If I remember correctly, the XML specification states that the processor >>should fold names to uppercase characters for the purpose of attribute and >>element name comparisons. 
>> >>Do I have this right, or is the case of name characters significant? > >You do not. With a few rare exceptions (e.g. language tags) all >names in XML are case-sensitive. My annotated spec >(http://www.xml.com/axml/axml.html) has some verbiage as to why >this is so, for those around here fortunate enough not to have lived >through that debate. -T. However, there is a potential for case-insensitive comparison in the use of Xpointers (DRAFT: WD-xptr.html, section 3.3.4) where case can be folded under certain circumstances. Section A.1 suggests that this issue is not completely finalised. Thus, whilst the attribute values in an XML document are case-sensitive, it is envisaged (**at this stage of the DRAFT**) that people using Xpointers might legitimately search for case-insensitive values. I suspect this may cause occasional confusion and probably needs highlighting. [I have had my chance to comment on this :-)]. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Wed Apr 22 13:44:50 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:40 2004 Subject: 1998-04-20 Pre-Release, with Road Map and Demos In-Reply-To: <004401bd6d86$94d38fa0$2ee044c6@donpark> References: <004401bd6d86$94d38fa0$2ee044c6@donpark> Message-ID: <199804221142.HAA00458@unready.microstar.com> Don Park writes: > SAX, as it currently stands, does not return XML comments. This is > just fine for one way processing of XML documents. 
However, I am > having a problem with SAXDOM because of this seemingly irrelavant > information loss. > > I will soon be adding DOM output classes to SAXDOM but documents > edited with SAXDOM will lose all comments. I think this is a > significant weakness of SAX. Comments and CDATA sections will certainly be near the top of the list for a level-two SAX, if people decide that we need such a beastie. In the mean time, please remember that the DOM models everything that _can_ be represented, not everything that _must_ be represented. There will probably end up being a core set of information that every DOM builder must provide, but I'd expect that much of it will be optional. A DOM builder for a simple processing tool won't want to waste nodes for information that it doesn't need (such as internal entity boundaries). I think that the chair of the DOM WG, Lauren Wood, reads this list; perhaps she can comment or correct any mistake that I might have made here. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Apr 22 13:56:49 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:40 2004 Subject: Literate Programming and XML References: <199804220433.OAA19366@bindy.ems.uq.edu.au> Message-ID: <353DDB0B.2710D91C@technologist.com> I just wanted to note that people have been doing literate programming in SGML for quite a while. 
You should probably try to build on the existing knowledge rather than starting from scratch: http://www.sil.org/sgml/search?kw=literate+programming&fh=1&mh=50 Michael Sperberg-McQueen, has been thinking about this since at least 1993: http://www.sil.org/sgml/sgml93.html Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jbolles at homeaccount.com Wed Apr 22 18:44:29 1998 From: jbolles at homeaccount.com (Jack Bolles) Date: Mon Jun 7 17:00:40 2004 Subject: XML Editors Message-ID: <353E1F70.92C39520@homeaccount.com> Can anyone recommend an html editor that is XML-savvy, WYSIWYG, easy to use and robust. We are beginning to develop server driven html pages and I hope to change that to XML pages once an OFX flavor of XML exists. Not having to change tools would be a boon. WYSIWYG because I have to do a lot of prototyping. Regards, Jack xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Wed Apr 22 21:12:56 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:00:40 2004 Subject: Nesting XML based languages and scripting languages In-Reply-To: (message from Michael Amster on Fri, 17 Apr 1998 19:30:03 -0700) Message-ID: <199804221908.MAA19130@boethius.eng.sun.com> [Michael Amster:] | We have looked at XSL and think it is too limited to HTML as an output It would be a big mistake to base strategy on the original XSL submission. You would be wise to wait for the first working draft in July before making decisions. Jon xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Apr 22 23:03:12 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:40 2004 Subject: 1998-04-20 Pre-Release, with Road Map and Demos Message-ID: <3.0.32.19980422044237.006e1610@pop.intergate.bc.ca> At 05:35 PM 21/04/98 -0700, Don Park wrote: >I will soon be adding DOM output classes to SAXDOM but documents edited with >SAXDOM will lose all comments. I think this is a significant weakness of >SAX. No, it's a significant advantage. -T. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Wed Apr 22 23:17:35 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:40 2004 Subject: XML Editors In-Reply-To: <353E1F70.92C39520@homeaccount.com> References: <353E1F70.92C39520@homeaccount.com> Message-ID: <199804222115.RAA00251@unready.microstar.com> Jack Bolles writes: > Can anyone recommend an html editor that is XML-savvy, WYSIWYG, > easy to use and robust. We are beginning to develop server driven > html pages and I hope to change that to XML pages once an OFX > flavor of XML exists. Not having to change tools would be a > boon. WYSIWYG because I have to do a lot of prototyping. Basically, you need an editor that can handle both SGML reference concrete syntax (HTML is an SGML application) and the XML variant of SGML syntax. Lennart Staflin's PSGML will give you this for free, but it is not WYSIWYG. If you're prototyping for customers, you can always set up a batch file to create a rendered HTML version of the XML on the fly using XSL-J and Jade. If you take a bit of time to learn e-lisp, you can customise Emacs to do all of this from a drop-down menu (you can even call it "Print Preview" if you want). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mamster at webeasy.com Thu Apr 23 04:21:14 1998 From: mamster at webeasy.com (Michael Amster) Date: Mon Jun 7 17:00:40 2004 Subject: Nesting XML based languages and scripting languages In-Reply-To: <199804221908.MAA19130@boethius.eng.sun.com> References: Message-ID: Jon; We have posted to the group explaining that we would like to have a generalized embedded language handler for XML languages interspersed with our own XML based language. The current approach in XSL is to include the formatting command set from HTML in the XSL DTD in order to output HTML. This does not scale to arbitrary embedded languages (OFX, CFD, RFD). The only solution we currently see is to be well formed and run without a DTD. We would really like to have a DTD so that we can take advantage of validation and authoring tools. I don't see from the current XSL approach how you will overcome this fundamental problem. Maybe my newness to XML prevents me from seeing how this may be solved via a DTD mechanism like architectural forms, but I'm still reading David's book. -MA At 12:08 PM 4/22/98 -0700, you wrote: >[Michael Amster:] > >| We have looked at XSL and think it is too limited to HTML as an output > >It would be a big mistake to base strategy on the original XSL >submission. You would be wise to wait for the first working draft in >July before making decisions. > >Jon > ~-~-~-~-~-~-~-~-~-~-~-~-~-~-WEBEASY-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ Michael Amster mamster@webeasy.com 4676 Admiralty Way, Suite 300 Tel: 310.576.0770 Marina Del Rey, CA 90292 Fax: 310.576.2011 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From eliot at isogen.com Thu Apr 23 05:40:55 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:40 2004
Subject: Nesting XML based languages and scripting languages
Message-ID: <3.0.32.19980422223744.007207d0@postoffice.swbell.net>

At 07:20 PM 4/22/98 -0700, Michael Amster wrote:
>This does not scale to arbitrary embedded languages (OFX, CFD, RFD). The
>only solution we currently see is to be well formed and run without a DTD.
>We would really like to have a DTD so that we can take advantage of
>validation and authoring tools.

SGML architectures solve the validation problem. For an example of an idiosyncratic document that intermixes my own element types with those defined by the DSSSL standard as well as DSSSL-syntax functions, see "http://www.isogen.com/papers/litprogarch/litprogarch.html", _An Approach to Literate Programming With SGML Architectures_.

Within the next week or so, ISOGEN will be releasing a modification of SP (and therefore all the SP-based tools) that recognizes the proposed PI form of architecture use declaration. This makes it trivially easy to do complete architecture-based processing of XML documents using any SP-based tool (including Jade, which we are also enhancing to enable construction of architectural groves with the sgml-parse function).

Cheers,

E.
--
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Thu Apr 23 06:31:03 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:00:41 2004 Subject: Inheritance in XML In-Reply-To: <01bd6c3a$c50f2910$020b0ac0@xerius> (matthew@praxis.cz) Message-ID: <199804230428.VAA19547@boethius.eng.sun.com> I'm generally not able to track discussions like this, fascinating though they may be, and I make it a firm principle not to become involved in them, so don't expect any further comments from me regarding this one. But catching up on my email backlog just now I see so much good energy being wasted that I can't pass by without contributing a couple of items of information that may save some wheel-spinning out there. First, allow me to vent just a little bit about a common misunderstanding. [Matthew Gertner:] | In last month's Wired, XML made it into the "hype list" with the | comment that we crazy XML types are kidding ourselves because XML will | never fly without well-defined semantics. This gets the "No Shit, Sherlock" award for excellence in trade press reporting. XML was very carefully designed to have no built-in semantics whatsoever. So considered in isolation, an XML document is found to have... no semantics! What an insight! And we can go further: to give semantics to this thing that was designed to have no semantics we have to have... it's coming to me, wait a minute... yes! We have to supply something else that *does* provide the semantics! Wow! Pulitzer prize time for sure. Here are some examples of things that can provide semantics for XML documents: * Scripts or programs. Especially Java programs. 
:-)

* Prose descriptions (if you said "DTDs" you are confused, but understandably so; a lot of good people have been confused about this before you). The namespace specification provides a standard way to associate prose descriptions and other bearers of semantic information with classes of XML documents.

* Stylesheets. Especially XSL stylesheets, which are even as we speak being defined by a very active W3C XSL WG. This is why you will want to look carefully at the first XSL working draft expected out in July, because XSL will provide what is intended to be the most powerful standardized high-level way to associate presentational semantics with XML documents in publishing environments. Watch this space:

  http://www.w3.org/Style/XSL

So people who think that there is something missing from XML are by and large simply unaware that it was not intended to be used by itself and that the other pieces are on their way. (There's XLink, too.) This has all been made abundantly clear in every W3C statement about the XML activity for the last year and a half, but it's to be expected that a lot of folks just won't bother to pay attention to stuff like that.

Now let's turn to the chief concern of this thread. After a number of excellent observations about the need for a schema language for XML documents and the considerations that have to go into the specification of such a thing, Matthew asks the following question:

| More tricky than any of these technical issues is the question of
| what, if anything, could be done to promote a mechanism of this
| sort. Obviously this would require a change to the XML spec as
| well as modification to all existing tools which process DTDs, so
| it's a pretty big deal. I wonder if anyone besides me thinks that
| a simple mechanism like this would make sense. If so, is there
| any room in the XML standards process to discuss a change of this
| type at some point in the future (certainly not for XML 1.0)?
The answer is, Yes, there are other people who think that it would make sense to design an XML schema mechanism to handle issues like what has been called "inheritance" in this discussion (not to mention good old-fashioned data typing). The workings of a W3C committee can be made public only at the discretion of the chair of the committee, so I will put on my official XML WG Chairman hat and reveal unto ye that the XML WG has officially requested that the job of defining a schema language for XML documents be added to its charter. If approved by the W3C Director, this work would certainly involve a consideration of most of the issues raised in this discussion and would include a close look not only at XML Data but also at other proposed solutions to the same problem.

Jon

From peter at ursus.demon.co.uk Thu Apr 23 09:24:20 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:41 2004
Subject: 1998-04-20 Pre-Release, with Road Map and Demos
In-Reply-To: <004401bd6d86$94d38fa0$2ee044c6@donpark>
Message-ID: <3.0.1.16.19980422191124.3717562e@pop3.demon.co.uk>

At 17:35 21/04/98 -0700, Don Park wrote:
>David,
>
>SAX, as it currently stands, does not return XML comments. This is just
>fine for one way processing of XML documents. However, I am having a
>problem with SAXDOM because of this seemingly irrelevant information loss.
>
>I will soon be adding DOM output classes to SAXDOM but documents edited with
>SAXDOM will lose all comments. I think this is a significant weakness of
>SAX.
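The information loss Don describes is easy to reproduce. What follows is a minimal sketch, assuming Python's standard `xml.sax` module as a stand-in for the Java SAX drivers under discussion; the class names `Collector` and `CommentCatcher`, the helper `parse`, and the sample document are hypothetical. The core content handler never hears about the comment, while an optional lexical-handler property (the kind of extension later SAX versions added) does.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler

DOC = b"<doc><!-- keep me --><p>text</p></doc>"  # hypothetical sample input


class Collector(ContentHandler):
    """Records only what the core SAX callbacks report."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        self.events.append(("chars", content))

    def endElement(self, name):
        self.events.append(("end", name))


class CommentCatcher:
    """Hypothetical lexical handler: receives comments and CDATA/DTD boundaries."""

    def __init__(self):
        self.comments = []

    def comment(self, content):
        self.comments.append(content)

    # The remaining callbacks are required by the driver but unused here.
    def startCDATA(self):
        pass

    def endCDATA(self):
        pass

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass


def parse(doc, lexical=None):
    """Parse doc and return the events seen by the core content handler."""
    parser = xml.sax.make_parser()
    collector = Collector()
    parser.setContentHandler(collector)
    if lexical is not None:
        # Optional extension; without it, comments are silently dropped.
        parser.setProperty(property_lexical_handler, lexical)
    parser.parse(io.BytesIO(doc))
    return collector.events


catcher = CommentCatcher()
events = parse(DOC, catcher)
print(events)            # element and character events only
print(catcher.comments)  # the comment text, reported separately
```

A tree builder fed only by `Collector` cannot write the comment back out, which is exactly the round-trip problem raised for SAXDOM.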
Don, I sympathise with anyone who worries about information loss, but it's important to remember the roots and rationale of SAX. We pushed for SAX as a *simple* tool to read the (mainly mandatory) output of XML parsers. We were becoming overwhelmed with variants in the output interface. We had to draw the line somewhere and comments and CDATA were among those below the cut. There is already a feeling that SAX is becoming too large. I've held my peace on this because I was not familiar with the major recent issues - exceptions and character handling and both of those have covered important ground (both are generic problems with value elsewhere in the X*L process). P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Thu Apr 23 09:37:05 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:41 2004 Subject: 1998-04-20 Pre-Release, with Road Map and Demos Message-ID: <014b01bd6e89$a52badc0$2ee044c6@donpark> >I think that the chair of the DOM WG, Lauren Wood, reads this list; >perhaps she can comment or correct any mistake that I might have made >here. I sure hope Lauren is reading this list cuz latest DOM spec is pretty hard to swallow. NodeIterator, for example, has been changed so that current position lies between two nodes. The concept is neat theoretically but very hard to implement efficiently in Java. Requiring NodeIterator to iterate over 'live' data is also making it very difficult to implement. 
In the face of these problems, mixing iteration with indexing is not too strange. The latest DOM spec also retrieves attributes via NodeIterator which makes it a requirement that Attribute implementations also implement Node interface. Overall feeling I get about the latest DOM spec is not too good despite little improvements like attribute values being strings now. Hopelessly involved with a Mystic API, Don Park http://www.docuverse.com/personal/index.html
From matthew at praxis.cz Thu Apr 23 09:45:34 1998 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:00:41 2004 Subject: Inheritance in XML Message-ID: <01bd6e84$30542040$020b0ac0@xerius> Jon, Thanks for the info. This is interesting stuff indeed. BTW: I hope you are wrong about good energy being wasted. I at least feel I learned a lot from this discussion. Cheers, Matthew -----Original Message----- From: Jon Bosak To: xml-dev@ic.ac.uk Date: Thursday, April 23, 1998 6:31 AM Subject: Re: Inheritance in XML >I'm generally not able to track discussions like this, fascinating >though they may be, and I make it a firm principle not to become >involved in them, so don't expect any further comments from me >regarding this one. But catching up on my email backlog just now I >see so much good energy being wasted that I can't pass by without >contributing a couple of items of information that may save some >wheel-spinning out there. > >[...] > >Jon xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Apr 23 11:53:33 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:41 2004 Subject: Nesting XML based languages and scripting languages References: Message-ID: <353F0F97.360B109C@technologist.com> The big problem is that XML does not define part-wise validation. This is likely to come out of the schema work in the W3C. "The only solution we currently see is to be well formed and run without a DTD. We would really like to have a DTD so that we can take advantage of validation and authoring tools." As Eliot points out, you can also do this with architectural forms. I attribute this to the fact that they were designed in the 1990s instead of the 1980s (like the rest of SGML) and I further feel that part-wise validation should be part of any new schema language, including a revision of DTDs. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From James.Anderson at mecom.mixx.de Thu Apr 23 15:49:48 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:41 2004
Subject: Inheritance in XML [^*]
References: <199804230428.VAA19547@boethius.eng.sun.com>
Message-ID: <353F4751.1E76D25F@mecom.mixx.de>

hello again.

i read, with satisfaction, where, Jon Bosak wrote: (among other things, that)

> I [hereby] put on my official XML WG Chairman hat and reveal unto ye
> that the XML WG has officially requested that the job of defining a
> schema language for XML documents be added to its charter. If
> approved by the W3C Director, this work would certainly involve a
> consideration of most of the issues raised in this discussion and
> would include a close look not only at XML Data but also at other
> proposed solutions to the same problem.

that's nice to hear.

much of the earlier parts of that posting, and most of this thread, are, on the other hand, disheartening. there are two issues:

the claims regarding "semantics": the recommendation does, indeed, assert a semantic for xml documents. if efforts towards schemas are to succeed, it is essential to recognize what this semantic does provide and where its boundaries are, in order to determine what is to be added.

the assertions (in the rest of the thread) regarding "subtyping" and "inheritance": a distinction limited to "subtyping" and "inheritance" is incomplete and leads to inaccurate descriptions.

A.
to "semantics"

the xml recommendation defines two relations among elements (subsumption and precedence), defines a (yes) language for describing these relations (the dtd entities), asserts three states for documents (valid, invalid, and unspecified), and specifies how to infer which state a document is in based on its content. this is a "semantic". i refrain from distinguishing "application" from "parsing" semantics, 'cause the distinction doesn't matter.

the claim, that there is no semantic, diverts attention away from the semantic which is there, and from its limits. in particular, there are simultaneous efforts within w3c to standardize both a serialization form (xml) and a document model (dom). it is intended that there be a relation between the two. it is a mistake, that the standard for serialization claim no responsibility for establishing this "semantic". a claim made with all due respect and despite the quality of the work accomplished, because the lack of such a semantic will, in general, make it harder to use these products, and because the lack will also make progress impossible for the parties among the XML WG who would intend to address themselves to the issues alluded to in the quote above. the "inheritance" discussion is a case in point. it is disheartening to read where attention is deflected from the issue by claiming that no semantic was intended.

B. to inheritance.

there are three concepts: type, class, and inheritance. the first two depend on specific (albeit possibly abstract) models for operations and storage. the latter makes assertions about properties and relations within the former two domains. that is (loosely),

  (and (! property-x a "whatever")
       (! property-x b (inherits-from a)))
  => (? property-x b "whatever")
  => true

where the property may concern type or it may concern class.
that is, with respect to a given processing and storage model the respective properties may concern which operations may be performed on the respective instances (read "subtyping") or they may concern what storage relationships the respective instances entail (read "subclassing"). these are two, in principle, independent properties, each of which may be inherited. although, for a given store and operation model these properties may overlap (see c++) it would be better to describe them as "subtyping" and "subclassing", rather than "subtyping" and "inheritance". if one insists on the latter, then one will not be able to describe everything one needs to.

this comes to light, for instance in the assertion elsewhere in this thread, that architectural forms have nothing to do with inheritance. it is true, that they have only halfway to do with "class inheritance" - that is the storage structure implied by asserting that a given element conforms to one or more architectural forms. ( i think i claimed, in an earlier post, that they "punted" on this one. i still think that's true.) on the other hand, they have everything to do with "type inheritance". that is, by virtue of asserting a relation between a concrete element and an architectural element, certain operations which lead to a "valid" document state for the architectural element produce the same state for the given concrete element. this is one form of inheritance.

in order, however, for efforts in this direction to succeed (eg - issues like content interleaving, tag subsumption, whatever) it is essential that one recognize, that the concepts to be governed by inheritance have a meaning strictly with respect to particular processing and storage models.
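The loose rule given earlier in this message, with its `(! ...)` assertions and `(? ...)` query, can be read as a lookup that follows inherits-from links one property at a time. The following is only a sketch under that reading: the `Facts` class, its method names, and the element and property names are all hypothetical. It illustrates how an "operations" property (type) can be inherited while a "storage" property (class) is not.

```python
class Facts:
    """Tiny property store: each (property, subject) pair is either given a
    value directly or declared to be inherited from another subject."""

    def __init__(self):
        self._values = {}   # (prop, subject) -> value
        self._parents = {}  # (prop, subject) -> subject it inherits from

    def assert_value(self, prop, subject, value):
        # Corresponds to (! prop subject value)
        self._values[(prop, subject)] = value

    def assert_inherits(self, prop, subject, parent):
        # Corresponds to (! prop subject (inherits-from parent))
        self._parents[(prop, subject)] = parent

    def lookup(self, prop, subject):
        seen = set()  # guards against cyclic inherits-from chains
        while subject not in seen:
            seen.add(subject)
            if (prop, subject) in self._values:
                return self._values[(prop, subject)]
            if (prop, subject) not in self._parents:
                break
            subject = self._parents[(prop, subject)]
        return None

    def holds(self, prop, subject, value):
        # Corresponds to (? prop subject value)
        return self.lookup(prop, subject) == value


facts = Facts()
facts.assert_value("property-x", "a", "whatever")
facts.assert_inherits("property-x", "b", "a")
print(facts.holds("property-x", "b", "whatever"))

# Type and class as two independently inherited properties:
facts.assert_value("operations", "arch-element", "validate")
facts.assert_value("storage", "arch-element", "arch-layout")
facts.assert_inherits("operations", "concrete-element", "arch-element")
print(facts.holds("operations", "concrete-element", "validate"))
print(facts.holds("storage", "concrete-element", "arch-layout"))
```

Because inheritance is recorded per property, the concrete element picks up the architectural element's operations without committing to its storage layout, which is the "halfway" case described for architectural forms.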
that given, if one claims - and is satisfied, that xml has by intention no semantic, and does not recognize that (without concern for an "application semantics") it is essential to provide at least a "dom semantics", then any discussions concerning serialized representations regarding assertions about types, classes, and inheritance (ie "schemas") will be ungrounded and bound to be futile.

bye for now,

From pl at xmatrix.com Thu Apr 23 15:55:40 1998
From: pl at xmatrix.com (Philippe Lourier)
Date: Mon Jun 7 17:00:41 2004
Subject: ANN: XML on DDJ TechNetcast w/ Tim Bray
Message-ID: <3.0.5.32.19980423095251.007b95b0@mail.bettynet.com>

XML will be the topic on this week's Dr. Dobb's TechNetcast broadcast; Tim Bray will be our guest. TechNetcast is interactive: you can ask questions during the live show by participating in the chat room or by calling in (212-965-1390). Advance questions can also be sent to ddj@technetcast.com

Transcript and on-demand video stream are also available on the site after the event. Please visit the site for more info. Comments and suggestions welcome.

Philippe Lourier
Dr. Dobb's TechNetcast
www.technetcast.com
ddj@technetcast.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ray at guiworks.com Thu Apr 23 16:24:15 1998
From: ray at guiworks.com (Ray Cromwell)
Date: Mon Jun 7 17:00:41 2004
Subject: SAX Level 2 (was 1998-04-20 Pre-Release...)
References: <014b01bd6e89$a52badc0$2ee044c6@donpark>
Message-ID: <353F4D6D.E3464BF5@contentware.com>

Hi, I've been a lurker on this list for a while, but I thought I'd add my two cents.

I identify with the need to keep SAX simple, but only because it helps rapid adoption in the beginning in order to make it a de facto standard (because it is easy for parser writers to implement). A huge interface like JDBC would take an effort that many freeware authors wouldn't embark on. So I think David has done Java/XML programmers a wonderful service by organizing the effort.

However, I think there is a need for a ubiquitous, parser independent, API that gives one complete control over their data structures, but without any (or negligible) information loss. Larry Wall gave a convincing presentation at XML98 as to how Perl will support XML which made my mouth drool compared to the level of information I'm getting now. In fact, I built my application around Lark instead of SAX because I needed access to location offset information. There is a whole class of applications that are impossible to write with SAX, namely, authoring tools, or any tools that need two-way manipulation.

Another class of applications needs access to DTD information. Right now, it seems only IBM's XML for Java supports access to the DTD; however, it does not give location offset information.
Thus, it's looking more and more like parser features are going to diverge, which means SAX has two possible future scenarios:

1) All SAX features are required to be implemented, ala OpenGL, etc.
2) Some SAX features can be unimplemented, but an interface is available to query whether the functionality exists (DirectX, JDBC, etc)

The first scenario makes it easier to write applications, but is a disincentive to parser implementors. The second scenario makes it easier for parser implementors to support, but introduces complicated choices for the application programmer. (Query driver to see if feature X is enabled, disable all application features dependent on X. Filter out driver choices that don't support Y, etc)

My gut feeling is, I have to say, that scenario 1 is the best, for several reasons.

First, there are always more application authors than parser implementors. Most of the work should be shifted onto the area that will provide the greatest benefit for the greatest number of people. For instance, it is better to make the operating system or foundation class do the work for a programmer, to free his time for other things.

Second, parsers are commodity items. They will quickly be built into operating systems, browsers, and frameworks. I don't believe programmers will "shop around" for a parser. They will use the one that comes with their environment, so it is better that it is full featured and support a general API.

Third, APIs that have undefined behavior, or behavior that is optional, cause wasted programming logic, when differences are eventually eradicated anyway. The implementations that support 100% of the functionality end up winning, and either the partial implementations become full ones, or they disappear. You can see this happening in the 2D and 3D video card markets right now.

Thus, I think it makes sense to define a powerful API that allows a spectrum of applications to be written.
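[Editorial sketch, not part of the original message.] The "query whether the functionality exists" style of scenario 2 can be illustrated with Python's standard xml.sax module, which is a later SAX 2 binding and not the 1998 Java API under discussion. The helper name recognizes() and the second feature URI are hypothetical, chosen only for illustration; getFeature() and the two exception classes are real parts of xml.sax.

```python
import xml.sax
from xml.sax import handler


def recognizes(parser, feature_name):
    """Return True if the parser knows the feature at all (whether on or off)."""
    try:
        parser.getFeature(feature_name)
    except (xml.sax.SAXNotRecognizedException,
            xml.sax.SAXNotSupportedException):
        # The driver does not offer this capability; the application
        # has to disable whatever depends on it.
        return False
    return True


parser = xml.sax.make_parser()

# A standard SAX 2 feature name that the bundled parser knows about.
print(recognizes(parser, handler.feature_namespaces))

# A made-up feature URI (hypothetical), standing in for "feature X".
print(recognizes(parser, "http://example.invalid/features/location-offsets"))
```

The cost Ray describes shows up directly: every optional feature adds a branch like this to the application, which is the argument for scenario 1.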
I know some critics are going to respond "well, that's not the point of SAX." My gut feeling is that if XML is going to be foundational, a common, single API will be critical to the success of enabling a market of XML applications. Now, either proprietary parsers themselves will become this de facto API (e.g. Microsoft, or if Sun were to include a parser in the next JDK), or, there is going to be a standard API that everyone adheres to. Perhaps the question is, should the W3C/IETF be doing this, or should it be informal? Since I need this API *now*, actually yesterday, I'd rather not wait for the W3C/IETF to define it, rather, I like Dave's model of getting quick consensus and shipping a beta implementation. Otherwise, I'm either going to hack the source to someone else's parser, or write my own.

Ok, now that I've started a flame war and gotten that off my chest :), I'd like to nominate the three biggest features I'd like in SAX Level 2 (or SAX2.0), in order of importance.

1) access to DTD information
2) comments, CDATA, and location information for Attributes
3) sax.util classes that take an ElementFactory (which return DOM interfaces), and build a tree. (maybe Don Park would like to contribute this). IBM's XML for Java is a starting point, but it has the fatal flaw that the return values of the ElementFactory are not the DOM interfaces (such as Element or PI) but IBM base classes, like TXElement or PI, which means you are forced to inherit from TXElement instead of just implementing Element.

Ok, flame away!

-Ray

From tbray at textuality.com Thu Apr 23 16:45:04 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:41 2004
Subject: Inheritance in XML [^*]
Message-ID: <3.0.32.19980423074329.00b25c10@pop.intergate.bc.ca>

At 03:51 PM 4/23/98 +0200, james anderson wrote:
>the recommendation does, in deed, assert a semantic for xml documents.
...
>the xml recommendation defines a two relations among elements (subsumption and
>precedence), defines a (yes) language for describing these relations (the dtd
>entities), asserts three states for documents (valid, invalid, and
>unspecified), and specifies how to infer which state a document is in based
>its content.
>
>this is a "semantic"....

Reasonable people may disagree. I believe that sequence and containment are purely syntactic in nature and imply no semantic whatsoever. Similarly I see no "semantic" in asserting that my butt is currently placed on top of a chair, or that this chair is currently placed in front of my computer.

>it is disheartening to read where attention is deflected from the issue by
>claiming that no semantic was intended.

Get real. You may choose to argue that containment and sequence constitute, in some philosophical framework, "semantics", but the claim that no semantic was *intended* is unchallengeable because in point of fact when we wrote the spec we considered that what we were describing was syntax.

-Tim

From lisarein at finetuning.com Thu Apr 23 17:18:06 1998
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun 7 17:00:41 2004
Subject: Inheritance in XML
References: <199804230428.VAA19547@boethius.eng.sun.com>
Message-ID: <353F612A.C8613924@finetuning.com>

> [Matthew Gertner:]
>
> | In last month's Wired, XML made it into the "hype list" with the
> | comment that we crazy XML types are kidding ourselves because XML will
> | never fly without well-defined semantics.
>

Hey sorry -- me and Wired were fighting for a few weeks (over the fact that they didn't link to the W3C site ANYWHERE in my XML/Perl article) and they took a jiggly whack at the topic in my absence :-) (she said....feeling slightly vindicated...)

I'm back at my post now. And (since I've already offended anyone who will be by my trite comments) as we speak I'm doing a story on the pseudo-rivalry between RDF Schema and XML Data.

Jon, I'm assuming I can mention the wonderful news in my story... (please)

lisa

From James.Anderson at mecom.mixx.de Thu Apr 23 18:01:29 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:41 2004
Subject: Inheritance in XML [^**]
References: <3.0.32.19980423074329.00b25c10@pop.intergate.bc.ca>
Message-ID: <353F6627.50A48EC7@mecom.mixx.de>

hello again,

Tim Bray wrote:
>
> At 03:51 PM 4/23/98 +0200, james anderson wrote:
> >the recommendation does, in deed, assert a semantic for xml documents.
> ...
> >the xml recommendation defines a two relations among elements (subsumption and
> >precedence), defines a (yes) language for describing these relations (the dtd
> >entities), asserts three states for documents (valid, invalid, and
> >unspecified), and specifies how to infer which state a document is in based
> >its content.
> >
> >this is a "semantic"....
>
> Reasonable people may disagree. I believe that sequence and
> containment are purely syntactic in nature and imply no semantic
> whatsoever.

reasonable people also agree. i share your belief in this point.

> Similarly I see no "semantic" in asserting that my butt
> is currently placed on top of a chair, or that this chair is currently
> placed in front of my computer.

we also agree on that.

it is, however, a "semantic" when criteria are provided whereby the content of the dtd permits one to assert that certain sequence and containment relations are valid and others are not. the semantic arises when you have a description which says that your butt on the chair and the chair in front of the computer conforms to osha workplace guidelines together with guidelines for applying the descriptions. that's your dtd.
>
> >it is disheartening to read where attention is deflected from the issue by
> >claiming that no semantic was intended.
>
> Get real.

i am. the problems with the dragging discussions on "inheritance" will persist at least to the point where one recognizes: what semantic is entailed by the recommendation as it stands; what possible semantics are not; that "inheritance" has a meaning only in the context of a semantic for operations and for a store; and what must be added to the semantic in order that one can talk about using xml to serialize descriptions which denote inheritance relations.

that's all i'm saying :)

> You may choose to argue that containment and sequence
> constitute, in some philosophical framework, "semantics",

i did not and do not. i observe that, when taken together with other aspects of the xml recommendation, they do, in fact, constitute a semantic.

> but the claim
> that no semantic was *intended* is unchallengeable because in point of
> fact when we wrote the spec we considered that what we were describing
> was syntax. -Tim

i am not concerned with the *intention*, as such. i have no direct knowledge of what the authors intended. to be honest, the intention itself doesn't matter. my discussion is (and was) aimed at the results, and perhaps indirectly at the effect of discussing intention. i am concerned that it is not productive to abide by what may have well been the intention, or to use it as a point of argument now that the recommendation stands. a semantic is now "part of the problem". it does not matter, that it was not before.

bye for now,

From papresco at technologist.com Thu Apr 23 19:44:16 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:41 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
References: <3.0.32.19980423074329.00b25c10@pop.intergate.bc.ca>
Message-ID: <353F774F.69CB0F59@technologist.com>

If XML had no semantics, then XSL, XLL and the DOM would have to explicitly describe the mapping from syntactic features to the abstract nodes that they work on. But they do not, because XML has semantic concepts like "element", "element type", "notation" and "attribute" that are *described by* the syntax. Here's what a language with no semantics looks like:

a -> b"q"c
a -> ca
b -> "d"
c -> "e"

Even given a parse tree, you can't do anything interesting with this language, because it has no semantics. But you can do lots of interesting stuff with a "raw" XML parse tree, even if you do not know its DTD. For instance you can build a DOM from it, apply a stylesheet to it, check its validity, check its conformance to an XML-Data schema and so forth.

I think that what Tim and Jon mean to avoid is a battle royale over how elements, attributes etc. fit into various ontological philosophies. I don't think that that avoidance is useful, but I understand the motivation. Nevertheless, I feel it is not accurate to claim that XML is semantic-free. There are tons of semantics, both subtle ("element type") and explicit ("initiate this network transaction in response to this markup.") Consider:

"validity constraint: A rule which applies to all valid XML documents.
Violations of validity constraints are errors; they must, at user option, be reported by validating XML processors"

How can we tell a processor that it must trigger a *side-effect* with a legitimate (but not valid) document, and then claim that we are not describing semantics? There are other things like this in the XML spec:

"When an XML processor recognizes a reference to a parsed entity, in order to validate the document, the processor must include its replacement text" -- now we're initiating network transactions. That's a semantic?

"If there are no external markup declarations, the standalone document declaration has no meaning." -- that would imply it already had meaning. I don't believe that there is a distinction between "meaning" and "semantics."

"If a non-validating parser does not include the replacement text, it must inform the application that it recognized, but did not read, the entity." -- a constraint on the interface between processors and applications. That's a semantic.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From ak117 at freenet.carleton.ca Thu Apr 23 19:53:11 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:42 2004
Subject: SAX Level 2 (was 1998-04-20 Pre-Release...)
In-Reply-To: <353F4D6D.E3464BF5@contentware.com>
References: <014b01bd6e89$a52badc0$2ee044c6@donpark> <353F4D6D.E3464BF5@contentware.com>
Message-ID: <199804231749.NAA00288@unready.microstar.com>

Ray Cromwell writes:

> However, I think there is a need for a ubiquitous, parser
> independent, API that gives one complete control over their data
> structures, but without any (or negligible) information loss. Larry
> Wall gave a convincing presentation at XML98 as to how Perl will
> support XML which made my mouth drool compared to the level of
> information I'm getting now. In fact, I built my application around
> Lark instead of SAX because I needed access to location offset
> information.

Location information wasn't available in the January draft, but it will be there in the release that I'm finishing up right now.

I expect that much of what you want (including DTD information, which you mention elsewhere in your message) will be there in the DOM, and that authoring tools will naturally gravitate towards the DOM, since they often (usually?) need to have an XML tree model anyway.

[...]

> 1) All SAX features are required to be implemented, ala OpenGL, etc.
> 2) Some SAX features can be unimplemented, but an interface is available
> to query whether the functionality exists (DirectX, JDBC, etc)

I also tend towards (1), with a few provisos:

1) there is no requirement that SAX parsers support any particular requested locale for messages; in other words, it is always acceptable for the parser to throw a SAXException when the user invokes Parser.setLocale() to request an explicit locale;

2) SAX parsers are encouraged but not required to provide a Locator for document events with DocumentHandler.setDocumentLocator();

3) Non-validating parsers do not have to use the DocumentHandler.ignorableWhitespace() callback (though they may if they wish, as AElfred does);

4) it is at the discretion of the SAX parser what events (if any) are fired after the parser has invoked ErrorHandler.error() or ErrorHandler.fatalError() -- however, it may be that the XML 1.0 spec requires ErrorHandler.error() _not_ to kill the parse, at least for validity errors (validity errors are reportable only at user option); and

5) a parser that does not use the DTD need not report notation and unparsed entity declarations (I need to look into this further).

> I'd like to nominate the three biggest features I'd like in SAX Level 2
> (or SAX2.0), in order of importance.
>
> 1) access to DTD information

Bingo -- this would be valuable to me too. I'm not certain, though, if it makes sense to provide this through SAX or if we need to wait for the DOM.

> 2) comments, CDATA, and location information for Attributes

Yes, this is a big one for authoring transformations (as opposed to downstream production transformations, which the author will never see). By "location information," do you mean the order of specification, or whether an attribute was specified or defaulted?

> 3) sax.util classes that take an ElementFactory (which return DOM
> interfaces), and build a tree. (maybe Don Park would like to contribute
> this).
> IBM's XML for Java is a starting point, but it has the fatal flaw
> that the return values of the ElementFactory are not the DOM interfaces
> (such as Element or PI) but IBM base classes, like TXElement or PI,
> which means you are forced to inherit from TXElement instead of just
> implementing Element.

Sounds good, but since this could be built on top of SAX instead of within it, I'll probably bow away from it.

Now, I need to stop talking about level 2 and finish level 1.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From Jon.Bosak at eng.Sun.COM Thu Apr 23 20:18:13 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:42 2004
Subject: Nesting XML based languages and scripting languages
In-Reply-To: <199804230220.TAA01901@earth.sun.com> (message from Michael Amster on Wed, 22 Apr 1998 19:20:32 -0700)
Message-ID: <199804231815.LAA19712@boethius.eng.sun.com>

| I don't see from the current XSL approach how you will overcome this
| fundamental problem. Maybe my newness to XML prevents me from seeing
| how this may be solved via a DTD mechanism like architectural forms,
| but I'm still reading David's book.

I'm not at liberty to disclose the details of the XSL WG's current design work. All I'm telling you is that you are not going to get a very accurate idea of what XSL is capable of doing based on last year's submission. You will get a much better picture in July.

Jon

From dvp4c at jefferson.village.virginia.edu Thu Apr 23 20:23:51 1998
From: dvp4c at jefferson.village.virginia.edu (Daniel Pitti)
Date: Mon Jun 7 17:00:42 2004
Subject: XLink comment and queries
Message-ID: <3.0.1.32.19980423141143.00a1e490@jefferson.village.virginia.edu>

Steve and Eve,

I have an editorial comment/question concerning one example in XLink, section 4.3, and a couple of questions.

First, the editorial comment: Should each of the locator elements in the following example have xml:link="locator":

Now my queries. I have been trying to envision how XLink will work, both with respect to rendering, but also with respect to creation and maintenance. While I realize that the object of the proposal is not to provide programming specifications, I am hoping that you will be willing to share, off-the-record if you like, how you envision applications working with XLink.

I have no problem with the purpose of XLink with respect to instances that travel around without DTDs, but I am having trouble understanding how you envision the future of link and entity management given that XLink does not employ idref/s and entity/ies declared attribute values. My inability to envision how link and entity management will work leads me to want to see XLink as a specification for XML documents when exported outside of a creation and management environment, but not in a creation and maintenance environment.
Thus, in trying to XMLize an existing SGML DTD, I find myself wanting my linking elements to have both idref/s and entity/ies for use in creation and maintenance, and XLink attributes for exporting the same, the latter being created at the time of exporting. Is this sensible at this juncture? Or am I overlooking something?

I'll be happy to elaborate if necessary.

Thanks,

Daniel

Daniel V. Pitti
Project Director
Institute for Advanced Technology in the Humanities
Alderman Library
University of Virginia
Charlottesville, Virginia 22903
Phone: 804 924-6594
Fax: 804 982-2363
Email: dpitti@Virginia.edu

From M.H.Kay at eng.icl.co.uk Thu Apr 23 20:27:58 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:00:42 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
Message-ID: <012d01bd6ee5$9b808820$1e09e391@mhklaptop.bra01.icl.co.uk>

>If XML had no semantics, then XSL, XLL and the DOM would have to
>explicitly describe the mapping from syntactic features to the abstract
>nodes that they work on. But they do not, because XML has semantic
>concepts like "element, "element type", "notation" and "attribute" that
>are *described by* the syntax.

If those are semantic concepts, then what are the syntactic concepts?

The syntax of a language defines which sequences of symbols constitute legitimate sentences in that language. The semantics of the language define how to interpret the sentences of the language as statements about objects in some external world.

So if you take my GedML as an example, the syntax of GedML is defined partly by the XML specification and partly by the GedML DTD.
The semantics of GedML are the rules that say an element asserts the existence of a person and a element asserts the existence of a married or unmarried couple, etc. Of course some of the semantic constraints (a person can only have one father) can be expressed as syntactic constraints, while others (a person must be born after their father) cannot, because the syntactic metalanguage used for XML is insufficiently powerful.

Tim is surely right: there are no semantics implied by the XML spec. XSL, XLL, etc don't change this. They define syntactic (or symbolic) manipulations of XML constructs which are probably most useful if you used XML the way its designers expected you to, but they do not make any assumptions about the semantics of your XML content. This would become obvious if the XSL spec were rewritten (as it could be) in a purely formal way to define a transformation from one sequence of symbols into another.

Mike Kay, ICL

From siping.liu at gte.com Thu Apr 23 20:30:45 1998
From: siping.liu at gte.com (Siping Liu)
Date: Mon Jun 7 17:00:42 2004
Subject: help me understand DOM XML Level 1
Message-ID: <353F8863.EAD6E79A@gte.com>

Hi,

I'm trying to implement the DOM interface on the listener side of the SAX interface. The DOM Core Level 1 interface looks clear enough to me. But I get confused by the definition in DOM XML Level 1, especially "XMLNode" -- why isn't it derived from "Node"? Can I traverse a tree with this interface and how?
Why does it have this "getEntityReference()" method? I don't see the concept of having an "entity reference" attached to each node (such as an element) in the XML spec.

Thanks for your help.

Siping Liu
siping.liu@gte.com

From eliot at isogen.com Thu Apr 23 20:57:07 1998
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:00:42 2004
Subject: XLink comment and queries
Message-ID: <3.0.32.19980423135016.00759fa8@postoffice.swbell.net>

At 02:11 PM 4/23/98 -0400, Daniel Pitti wrote:
>Thus, in trying to XMLize an existing SGML DTD, I find myself wanting my
>linking elements to have both idref/s and entity/ies for use in creation
>and maintenance, and XLink attributes for exporting the same, the latter be
>created at the time of exporting. Is this sensible at this juncture? Or am
>I overlooking something?

My standard advice in this type of situation is as follows:

1. Plan to always have a "publish" step from your authoring/archiving repository to the outside world.

2. As part of the publishing process, plan to do whatever transforms are necessary to make your information usable by its recipients. Even if today it's an identity transform or just SX or SGMLNORM, you should put it in place so you have a well-defined process step you can expand later.

3. Come to understand that the form of addressing used in any situation is determined entirely by the practical considerations in the use scenario and should have *nothing* to do with the rhetorical or presentational semantic of the element.
Addressing is plumbing and, while of keen interest to plumbers, the form of pipe used does not normally affect the "meaning" of the house. You can change from copper to PVC pipes without affecting the relationships among the rooms in your house.

One key implication of these three recommendations is that the form of addressing used in a document at any given time will be determined by the requirements of its use context and *is subject to change*. In other words, the form of addressing used over the life of a document will very likely change. Your DTDs and management systems should expect that.

For example, during authoring you want a very flexible, easy-to-manage addressing method, so you will probably do something like use queries unique to your repository, repository-wide unique IDs, HyTime-style indirect addressing, etc. XPointers are not, with a few exceptions, appropriate during authoring because they are too direct. In particular, they bind the specification of the ultimate target too close to the initial reference, which always impairs manageability (one reason that HyTime started with completely indirect addressing). Thus, you are unlikely to decide to use XPointers during authoring.

But, XPointers are very well suited to delivery of static (or mostly static) documents. In particular, because they are very direct, they are easy to implement and resolve. Thus, you are likely to decide to use XPointers for delivery of information out of your repository.

This suggests two things:

1. Your DTD should enable the use of a variety of addressing methods. The HyTime standard provides syntax that lets you declare in a DTD or instance what forms of address are being used. [I'll post a separate note with an example.]

2. You should plan to do an address transform as part of your publishing process, transforming the forms of address used for authoring and archiving to the forms used for delivery (which may vary based on the expected recipient).

Cheers,

Eliot
--
W. Eliot Kimber, Senior Consulting SGML Engineer
Highland Consulting, a division of ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004
www.isogen.com
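
[Editorial sketch, not part of the original message.] The address transform recommended above as part of a "publish" step might look roughly like the following. This is a minimal sketch under stated assumptions: the function name publish(), the authoring-time attribute name "target", the delivery-time attribute name "href", and the plain "#id" fragment form are all hypothetical stand-ins, not anything defined by XLink, XPointer, or HyTime.

```python
import xml.etree.ElementTree as ET


def publish(root, ref_attr="target", out_attr="href"):
    """Rewrite authoring-time ID references into delivery-time fragment links.

    ref_attr and out_attr are hypothetical attribute names; a real
    repository would substitute its own addressing conventions.
    """
    # Collect every ID declared in the document so dangling
    # references are caught at publish time.
    known_ids = {el.get("id") for el in root.iter() if el.get("id")}

    for el in root.iter():
        ref = el.attrib.pop(ref_attr, None)
        if ref is None:
            continue
        if ref not in known_ids:
            raise ValueError(f"dangling reference: {ref}")
        # Swap the indirect authoring address for a direct delivery address.
        el.set(out_attr, "#" + ref)
    return root


doc = ET.fromstring('<doc><sec id="s1"/><xref target="s1"/></doc>')
publish(doc)
print(ET.tostring(doc, encoding="unicode"))
```

Keeping the transform in one well-defined step means the authoring form of address can change later without touching what recipients see, which is the point of the plumbing analogy.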
From tbray at textuality.com Thu Apr 23 21:00:33 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:42 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
Message-ID: <3.0.32.19980423115844.00a1a790@pop.intergate.bc.ca>

At 01:15 PM 4/23/98 -0400, Paul Prescod wrote:
>If XML had no semantics, then XSL, XLL and the DOM would have to
>explicitly describe the mapping from syntactic features to the abstract
>nodes that they work on. But they do not, because XML has semantic
>concepts like "element, "element type", "notation" and "attribute" that
>are *described by* the syntax.

Well, we just have a difference of perception. I think that "element", "element type", "notation", and so on are profoundly *syntactic* constructs. I think an element is a piece of an XML document that is bounded by tags; an entity is a chunk of text that is either provided literally or referred to via URL.

It is true that the spec provides operational rather than purely grammatical descriptions of some aspects of things, but that is largely for convenience. Dan Connolly has argued repeatedly and forcefully that the spec could be completely re-written to avoid discussion of the processor's actions (he is right) and that this would be an improvement (I'm not convinced).

The fact that the XML processor has a couple of required *behaviors*, most notably error handling, does not constitute anything like what I think of in connection with the term "semantic".

I suppose you can argue that declarations in DTDs do have a semantic of grammatical constraint. OK, granted.
But in the instance, Elements and attributes don't mean anything in and of themselves. They doubtless have semantics that are used by humans and computer programs in particular application domains, but that's none of our business.

And finally... words are only of use in facilitating human communication when there is some shared understanding as to their denotation and connotation. The term "semantic", judged by this standard, has clearly and empirically lost its usefulness in this discussion.

But of this I am confident: elements, attributes, and entities don't mean anything in and of themselves.

-Tim

From Jon.Bosak at eng.Sun.COM Thu Apr 23 21:08:05 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:42 2004
Subject: Inheritance in XML
In-Reply-To: <01bd6e84$30542040$020b0ac0@xerius> (matthew@praxis.cz)
Message-ID: <199804231905.MAA19742@boethius.eng.sun.com>

[Matthew Gertner:]

| Thanks for the info. This is interesting stuff indeed. BTW: I hope you
| are wrong about good energy being wasted. I at least feel I learned a
| lot from this discussion.

It was a great discussion. I just didn't want people thinking that they had to start designing a schema mechanism in a mail list when it's already on its way to being a work item for a W3C working group.

Jon

From papresco at technologist.com Thu Apr 23 21:20:56 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:42 2004
Subject: Inheritance in XML [^*]
References: <199804230428.VAA19547@boethius.eng.sun.com> <353F4751.1E76D25F@mecom.mixx.de>
Message-ID: <353F94A5.C9798BD1@technologist.com>

james anderson wrote:
>
>
> where the property may concern type or it may concern class. that is,
> with respect to a given processing and storage model the respective properties
> may concern which operations may be performed on the respective instances
> (read "subtyping") or they may concern what storage relationships the
> respective instances entail (read "subclassing").

What do you mean by storage relationships (and thus classes) in the context of XML?

> these are two, in principle, independent properties, each of
> which may be inherited.

What are two properties? Type and class? Did you shift from discussing properties "concerning" type and class to type and class *as* properties?

Can you also please defend the distinction between type and class? It makes sense in object oriented programming languages (mostly for performance reasons), but I don't know that there is any such distinction in common usage or in most ontologies. I am prepared to be convinced otherwise, but I think that the class/type distinction is specific to OOP and is not useful except as an arbitrary distinction, to avoid confusion, as it is used in the DSSSL spec. (node class, flow object class vs. element type ... you could as easily reverse them and talk of element classes and node types...)
> this comes to light, for instance in the assertion elsewhere in this thread,
> that architectural forms have nothing to do with inheritance. it is true, that
> they have only halfway to do with "class inheritance" - that is the storage
> structure implied by asserting that a given element conforms to one or more
> architectural forms. ( i think i claimed, in an earlier post, that they
> "punted" on this one. i still think that's true.) on the otherhand, they have
> everything to do with "type inheritance". that is, by virtue of asserting a
> relation between a concrete element and an archtectural element, certain
> operations which lead to a "valid" document state for the architectural
> element are produce the same state for the given concrete element.

This is too abstract for me to follow. I don't think that architectural forms assert a relation between a concrete element and an architectural element. Rather, I feel that they assert conformance, after a transformation, of an element to an architectural element *type*. You confuse me again when you talk about "operations" leading to a "valid document state." I would appreciate it if you could expand on that.

> this is one form of inheritance.

It is true that an architectural client element inherits semantics from an architectural element. As I have said before, the word "inherit" is very vague, despite your relatively precise definition. The problem is that the word "property" is very vague. For instance, according to that definition, this DTD "inherits" from HTML: %HTML; ]> because it inherits the has-img-element property from HTML.
> that given, if one claims - and is satisfied, that xml has by intention no
> sematic, and does not recognize that (without concern for an "application
> semantics") it is essential to provide at least a "dom semantics", then any
> discussions concerning serialized representations regarding assertions about
> types, classes, and inheritance (ie "schemas") will be ungrounded and bound to
> be futile.

XML has semantics on at least four levels:

First, XML documents have well-defined semantics as SGML documents.

Second, the XML spec. specifies many behaviours for processors.

Third, whether the XML spec. enforces it or not (and we could argue about that), everyone who reads and understands the XML spec. comes away from it with a similar understanding of its type system (such as it is). We *could* pretend that that type system is just a mechanism for specifying syntax (though I think that this is disingenuous), just as we could pretend that Rosebud is a sled and Animal Farm is a children's story about animals.

Fourth, at least four specs. depend on XML's "non-existent" semantics. If they are to become RECs, someone will have to specify these semantics explicitly, or admit that they already exist.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html

From eliot at isogen.com Thu Apr 23 21:23:12 1998 From: eliot at isogen.com (W.
Eliot Kimber) Date: Mon Jun 7 17:00:42 2004 Subject: Semantics (was Re: Inheritance in XML [^*]) Message-ID: <3.0.32.19980423135725.00716e98@postoffice.swbell.net> At 07:28 PM 4/23/98 +0100, Michael Kay wrote: > Paul Prescod wrote: >>If XML had no semantics, then XSL, XLL and the DOM would >have to >>explicitly describe the mapping from syntactic features to >the abstract >>nodes that they work on. But they do not, because XML has >semantic >>concepts like "element, "element type", "notation" and >"attribute" that >>are *described by* the syntax. > > >If those are semantic concepts, then what are the syntactic >concepts? tag, attribute specification, character data, entity reference. One problem in both the XML and SGML specifications is that the relationship between the syntactic tokens the language defines and the abstract objects intended to be constructed from those tokens is underspecified. Cheers, Eliot --
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
From robnett at cig.mot.com Thu Apr 23 21:33:08 1998 From: robnett at cig.mot.com (Scot Robnett) Date: Mon Jun 7 17:00:42 2004 Subject: unsubscribe Message-ID: <199804231943.PAA13793@po_box.cig.mot.com> From eliot at isogen.com Thu Apr 23 21:45:50 1998 From: eliot at isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:00:42 2004 Subject: XLink comment and queries Message-ID: <3.0.32.19980423144001.0075dd40@postoffice.swbell.net> At 02:11 PM 4/23/98 -0400, Daniel Pitti wrote: >Thus, in trying to XMLize an existing SGML DTD, I find myself wanting my >linking elements to have both idref/s and entity/ies for use in creation >and maintenance, and XLink attributes for exporting the same, the latter be >created at the time of exporting. Is this sensible at this juncture? Or am >I overlooking something? In a previous note, I mentioned that the HyTime architecture provides a syntax for declaring what form of addressing is being used in a given instance. This facility lets you be 100% clear about how to interpret the values of attributes that you know are referential, lets generalized software manage documents that use a variety of addressing methods, and makes it clearer that single element types can use a variety of addressing methods.
The facility is the "reference location address" (refloc) facility (clause 7.8). It provides two attributes that can be used on any element:

- loctype (location type), which associates referential attributes with their addressing methods (notations)
- rflocsrc (reference location source), which associates pairs of referential attributes to indicate that one attribute addresses the location source for the other (e.g., book= and refid= as often used with DynaText).

With these two attributes, you can completely characterize the referential attributes used by a given element such that a generalized system that implements the refloc facility and implements the addressing methods used will be able to resolve the addresses.

XLink doesn't need to provide such a mechanism because it requires the use of exactly one addressing method: URIs and XPointers. However, in an environment where you want more flexibility, you do need this sort of facility.

For example, a generic linking element that can use any form of addressing, including, but not limited to, XPointers, can be declared like so:

The value of the loctype attribute is a list of tuples (either pairs or triples, depending) where the first token is an attribute name, the second token is a location type keyword as defined by the HyTime architecture, and the third token, if required, is a notation name (as shown in the next example). Thus, this value of the loctype attribute declares the href attribute to be, semantically, an "IDLOC", which means that the attribute is interpreted as though it had a value prescription of "IDREF" or "IDREFS". Other choices are "ENTLOC" (entity references), "TREELOC" (tree location address), and so on, corresponding to the built-in addressing methods of HyTime. You can also refer to non-HyTime-defined addressing methods, as shown below.

[Note that here the HyTime architecture is being used *only* for the loctype attribute. The GenericLink element is not declared as a HyTime hyperlink.
This is an example of using an architecture that provides a "global" attribute. The "hybrid" element form is HyTime's generic element that has no meaning other than HyTime-specific processing will be applied. Thus any element can be mapped to the hybrid form in order to associate it with the generic facilities of the HyTime architecture, such as the loctype attribute. You can also use this technique to characterize attributes named "ID" as being unique IDs because the HyTime architecture says an attribute named ID is an ID unless you say otherwise. Because all facilities of the HyTime architecture are optional, a processor can support just the loctype attribute and be a conforming HyTime application--support for loctype does not require or imply support for any other facilities of HyTime.] An instance of the GenericLink element type might look like this: ...
...
... See Chapter 1 ... To deliver this document as an XML document using XPointers, you would transform the declaration as follows: The QUERYLOC keyword indicates that the addressing method is some form of "query", that is, something not defined by SGML or HyTime. The QUERYLOC keyword must be followed by the name of the notation that governs the interpretation of the query value. In this case, I've declared a notation for XPointers (I made up the public ID as one has not yet been defined by W3C for XPointers as far as I know). Note that the base declaration of the href attribute has not changed, nor has the link type or the HyTime architectural mapping. We've only changed the declaration of how its value should be interpreted. The generated XML instance changes by transforming the ID reference into the equivalent XPointer specification: ...
...
... See Chapter 1 ...

Finally, note that the *semantic* of the reference is still defined by the processing application. The fact that an attribute is referential does not, in and of itself, make the element a hyperlink. Thus, the loctype attribute serves to declare that a given attribute is, in fact, referential and how to resolve the reference (interpret its value) but does not say *why* it's referential. For that you would need to appeal to some other mechanism, such as XLink:

If you understand the rules imposed by the XLink architecture, then you know that not only is href referential, but that its purpose is to address the remote resource of a hyperlink. You didn't know that second part before.

Notice that I've now combined the use of two independent architectures in one element. Processors that only understand XLink will ignore the loctype, linktype, and HyTime attributes. Processors that only understand HyTime will ignore the xml:link attribute, and processors that understand both will have redundant information about what the href attribute is all about (because it is implicitly a reference interpreted as an XPointer by XLink's rules and explicitly so as declared by the loctype attribute).

Cheers,

Eliot
--
W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
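
The loctype value format described in the message above (attribute name, location type keyword, and a notation name where the keyword requires one) can be sketched as a small parser. This is only an illustration under stated assumptions: the archive lost the actual declarations, so the sample values, the notation name "xpointer", and the helper names `LocType` and `parse_loctype` are hypothetical, and the sketch assumes whitespace-separated tokens with QUERYLOC as the only keyword that takes a notation name.

```python
from typing import List, NamedTuple, Optional

# Assumption: of the keywords named in the message (IDLOC, ENTLOC, TREELOC,
# QUERYLOC), only QUERYLOC is followed by a notation name.
KEYWORDS_NEEDING_NOTATION = {"QUERYLOC"}


class LocType(NamedTuple):
    """One entry of a loctype value (hypothetical helper type)."""
    attribute: str           # name of the referential attribute, e.g. "href"
    keyword: str             # location type keyword, e.g. "IDLOC"
    notation: Optional[str]  # notation name, only present for QUERYLOC


def parse_loctype(value: str) -> List[LocType]:
    """Split a loctype value into (attribute, keyword[, notation]) entries."""
    tokens = value.split()
    entries = []
    i = 0
    while i < len(tokens):
        if i + 1 >= len(tokens):
            raise ValueError(f"attribute {tokens[i]!r} has no location type keyword")
        attribute, keyword = tokens[i], tokens[i + 1].upper()
        i += 2
        notation = None
        if keyword in KEYWORDS_NEEDING_NOTATION:
            if i >= len(tokens):
                raise ValueError(f"{keyword} must be followed by a notation name")
            notation = tokens[i]
            i += 1
        entries.append(LocType(attribute, keyword, notation))
    return entries


# A pair for the ID-based form, a triple for the XPointer-based form.
print(parse_loctype("href IDLOC"))
print(parse_loctype("href QUERYLOC xpointer"))
```

A processor built along these lines would then dispatch on the keyword to decide how to resolve the attribute's value; why the attribute is referential remains the application's business, as the message says.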
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Apr 23 21:48:40 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:43 2004 Subject: SAX Level 2 (was 1998-04-20 Pre-Release...) In-Reply-To: <353F4D6D.E3464BF5@contentware.com> References: <014b01bd6e89$a52badc0$2ee044c6@donpark> Message-ID: <3.0.1.16.19980423194237.c84775ae@pop3.demon.co.uk> At 10:17 23/04/98 -0400, Ray Cromwell wrote: >Hi, I've been a lurker on this list for awhile, but I thought >I'd add my two cents. Thanks very much Ray. > >I identify with the need to keep SAX simple, but only because it helps >rapid adoption in the beginning in order to make it a defacto standard >(because it is easy for parser writers to implement). A huge interface >like JDBC would take an effort that many freeware authors wouldn't >embark on. So I think David has done Java/XML programmers a wonderful >service by organizing the effort. Fully agreed. I think that questions came up during the SAX process that no-one had anticipated at the start. I suspect that most people (like me) thought that SAX would be a 'simple' subset of the DOM and that the main effort was to determine where the cut would be. > > >However, I think there is a need for a ubiquitous, parser independent, >API that gives one complete control over their data structures, but >without any (or negligable) information loss. Larry Wall gave a >convincing presentation at XML98 as to how Perl will support XML which >made my mouth drool compared to the level of information I'm getting >now. 
In fact, I built my application around Lark instead of SAX because >I needed access to location offset information. I think others have answered this - the DOM is intended to represent the document without significant loss. [There may be discussion about the exact lexical input - e.g. was LF or CRLF used, etc.] The SAX experience has shown - I think - that a lot of issues get raised for the first time during the design process - such as exceptions and characters - and hopefully this will ease the DOM's creation. I agree fully with David that we shouldn't try to converge on where the DOM will develop to. It's worth noting that SAX makes things *enormously* easier for most application-writers. Yes, there may be some information loss, but many applications simply want to use the 'content' of the document. In those cases only about 5 methods are required. And I will certainly sleep happier knowing that the problems of Locale, etc. have been addressed even though I doubt I shall need them personally. > >There are a whole class of applications that are impossible to write >with SAX, namely, authoring tools, or any tools that need two-way >manipulation. Another class of applications need access to DTD >information. Yes. SAX was never intended to support the full power of authoring tools. > >Right now, it seems only IBM's XML for Java supports access to the DTD, >however, it does not give location offset information. Thus, it's >looking more and more like parser features are going to diverge, which >means SAX has two possible future scenarios: An exciting aspect of XML is that we move into new territory. How DTDs will be used will depend on the coming generation of XML authors. For some applications they are critical - for others there may be de facto approaches which avoid needing them. [Example. If everyone restricts the use of attname ID to refer to attributes of type ID, then even without the DTD this (implied) semantics might be supported by a wide range of tools. 
Some will find this horrible, others will see it as natural - applications will have to add lots of other semantics.] > [...] >Ok, now that I've started a flame war and gotten that off my chest :), >I'd like to nominate the three biggest features I'd like in SAX Level 2 >(or SAX2.0), in order of importance. No you haven't :-) We don't have flame wars on XML-DEV. If a topic looks like addressing a critical point we try to work out what needs to be done and whether XML-DEV is the right place for it. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Apr 23 21:48:45 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:43 2004 Subject: Inheritance in XML In-Reply-To: <199804230428.VAA19547@boethius.eng.sun.com> References: <01bd6c3a$c50f2910$020b0ac0@xerius> Message-ID: <3.0.1.16.19980423192012.c74fab34@pop3.demon.co.uk> At 21:28 22/04/98 -0700, Jon Bosak wrote: [...valuable contribution snipped...] > >The answer is, Yes, there are other people who think that it would >make sense to design an XML schema mechanism to handle issues like >what has been called "inheritance" in this discussion (not to mention >good old-fashioned data typing). 
The workings of a W3C committee can >be made public only at the discretion of the chair of the committee, >so I will put on my official XML WG Chairman hat and reveal unto ye >that the XML WG has officially requested that the job of defining a >schema language for XML documents be added to its charter. If >approved by the W3C Director, this work would certainly involve a >consideration of most of the issues raised in this discussion and >would include a close look not only at XML Data but also at other >proposed solutions to the same problem. I am extremely pleased to see this announcement and hope that XML-DEV can act supportively. I am 100% in favour of 'no semantics' in XML. The only concern I have about that philosophy is that the community might have many different unconnected approaches to developing semantics which would significantly decrease interoperability. Jon's announcement should help developers plan their timescales for creating their own semantics. There is little to be gained by doing it too early if the rest of the community is going to have a well-crafted, publicly developed approach in a few months. Like many others, I need some of the components of XML-data and so whatever I do at present I have designed to be 'throw-away' when the W3C deliberations start coalescing. It's worth remembering that this is probably the first global public exercise in developing semantics other than in specialised domains. There is a great deal to do - and the 'inheritance' discussion shows that it is a level of complexity tougher than syntax. Our discussions (and success) with SAX shows that it often takes repeated starts to get workable approaches. I am confident that the public discussion over the last year will prove very useful to those involved in coming up with formal suggestions. P. 
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Thu Apr 23 22:22:30 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:00:43 2004 Subject: Inheritance in XML [^*] References: <199804230428.VAA19547@boethius.eng.sun.com> <353F4751.1E76D25F@mecom.mixx.de> <353F94A5.C9798BD1@technologist.com> Message-ID: <353FA88D.21E70F1D@finetuning.com> > > What are two properties? Type and class? Did you shift from discussing > properties "concerning" type and class and type and class *as* properties. > > Can you also please defend the distinction between type and class? It > makes sense in object oriented programming languages (mostly for > performance reasons), but I don't know that there is any such distinction > in common usage or in most ontologies. I am prepared to be convinced > otherwise, but I think that the class/type distinction is specific to OOP > and is not useful except as an arbitrary distinction, to avoid confusion, > as it is used in the DSSSL spec. (node class, flow object class vs. > element type ... you could as easily reverse them and talk of element > classes and node types...) It was my understanding that they are all just different names for the same "things". At least they are for RDF, right? 
In fact, in the RDF Schema group, as far as the typing (classing) models were concerned, types and classes were exactly the same (so much so, in fact, that we went back and forth on which to call them and, conceptually anyway, the words became interchangeable -- resulting in a "type system" of classes :-). Also, the types/classes and their respective resources (which it is RDF's mission to describe) WERE classes and node types (that was my understanding anyway...).

Just when I thought I finally had a clear understanding of the above, suddenly I'm not so sure. Someone reassure me....please.

lisa

From mamster at webeasy.com Thu Apr 23 23:32:59 1998
From: mamster at webeasy.com (Michael Amster)
Date: Mon Jun 7 17:00:43 2004
Subject: SAX needs from our point of view
Message-ID:

Quoting Ray Cromwell:

>Ok, now that I've started a flame war and gotten that off my chest :),
>I'd like to nominate the three biggest features I'd like in SAX Level 2
>(or SAX2.0), in order of importance.
>1) access to DTD information
>2) comments, CDATA, and location information for Attributes
>3) sax.util classes that take an ElementFactory (which return DOM
>interfaces), and build a tree. (maybe Don Park would like to contribute
>this). IBM's XML for Java is a starting point, but it has the fatal flaw
>that the return values of the ElementFactory are not the DOM interfaces
>(such as Element or PI) but IBM base classes, like TXElement or PI,
>which means you are forced to inherit from TXElement instead of just
>implementing Element.
In our case, having embedded XML languages with our own language controlling flow of execution, we have a real need for an accurate reproduction of the XML elements parsed so they can be rewritten correctly. Specifically, the issue is important in distinguishing between text and CDATA. Let me illustrate with a simple example: This is just text When this is reported up from a SAX parser, we do not differentiate between text and the CDATA, but let's say that we want to output the subset of arbitrary XML back out from our DOM or other object structure: This is data with &references; which should not be parsed! This is just text Now you see that the CDATA will have all references made when it is reparsed. We really do want to preserve CDATA as different from text in SAX. I can live without comments and to some degree, I can even reduce the amount of DTD info available to me, but I hope that CDATA and text are reported differently through the interface. It should not substantially complicate things for parser writers or application developers if it is just a Document handler event. -MA ~-~-~-~-~-~-~-~-~-~-~-~-~-~-WEBEASY-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~ Michael Amster mamster@webeasy.com 4676 Admiralty Way, Suite 300 Tel: 310.576.0770 Marina Del Rey, CA 90292 Fax: 310.576.2011 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Apr 23 23:36:49 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:43 2004 Subject: Semantics (was Re: Inheritance in XML [^*]) References: <3.0.32.19980423115844.00a1a790@pop.intergate.bc.ca> Message-ID: <353FB46A.DA61D7FF@technologist.com> [last things first] Tim Bray wrote: > And finally... words are only of use in facilitating human > communication when there is some shared understanding as to their > denotation and connotation. The term "semantic", judged by this standard, > has clearly and empirically lost its usefulness in this discussion. I think that this discussion is important, because I remember how confused I was in the early days when I tried to understand this same distinction in SGML. "SGML doesn't supply semantics, just syntax." So ID/IDREF was "syntax", but HyTime links are "semantic." What??? XML has the same problem. This is not only confusing, but damaging. Suggestions for improvement can be dismissed: "that's not syntax, that's semantics" as if there were a clear line between the two, and as if XML wasn't already straddling the line (even if we decide that it is fuzzy). For example, as james points out, discussion of subtyping and inheritance is meaningless in a language with no semantics. But we seem to agree, now, that DTDs have a semantic, so we can stop beating that particular horse. Let me suggest this definition for semantic: a mapping from a syntactic feature to an abstraction. A language specified entirely in BNF does not have a semantic. A language specified at least in part in prose *might*. 
I argue that XML does:

Tim Bray wrote:
>
> Well, we just have a difference of perception. I think that
> "element", "element type", "notation", and so on are profoundly
> *syntactic* constructs. I think an element is a piece of an XML
> document that is bounded by tags;

Okay, let's work from that definition. This is an element:

<ABC FOO="DEF">foo</ABC>

Now I make an XSL rule (or DOM query, or XLL link) that works on "elements of type ABC" (according to the XSL spec.). Is the XSL spec. going to define how to get from the text above, which is syntactically an element, to an abstract object of type "ABC" with an attribute with name "FOO", value "DEF" and content "foo"? The fact that these other specs speak of "elements" and "element types" indicates that the people who make these specs. consider these things to have been defined not only syntactically, but as abstractions, in the XML REC.

In other words, the string above isn't just "in the language", or "out of it." It has a particular interpretation *under it*. It describes an abstraction. Who defines the mapping from the sequence of characters to the abstraction that these other specs work on? I say that the XML spec. defines this mapping, for two reasons:

#1. Everyone seems to think it does (including other people in the W3C, the editors, the people who invented SAX and so forth). Nobody is going around defining how to get from element syntax to element abstractions, so they must think that the job is already done.

#2. The XML spec. *itself* uses that abstraction. How else can XML check an element against the content model and attribute constraints defined in its "type"? I suppose that there is such a thing as a completely syntactic "abstraction" (e.g. Lisp S-Expr), but it's stretching it to claim that XML is defined this way when you take into account point #1. The abstractions "persist" after the document has been validated -- they are the result of the process.
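
The mapping argued for above, from a sequence of characters to an abstract element with a type, attributes and content, can be seen in a minimal sketch. It assumes the element is serialized as `<ABC FOO="DEF">foo</ABC>` (a rendering consistent with the description in the message) and uses Python's standard `xml.etree.ElementTree`, a present-day parser that is not mentioned anywhere in this thread; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The characters as they would appear in a document (assumed serialization).
text = '<ABC FOO="DEF">foo</ABC>'

# The parser hands back an object, not a string: the abstraction that
# stylesheets, queries and links would actually operate on.
element = ET.fromstring(text)

print(element.tag)     # element type name: ABC
print(element.attrib)  # attribute names and values: {'FOO': 'DEF'}
print(element.text)    # character content: foo
```

Nothing in the calling code says how tags delimit the element or how the attribute specification is tokenized; the parser supplies that mapping, which is the job the message says the XML spec. is generally assumed to have already done.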
Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Thu Apr 23 23:49:38 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:00:43 2004 Subject: Inheritance in XML In-Reply-To: <353F612A.C8613924@finetuning.com> (message from Lisa Rein on Thu, 23 Apr 1998 08:41:30 -0700) Message-ID: <199804232147.OAA19882@boethius.eng.sun.com> [Lisa Rein:] | Jon, I'm assuming I can mention the wonderful news in my story... | (please) Sure, but be careful -- the news is that the XML WG is asking to have the schema work added to our charter, not that our request has been approved. Jon xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Apr 23 23:56:07 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:43 2004 Subject: SAX needs from our point of view Message-ID: <3.0.32.19980423145437.015d71d0@pop.intergate.bc.ca> At 02:32 PM 4/23/98 -0700, Michael Amster wrote: > This is data with &references; which should not be parsed! 
> ]]>
> This is just text
>When this is reported up from a SAX parser, we do not differentiate between
>text and the CDATA, but let's say that we want to output the subset of
>arbitrary XML back out from our DOM or other object structure:

So, whenever you're going to output text into an XML document, and that text contains < or &, then you have to escape it. If you're going to output something and you *know* it's going to contain neither child elements nor references, there's nothing wrong with enclosing it in <![CDATA[ ... ]]> just to be sure.

CDATA sections in SAX are nowhere near the 80-20 point. Time to stop talking and ship. -T.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From wperry at fiduciary.com Fri Apr 24 00:02:46 1998
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun 7 17:00:43 2004
Subject: Inheritance in XML [^*]
References: <199804230428.VAA19547@boethius.eng.sun.com> <353F4751.1E76D25F@mecom.mixx.de> <353F94A5.C9798BD1@technologist.com> <353FA88D.21E70F1D@finetuning.com>
Message-ID: <353FB96D.96EF6E6@fiduciary.com>

Lisa Rein wrote:
> >
> > What are two properties? Type and class? Did you shift from discussing
> > properties "concerning" type and class and type and class *as* properties.
> >
> > Can you also please defend the distinction between type and class? It
> > makes sense in object oriented programming languages (mostly for
> > performance reasons), but I don't know that there is any such distinction
> > in common usage or in most ontologies.
I am prepared to be convinced > > otherwise, but I think that the class/type distinction is specific to OOP > > and is not useful except as an arbitrary distinction, to avoid confusion, > > as it is used in the DSSSL spec. (node class, flow object class vs. > > element type ... you could as easily reverse them and talk of element > > classes and node types...) > > It was my understanding that they are all just different names for the > same "things". At least they are for RDF, right? In fact, in the RDF > Schema group, as far as the typing (classing) models were concerned, > types and classes were exactly the same (so much so, in fact, that we > went back and forth on which to call them and, conceptually anyway, the > words became so interchangible -- Resulting in a "type system" of > classes :-). Also, the types/classes and their respective resources (of > which RDF's mission is to describe) WERE classes and node types (that > was my understanding anyway...). > > Just when I thought I finally had a clear understanding of the above, > suddenly I'm not so sure. Someone reassure me....please. Paul Prescod dealt with this accurately and conclusively in his 20 Apr post: Subject: Inheritance and subtyping in OO languages Date: Mon, 20 Apr 1998 09:20:12 -0400 From: Paul Prescod To: xml-dev I've found a good reference to the 8 year old paper that made the distinction between inheritance and subtyping most explicit. The paper itself is not online, but this summary is quite good: "[CCHO89] and [CoHC90] propose an approach based on explicit interfaces and interface containment. In this system of object interfaces, one type is considered a subtype of another if some subset of its interface is identical to that of the second. [...] Hence in this system class-based inheritance is strictly a reusability mechanism for sharing behaviour between objects, not to be confused with subtyping. 
For example two classes may be equivalent as types, though neither inherits anything from the other. So class hierarchies are not the same as type hierarchies, although they may overlap. Object interfaces [as in Java, C++, etc. - Paul] clarify this distinction between interface containment (subtyping) and class- based inheritance and give insight into limitations caused by equating the notions of type and class in many typed object-oriented programming languages [such as Simula 67 - Paul]." http://progwww.vub.ac.be/prog/persons/kimmens/research/Introduction-to-OO.html The paper itself is called: "Inheritance is not subtyping" and is quite famous, but unfortunately predates the Web. As Paul has pointed out, "class" is the looser term. The distinction which you acknowledge between class and type in OOP also exhibits this difference in objectivity and precision: a class is an essentially arbitrary collection of data and (data processing) methods. "Type" bespeaks the detached observation of a common characteristic. If the definition of that common characteristic is narrowed, certain examples which formerly met the criteria of that type will no longer, and will clearly have no place in a grouping of that type. With class, in C/S as in society, the criteria for inclusion are idiosyncratic and inconsistent. We can conclude only that a member may arguably belong to a class on the evidence of that member being included in at least one taxonomist's specification of that class. That said, the imprecise articulation of these two terms in discussion of the XML standard and its dependents reasonably mirrors the workable, if still inchoate, guidance which the standard offers, right now, for accomplishing useful work. Clearly the present standard will not support a useful mechanism of class inheritance. Yes, the reason it will not is rooted in imprecise definitions. Forget it for a while. There is plenty other work we can base on the existing standard right now. 
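
The distinction in the summary quoted above (subtyping as interface containment, class-based inheritance as a reuse mechanism) can be sketched in a few lines. This is a hedged illustration in Python, not code from the cited paper or from anyone on the list: the names Named, Element, Person, LoggedElement and describe are all hypothetical, and typing.Protocol is used here only as one convenient way to express "same interface, no shared ancestry".

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Named(Protocol):
    """A 'type' in the interface sense: anything that offers name()."""

    def name(self) -> str: ...


class Element:
    # Hypothetical class; it shares no base class with Person.
    def __init__(self, tag: str) -> None:
        self.tag = tag

    def name(self) -> str:
        return self.tag


class Person:
    # Unrelated to Element by inheritance, yet it offers the same interface.
    def __init__(self, first: str, last: str) -> None:
        self.first, self.last = first, last

    def name(self) -> str:
        return f"{self.first} {self.last}"


class LoggedElement(Element):
    # Inheritance used purely to reuse Element's implementation.
    def name(self) -> str:
        return "logged:" + super().name()


def describe(thing: Named) -> str:
    # Depends only on the interface, never on the class hierarchy.
    return thing.name().upper()


print(describe(Element("order")))
print(describe(Person("Jane", "Jones")))
print(describe(LoggedElement("order")))
```

Element and Person are equivalent as types here although neither inherits from the other, while LoggedElement is related to Element by inheritance; the two hierarchies overlap but are not the same thing.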
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Fri Apr 24 00:07:18 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:00:43 2004 Subject: Inheritance in XML [^*] In-Reply-To: <353F4751.1E76D25F@mecom.mixx.de> (message from james anderson on Thu, 23 Apr 1998 15:51:28 +0200) Message-ID: <199804232204.PAA19888@boethius.eng.sun.com> [James Anderson:] | much of the earlier parts of that posting [by Bosak], and most of this | thread, are, on the other hand, disheartening. there are two issues: | | the claims regarding "semantics": | the recommendation does, in deed, assert a semantic for xml documents. Before I can say anything, my past history as a copy editor compels me to state that there is no English noun "semantic." Sorry, I just had to point that out. It's evident from the postings by James Anderson and Paul Prescod that some people are taking the word "semantics" to mean something different from what I was responding to in the mention of the magazine article. The people who complain that XML does not have clearly defined semantics mean "semantics" the way I do -- as a synonym for "meaning". They are concerned that the meaning of a construction like Jones~Jane is not completely clear in the absence of a healthcare industry standard like HL7 PID. What I was saying is that the meaning of such a construction taken by itself is not just unclear, it really isn't there at all. I'm not saying anything one way or the other about "semantics" in the much more rarified sense used by some of the theorists in this thread. 
I'm only trying to clear up a lot of confusion running rampant out there among people who think that XML tag and attribute names have inherent semantics in the ordinary sense of the word. The theorists here already understand this; it's those other people throwing the word "semantics" around that I'm worried about. Jon xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From fmanola at objs.com Fri Apr 24 00:43:01 1998 From: fmanola at objs.com (Frank Manola) Date: Mon Jun 7 17:00:43 2004 Subject: Inheritance in XML [^*] In-Reply-To: <353FA88D.21E70F1D@finetuning.com> References: <199804230428.VAA19547@boethius.eng.sun.com> <353F4751.1E76D25F@mecom.mixx.de> <353F94A5.C9798BD1@technologist.com> Message-ID: At 1:46 PM -0700 4/23/98, Lisa Rein wrote: >> >> What are two properties? Type and class? Did you shift from discussing >> properties "concerning" type and class and type and class *as* properties. >> >> Can you also please defend the distinction between type and class? It >> makes sense in object oriented programming languages (mostly for >> performance reasons), but I don't know that there is any such distinction >> in common usage or in most ontologies. I am prepared to be convinced >> otherwise, but I think that the class/type distinction is specific to OOP >> and is not useful except as an arbitrary distinction, to avoid confusion, >> as it is used in the DSSSL spec. (node class, flow object class vs. >> element type ... you could as easily reverse them and talk of element >> classes and node types...) > >It was my understanding that they are all just different names for the >same "things". At least they are for RDF, right? 
In fact, in the RDF
>Schema group, as far as the typing (classing) models were concerned,
>types and classes were exactly the same (so much so, in fact, that we
>went back and forth on which to call them and, conceptually anyway, the
>words became so interchangible -- Resulting in a "type system" of
>classes :-). Also, the types/classes and their respective resources (of
>which RDF's mission is to describe) WERE classes and node types (that
>was my understanding anyway...).
>
>Just when I thought I finally had a clear understanding of the above,
>suddenly I'm not so sure. Someone reassure me....please.
>

Whether there's a distinction between "type" and "class" depends on the context. Loosely speaking, as well as in many contexts, they're generally considered synonymous. RDF uses them pretty much interchangeably.

As Paul suggests, the distinction is often made in the context of OOPLs. In this context, roughly speaking, the term "type" is often used to mean a protocol shared by a group of objects, while the term "class" is often used to mean an implementation shared by a group of objects. The same distinction can be important in other contexts (e.g., when talking about implementation aspects of distributed objects) when you need to distinguish between an interface that may be supported by a group of objects, as opposed to an implementation that may be shared by a group of objects. Some object analysis and design methodologies make this distinction as well.

However, given the various usages, it's not a good idea to count on this distinction unless you know that everyone accepts it. It's not really set in concrete. You might want to look at the descriptions of "type and class" in various object models in the X3H7 Object Model Features Matrix (it's Section 7) at http://www.objs.com/x3h7/fmindex.htm

--Frank

-----------------------------------------------------------------------
Frank Manola www: http://www.objs.com
Object Services and Consulting, Inc.
email: fmanola@objs.com 151 Tremont Street #22R voice: 617 426 9287 Boston, MA 02111 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wperry at fiduciary.com Fri Apr 24 00:45:34 1998 From: wperry at fiduciary.com (W. E. Perry) Date: Mon Jun 7 17:00:43 2004 Subject: Semantics (was Re: Inheritance in XML [^*]) References: <3.0.32.19980423115844.00a1a790@pop.intergate.bc.ca> Message-ID: <353FC301.D7141B0B@fiduciary.com> Tim Bray wrote: > At 01:15 PM 4/23/98 -0400, Paul Prescod wrote: > >If XML had no semantics, then XSL, XLL and the DOM would have to > >explicitly describe the mapping from syntactic features to the abstract > >nodes that they work on. But they do not, because XML has semantic > >concepts like "element, "element type", "notation" and "attribute" that > >are *described by* the syntax. > > Well, we just have a difference of perception. I think that > "element", "element type", "notation", and so on are profoundly > *syntactic* constructs. I think an element is a piece of an XML > document that is bounded by tags; an entity is a chunk of > text that is either provided literally or referred to via URL. It > is true that the spec provides operational rather than purely grammatical > descriptions of some aspects of things, but that is largely for > convenience. Dan Connolly has argued repeatedly and forcefully that > the spec could be completely re-written to avoid discussion of the > processor's actions (he is right) and that this would be an improvement > (I'm not convinced). 
> > The fact that the XML processor has a couple of required *behaviors*, > most notably error handling, does not constitute anything like > what I think of in connection with the term "semantic". > > I suppose you can argue that declaration in DTDs do have a semantic > of grammatical constraint. OK, granted. > > But in the instance, Elements and attributes don't mean anything in and > of themselves. They doubtless have semantics that are used by humans and > computer programs in particular application domains, but that's none of > our business. > > And finally... words are only of use in facilitating human > communication when there is some shared understanding as to their > denotation and connotation. The term "semantic", judged by this standard, > has clearly and empirically lost its usefulness in this discussion. > > But of this I am confident: elements, attributes, and entities don't > mean anything in and of themselves. -Tim I agree with your clearly-conveyed preference to avoid endless discussion of what the XML definition might have been. However, we can simply agree to do just that, without impugning the usefulness of "semantic" for the core discussion of this list. Elements, entities and attributes do have meaning: they are, precisely, sememes. The fundamental work for which we have chosen XML as our tool is the definition--either by empirical derivation or by prescription--of the sememic content of the elements entities and attributes which populate our documents. It is such definitions which we publish in XML by means of semantically significant context. The specific forms of that contextuality and, by necessary implication, the semantics which it serves are the very stuff defined by the XML standard. In short, every instance of an XML tag is, precisely, a semanteme. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cbullard at hiwaay.net Fri Apr 24 01:41:49 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:00:43 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
References: <3.0.32.19980423115844.00a1a790@pop.intergate.bc.ca>
Message-ID: <353FD15E.F82@hiwaay.net>

Tim Bray wrote:
>
> At 01:15 PM 4/23/98 -0400, Paul Prescod wrote:
> >If XML had no semantics, then XSL, XLL and the DOM would have to
> >explicitly describe the mapping from syntactic features to the abstract
> >nodes that they work on. But they do not, because XML has semantic
> >concepts like "element, "element type", "notation" and "attribute" that
> >are *described by* the syntax.
>
> Well, we just have a difference of perception.

Yes. OTOH...

> Dan Connolly has argued repeatedly and forcefully that
> the spec could be completely re-written to avoid discussion of the
> processor's actions (he is right) and that this would be an improvement
> (I'm not convinced).

That a spec can be rewritten to exclude information of use is not arguable. Yet contextually, the project was/is SGML On The WEB. There are two main requirements in that simple title:

a. SGML (done)
b. On The Web.

b is a systemic requirement. Systemic requirements require semantic information. Removing those from the spec unlimits the spec, but I think makes them much less useful. Failure to do this for SGML systems made SGML less useful, and arguably, less attractive to implement. Why? Friendly to Information Maintenance but not very good for system interoperability. The industry did not get enough cohesion among systems to get community reinforced growth. No amplification.

It depends on what one needs to improve. One can get hung up on information purity issues and not see the systemic problems of a standard framework. Isn't avoiding that mistake precisely what made HTML/HTTP work?

> The fact that the XML processor has a couple of required *behaviors*,
> most notably error handling, does not constitute anything like
> what I think of in connection with the term "semantic".

They are semantics/behaviors defined to make them useful.

> But in the instance, Elements and attributes don't mean anything in and
> of themselves. They doubtless have semantics that are used by humans and
> computer programs in particular application domains, but that's none of
> our business.

Yes they do. They are meant to be readable to humans and machines. This is a contractual semantic. The meaning is in the intent of the design. Say 'does this' and the semantic is there in the requirement. So?

> And finally... words are only of use in facilitating human
> communication when there is some shared understanding as to their
> denotation and connotation. The term "semantic", judged by this standard,
> has clearly and empirically lost its usefulness in this discussion.

Hmmm... no. But the semantics of markup systems (not markup) are limited to the agreed upon meanings. You have contractual semantics (design requirements) and systemic semantics (environment where requirements are testable).

> But of this I am confident: elements, attributes, and entities don't
> mean anything in and of themselves. -Tim

They mean what we agree to make them mean. Right now, we agree to make them readable to human systems and machine systems. XML is a system-centric standard. SGML is not. XSL, DOM, XLL are system applications. This does not mean they will not meet the loftier goals of information evolution, but they have to operate to do that. The pursuit of meaning is a human pursuit. Machines, like XML, don't care.

len

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pax at webeasy.com Fri Apr 24 01:54:02 1998 From: pax at webeasy.com (Pax Prakarsa) Date: Mon Jun 7 17:00:43 2004 Subject: Architectural Form References: <353E1F70.92C39520@homeaccount.com> <199804222115.RAA00251@unready.microstar.com> Message-ID: <353FD48E.AE748352@webeasy.com> Hi Dave: This is about "architectural form". The example below is from your book "Structuring XML documents" Please help me on the following points. 1. Is there any architectural engine already implemented out there ? 2. If I want to build one myself, what does it take to build it ? I suppose I can use any XML parser. (or can I ? validating/non validating parser ?). and then build a module that is capable to determine which are the architectural elements ? 3. A client document as in your example: page 302-303
Title Author ... ...
... ...
The elements in the above example which are architectural elements are: , , and , while the other elements are 'native' elements defined within the DTD for that document. My question is: can this client document still be validated using the "derived/client DTD", even though it contains a mix of 'architectural' elements and 'native' elements ? Thank you in advance. Pax xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pax at webeasy.com Fri Apr 24 02:15:11 1998 From: pax at webeasy.com (Pax Prakarsa) Date: Mon Jun 7 17:00:43 2004 Subject: Architectural Form References: <353E1F70.92C39520@homeaccount.com> <199804222115.RAA00251@unready.microstar.com> <353FD48E.AE748352@webeasy.com> Message-ID: <353FD98B.907491B3@webeasy.com> I am also wondering, where I can find more information on architectural form ? Pax Prakarsa xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Fri Apr 24 02:53:41 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:44 2004 Subject: Semantics (was Re: Inheritance in XML [^*]) In-Reply-To: <3.0.32.19980423115844.00a1a790@pop.intergate.bc.ca> References: <3.0.32.19980423115844.00a1a790@pop.intergate.bc.ca> Message-ID: <199804240050.UAA00361@unready.microstar.com> Tim Bray writes: > Well, we just have a difference of perception. 
I think that > "element", "element type", "notation", and so on are profoundly > *syntactic* constructs. It seems to me that semantics and syntax are fuzzy sets (like "tall" and "short") rather than crisp sets (like "greater than zero" or "less than zero"). In the SGML/XML world, we somehow know what we mean when we talk about "syntax" and "semantics", but as this discussion has shown, it's hard to quantify _how_ we know what we know, and in the end, it turns out that we have simply set an arbitrary boundary and silently agreed to enforce it. Both the location of that boundary and the very fact of its existence are also meaningful texts that some underpaid lecturer in cultural studies might want to pursue some day. If you all think that this is troubling, try reading Jacques Derrida on natural language and the act of writing (but please don't assume that I agree with him). Then, if you want to be reassured that the world is simple and quantifiable, go back and read some of Donald Knuth's friendly textbooks. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Fri Apr 24 02:57:48 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:00:44 2004 Subject: Nesting XML based languages and scripting languages Message-ID: <003e01bd6f1b$06f17710$2ee044c6@donpark> Jon, >I'm not at liberty to disclose the details of the XSL WG's current >design work. 
All I'm telling you is that you are not going to get a >very accurate idea of what XSL is capable of doing based on last >year's submission. You will get a much better picture in July. Let me first say that I do understand why details of XSL WG can not be disclosed. However, I have heard through the grapevine that there is a lot of XSL development activities at Microsoft with at least one project close to completion. My concern is this: are WG members allowed to share the details of WG activity with coworkers? If so, Microsoft and other members of the WG will have had 1 full year of lead time when the rest of us can see the next version of XSL spec. It is true that I have not contributed a dime to W3C while Microsoft and others have contributed significantly. What I am saying is this: closed WG like XSL WG should release interim specs more frequently (i.e. 3 months). I believe I have some probability of competing with a small MS team even if they were given 3 months of lead time (after all I do have the strength of ten men and can leap over reasonably tall structures after a good meal ;-). I do not believe I can if they have one year of lead time. Regards, Don Park http://www.docuverse.com/personal/index.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Fri Apr 24 03:06:38 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:44 2004 Subject: SAX needs from our point of view In-Reply-To: References: Message-ID: <199804240103.VAA00414@unready.microstar.com> Michael Amster writes: > In our case, having embedded XML languages with our own language > controlling flow of execution, we have a real need for an accurate > reproduction of the XML elements parsed so they can be rewritten > correctly. SAX reports all elements, together with character data, ignorable whitespace, and processing instructions, so you won't lose anything there. > Specifically, the issue is important in distinguishing between text and > CDATA. Let me illustrate with a simple example: > > > > > This is data with &references; which should not be parsed! > ]]> > > This is just text > > > > > When this is reported up from a SAX parser, we do not differentiate between > text and the CDATA, but let's say that we want to output the subset of > arbitrary XML back out from our DOM or other object structure: > > > This is data with &references; which should not be parsed! > > This is just text > Your output routine is wrong: it should automatically escape all instances of '&', '<', and '>': This is data with &references; which should not be parsed! This is just text or even This is data with &references; which should not be parsed! This is just text > Now you see that the CDATA will have all references made when it is > reparsed. We really do want to preserve CDATA as different from > text in SAX. 
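
The output rule at issue here (escape '&', '<' and '>' whenever character data is written back out, or alternatively wrap the data in a CDATA section) can be sketched as follows. This is a minimal sketch in Python rather than anything posted to the list: the helper names as_escaped_text, as_cdata and reparse are hypothetical, the wrapper element <t> exists only so the result can be reparsed, and xml.sax.saxutils.escape is the standard-library routine assumed for the escaping step.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def as_escaped_text(text: str) -> str:
    """Escape '&', '<' and '>' so the text survives reparsing unchanged."""
    return escape(text)


def as_cdata(text: str) -> str:
    """Wrap text in a CDATA section, splitting any ']]>' it contains."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def reparse(serialized: str) -> str:
    """Parse the serialized character data back inside a throwaway element."""
    return ET.fromstring("<t>" + serialized + "</t>").text


raw = "This is data with &references; which should not be parsed!"

print(as_escaped_text(raw))
print(as_cdata(raw))

# Both serializations hand the same character data back to the application.
print(reparse(as_escaped_text(raw)) == raw)
print(reparse(as_cdata(raw)) == raw)
```

Either form reparses to the same character data, which is why an application that only needs the data does not have to know which form the author used.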
If there's a semantic attached to your use of CDATA, you should represent it with an element (which is guaranteed to make it through processing):

Here is a listing: 1 < 2

There is no need for general XML processing tools _ever_ to know about CDATA sections; authoring and repository tools (including tools for authoring transforms) might want to preserve them, but those fall outside the target audience for SAX level 1.

Think of the analogy of C: the preprocessor takes care of surface things like macros and hides them from the compiler, which produces exactly the same object code for

  #define FOO 1
  printf("%d", FOO + FOO);

and

  printf("%d", 1 + 1);

All the best, and thanks for the comments,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tyler at infinet.com Fri Apr 24 03:22:41 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:00:44 2004
Subject: SAX needs from our point of view
References:
Message-ID: <353FEA6D.A1552448@infinet.com>

Michael Amster wrote:
> Quoting Ray Cromwell:
>
> >Ok, now that I've started a flame war and gotten that off my chest :),
> >I'd like to nominate the three biggest features I'd like in SAX Level 2
> >(or SAX2.0), in order of importance.
> >1) access to DTD information
> >2) comments, CDATA, and location information for Attributes
> >3) sax.util classes that take an ElementFactory (which return DOM
> >interfaces), and build a tree. (maybe Don Park would like to contribute
> >this).
IBM's XML for Java is a starting point, but it has the fatal flaw > >that the return values of the ElementFactory are not the DOM interfaces > >(such as Element or PI) but IBM base classes, like TXElement or PI, > >which means you are forced to inherit from TXElement instead of just > >implementing Element. > > In our case, having embedded XML languages with our own language > controlling flow of execution, we have a real need for an accurate > reproduction of the XML elements parsed so they can be rewritten correctly. > Specifically, the issue is important in distinguishing between text and > CDATA. Let me illustrate with a simple example: > > > > > This is data with &references; which should not be parsed! > ]]> > > This is just text > > > > > When this is reported up from a SAX parser, we do not differentiate between > text and the CDATA, but let's say that we want to output the subset of > arbitrary XML back out from our DOM or other object structure: > > > This is data with &references; which should not be parsed! > > This is just text > > > Now you see that the CDATA will have all references made when it is > reparsed. We really do want to preserve CDATA as different from text in > SAX. I can live without comments and to some degree, I can even reduce the > amount of DTD info available to me, but I hope that CDATA and text are > reported differently through the interface. It should not substantially > complicate things for parser writers or application developers if it is > just a Document handler event. > > -MA The solution I have found for the XMLReader (formatter) I have been working on is to scan each string of character content for any characters that need to be escaped with a CDATA section and embed that content in a CDATA section. This operation algorithmically is sort of expensive, but for the content I have had to format, the formatting process is still 5-10 times faster than the parsing process. Tyler xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Apr 24 03:36:15 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:44 2004 Subject: Nesting XML based languages and scripting languages Message-ID: <3.0.32.19980423183438.009274a0@pop.intergate.bc.ca> At 05:50 PM 4/23/98 -0700, Don Park wrote: >However, I have heard through the grapevine that there is a lot of XSL >development activities at Microsoft with at least one project close to >completion. Hmm; bear in mind that XSL is a year away, probably, from being a W3C Recommendation. Thus anybody who charges into implementation of a moving target is incurring some risk as well as potential benefit; at least one early-XML-adopter is right now feeling the pain of retrofitting case- sensitivity into running code with an installed base. Having said that, it's good that people charge ahead with implementation of pre-release specs because it tends to reveal problems that you just don't notice until you implement. And having said *that*, I think that anyone who implements a business-critical application based on an unfinished, unratified spec has shitferbrains. There have certainly been instances, inside W3C committees, where vendors have argued against changes on the grounds that they'd already implemented things; so far, such objections have generally failed to carry the day. On which subject, check out the "Status of this Document" language in the namespace working draft, elegantly authored by Ralph Swick and Ora Lassila for the RDF activity, and stolen by me for namespaces. Bottom line: yes, there are advantages to being inside the process, but early implementation is a two-edged sword. -T. 

From robin at ACADCOMP.SIL.ORG Fri Apr 24 03:39:30 1998
From: robin at ACADCOMP.SIL.ORG (Robin Cover)
Date: Mon Jun 7 17:00:44 2004
Subject: Architectural Form
Message-ID: <199804240144.UAA29842@ACADCOMP.SIL.ORG>

> Date: Thu, 23 Apr 1998 17:15:08 -0700
> From: Pax Prakarsa
>
> I am also wondering, where I can find more information on architectural form ?
>
> Pax Prakarsa

See: http://www.sil.org/sgml/topics.html#archForms
Architectural Forms and SGML/XML Architectures

Where, without having counted list items, I suspect you will find a majority of the links are to mini-tutorials and promotionals by Eliot Kimber. Should be sufficient.

-rcc

-------------------------------------------------------------------------
Robin Cover                     Email: robin@acadcomp.sil.org
6634 Sarah Drive
Dallas, TX 75236 USA            >>> The SGML/XML Web Page <<<
Tel: +1 (972) 296-1783 (h)      http://www.sil.org/sgml/sgml.html
Tel: +1 (972) 708-7346 (w)
FAX: +1 (972) 708-7380
=========================================================================

From papresco at technologist.com Fri Apr 24 03:56:32 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:44 2004
Subject: Inheritance in XML [^*]
References: <199804232204.PAA19888@boethius.eng.sun.com>
Message-ID: <353FCC47.54122231@technologist.com>

Jon Bosak wrote:
>
> The people who complain that XML does not have clearly defined semantics mean "semantics" the way I do -- as a synonym for "meaning". They are concerned that the meaning of a construction like
>
>    <PATIENT_NAME>Jones~Jane</PATIENT_NAME>
>
> is not completely clear in the absence of a healthcare industry standard like HL7 PID. What I was saying is that the meaning of such a construction taken by itself is not just unclear, it really isn't there at all.

Well, here I am off being the theorist again. Just like you, I use semantics and meaning interchangeably (at least I think I do). If you've taught an XML course recently, I'll bet you find yourself saying: "this syntax *means* that..." (e.g. "The start tag means that an element has begun.") That's because there are several levels of meaning.
You use this example:

    <PATIENT_NAME>Jones~Jane</PATIENT_NAME>

At the XML Spec level the text models an abstraction like this:

    OBJECT-TYPE=Element
    TYPE-NAME="PATIENT_NAME"
    ELEMENT-ATTRIBUTES={}
    ELEMENT-CONTENT="Jones~Jane"

At a level above that, (let's say HyTime), it might mean:

    OBJECT-TYPE=HyperLink
    LINK-TYPE=NameLoc
    TARGET-NAME="Jones~Jane"

At a level above that, (let's say HL7) it might mean:

    OBJECT-TYPE=Patient Reference
    TARGET={ a particular referenced element }

In a particular hospital, it might mean:

    OBJECT-TYPE=Patient Reference
    TARGET={ an object representing the database record described by the referenced element }

I know the XML WG went to a lot of work to try and leave semantics out, but if we can agree that XML specifies the first level of meaning, then we agree that XML specifies base-level semantics. As long as we agree on that, we might as well make those semantics precise in the next iteration, as SGML will do in its next iteration (after 10 years of trying to get away without it).

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog

http://www.wired.com/news/news/culture/story/10124.html
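
The first level in the list above, the abstraction an XML processor recovers from the text, can be sketched in Python. This is only an illustration under stated assumptions: it takes the element to be written as `<PATIENT_NAME>Jones~Jane</PATIENT_NAME>` (the tag name comes from TYPE-NAME in the message), it uses the standard-library ElementTree parser, and `xml_level_view` is a hypothetical helper whose dictionary keys simply mirror the labels used in the message.

```python
import xml.etree.ElementTree as ET


def xml_level_view(text):
    """Return the XML-spec-level abstraction of a single element.

    Hypothetical helper: the keys mirror the labels in the message,
    they are not part of any specification.
    """
    elem = ET.fromstring(text)
    return {
        "OBJECT-TYPE": "Element",
        "TYPE-NAME": elem.tag,
        "ELEMENT-ATTRIBUTES": dict(elem.attrib),
        "ELEMENT-CONTENT": elem.text or "",
    }


view = xml_level_view("<PATIENT_NAME>Jones~Jane</PATIENT_NAME>")
print(view)
```

Nothing at this level says that the content names a patient; the HyTime, HL7 and hospital readings would each need a further mapping layered on top of this structure.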

From ak117 at freenet.carleton.ca Fri Apr 24 04:03:04 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:44 2004
Subject: Architectural Form
In-Reply-To: <353FD48E.AE748352@webeasy.com>
References: <353E1F70.92C39520@homeaccount.com> <199804222115.RAA00251@unready.microstar.com> <353FD48E.AE748352@webeasy.com>
Message-ID: <199804240159.VAA00637@unready.microstar.com>

Pax Prakarsa writes:

> Hi Dave:
>
> This is about "architectural form". The example below is from your book "Structuring XML documents"
>
> Please help me on the following points.
> 1. Is there any architectural engine already implemented out there ?

Yes: James Clark's SP set of tools (including the Jade DSSSL engine) have architectural support:

http://www.jclark.com/

These don't yet, to my knowledge, support the PI-construct required for XML, but Eliot Kimber has just announced SP patches from ISOGEN that do just that.

> 2. If I want to build one myself, what does it take to build it ? I suppose I can use any XML parser. (or can I ? validating/non validating parser ?). and then build a module that is capable to determine which are the architectural elements ?

It depends on how deeply you want to go. You can certainly hack together something very useful with SAX (I am considering doing so myself some day), but you need to get at the DTD to do complicated stuff like layered architectures or defaulted values for architectural attributes.

> 3. A client document as in your example: page 302-303
>
>
> Title
> Author
> ...
> ...
>
>
> ...
> ...
>
>
> The elements in the above example which are architectural elements are: , , and , while the other elements are 'native' elements defined within the DTD for that document. My question is: can this client document still be validated using the "derived/client DTD", even though it contains a mix of 'architectural' elements and 'native' elements ?

Yes, it can -- the start and end tags of the non-architectural elements will be ignored, and you can control what happens to their data and element content.

All the best,

David

--
David Megginson                 ak117@freenet.carleton.ca
Microstar Software Ltd.         dmeggins@microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

From cbullard at hiwaay.net Fri Apr 24 04:17:30 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:00:44 2004
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)
References: <003e01bd6f1b$06f17710$2ee044c6@donpark>
Message-ID: <353FF5D2.785@hiwaay.net>

Don Park wrote:
>
> What I am saying is this: closed WG like XSL WG should release interim specs more frequently (i.e. 3 months). I believe I have some probability of competing with a small MS team even if they were given 3 months of lead time (after all I do have the strength of ten men and can leap over reasonably tall structures after a good meal ;-). I do not believe I can if they have one year of lead time.

This is an old gripe of mine about the XML process as conducted by this effort. It is not open.
That has troubled me from the beginning because it is an open effort to replace an open standard, SGML, with a closed standard, XML. It is a horrible precedent even if a successful one.

Because as demonstrated amply by WWW projects, running code does indeed out-colonize standards efforts, no one can deny the need for standards bodies to work with consortiums. It is now established practice. Still:

o Consortia are responsible to their members: companies.
o ISO is responsible to its members: countries.

IMO, ISO must be the party that insists on and ensures openness because I do not think consortia can to the degree which will satisfy your real and legitimate complaint.

VRML has a consortium, but the language standard is ISO.

o Consortia and list/volunteer labor. Ensures systemic applicability.

o ISO processes for drafting and approving authoritative language. Ensures contractual stability.

o All drafts posted to the web at all times. Anyone can read and anyone can contribute. Only a few people edit and ISO makes the rules for these people, not the consortia. Ensures openness and "a level playing field".

If people can't work inside that open a system, they should not be empowered to draft language, chair committees, or vote.

Len Bullard

From papresco at technologist.com Fri Apr 24 04:35:09 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:45 2004
Subject: The Economist on Semantics
Message-ID: <353FFA6B.752E6545@technologist.com>

"Whereas HTML has a set lexicon of about 90 tags, XML has an infinite one: authors of XML documents can invent their own tags.
The tag names, and what they mean, are left for the author to define depending on the subject matter. This sounds splendid -- but it presents a problem for browsers such as Netscape Navigator and Internet Explorer, which will need somehow to interpret all of these new tags. Thus each XML document must be provided with an appendix, known as the Document Type Definition (DTD), a kind of glossary containing information on the nature of the document's content, the tags used for various elements, as well as a listing of where in the document the tags occur and how they fit together."

Now we know. All semantics go in the DTD.

http://www.economist.com/editorial/freeforall/current/index_st4501.html?st.ne.fd.mnaw

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog

http://www.wired.com/news/news/culture/story/10124.html

From ray at guiworks.com Fri Apr 24 05:12:11 1998
From: ray at guiworks.com (Ray)
Date: Mon Jun 7 17:00:45 2004
Subject: test
Message-ID: <199804240311.VAA12570@coldsnap.guiworks.com>

Sorry for this test message, but my posts appear to be bouncing. Anyone else getting Majordomo errors on line 340 in the digest code?

:)

-Ray

From elm at arbortext.com Fri Apr 24 05:24:16 1998
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 17:00:45 2004
Subject: Nesting XML based languages and scripting languages
In-Reply-To: <98Apr23.213718edt.26881@thicket.arbortext.com>
Message-ID: <3.0.5.32.19980423232318.00bbe100@village.promanage-inc.com>

At 09:34 PM 4/23/98 -0400, Tim Bray wrote:
>At 05:50 PM 4/23/98 -0700, Don Park wrote:
>>However, I have heard through the grapevine that there is a lot of XSL development activities at Microsoft with at least one project close to completion.
>
>Hmm; bear in mind that XSL is a year away, probably, from being a W3C Recommendation. Thus anybody who charges into implementation of a moving target is incurring some risk as well as potential benefit; at least one early-XML-adopter is right now feeling the pain of retrofitting case-sensitivity into running code with an installed base.

And by the way, even though the XSL proposal was out there for a long while, the XSL W3C activity was chartered not too long ago, and they've only had a couple of meetings so far. I was pleased to see at WWW7 that they're actually working on a requirements document before doing things to the design -- this is a wonderful thing!

Eve

From ricko at allette.com.au Fri Apr 24 05:54:38 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:45 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
Message-ID: <005b01bd6f34$ee1bbe20$820b4ccb@NT.JELLIFFE.COM.AU>

From: Paul Prescod
>Let me suggest this definition for semantic: a mapping from a syntactic feature to an abstraction.

I think people get confused between the "semantics" of the markup language and the "semantics" of the element [type]s and attributes being marked-up.

I'll note the dictionary meaning of "semantic" as "relating to connotation of words". But in markup languages it has taken on a rather more specialized meaning, unfortunately: "relating to the connotation of (particular) instances of markup or their content". When people say "XML is not semantic-free" they are using "semantics" to include "the connotation of the delimiters" as well. Maybe it serves everyone right for using jargon.

So I think this is what Tim is saying with:

>> I think that "element", "element type", "notation", and so on are profoundly *syntactic* constructs. I think an element is a piece of an XML document that is bounded by tags;

They are "syntactic" in relation to the document being marked-up, but they are "semantic" in relation to the XML spec. It is because people make the transition between the two contexts that the term "semantic" becomes confusing.

Rick Jelliffe

From donpark at quake.net Fri Apr 24 06:11:55 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:45 2004
Subject: Nesting XML based languages and scripting languages
Message-ID: <001201bd6f36$2415fda0$2ee044c6@donpark>

Ray,

I like your suggestion of us having read-only access to the W3C member mailing list.

I find a two-edged sword quite harmless if you have a handle on it. W3C members do. The rest of us don't. My hands are bleeding from holding the wrong end of too many two-edged swords with my bare hands.

Regards,

Don Park
www.shitferbrain.com

From ricko at allette.com.au Fri Apr 24 06:17:16 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:45 2004
Subject: Inheritance in XML [^*]
Message-ID: <007001bd6f38$13da3080$820b4ccb@NT.JELLIFFE.COM.AU>

From: Frank Manola
>At 1:46 PM -0700 4/23/98, Lisa Rein wrote:
>>>
>>> What are two properties? Type and class
>Whether there's a distinction between "type" and "class" depends on the context. Loosely speaking, as well as in many contexts, they're generally considered synonymous

I think there is another consideration too. The word "type" has a specialized meaning in markup languages. The way it is typically used is to mean "a notation defined using some other notation".
In other words, when I say "this entity is a GIF file" then I am giving its notation, but when I say "this entity is a string described by the regular expression (abc)* " then I am giving its type.

When OO programmers see "document type" or "element type" perhaps they expect there to be some type-system in place (whether inheritance, classes, mixins, or even basic typedef aliasing). But they are, to some extent, being misled by the word "type". I don't think the SGML/XML literature makes this point explicit, unfortunately, but I think the usage is consistent.

Similarly, people get confused between "notation" and "encoding". For example, if I take the ISO 8601 date 1998-05-31 and encode it in Base64 and put it in my XML document, then there are four notations. The date is in ISO 8601 notation, the 8601 notation is in Base64 notation, and the Base64 is in XML notation, and the XML is in the notation of whichever character set (i.e. the charset "encoding") is used. These are not "types" because each notation is not being defined using the previous notation: it is the instance which is being encoded through successive notations (notionally).

Rick Jelliffe
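
The four notations in the date example can be walked through in code. The following is a minimal sketch, not anything from the original thread: the element name `date` and the choice of UTF-8 are assumptions made for illustration, and only Python standard-library calls are used.

```python
import base64
import xml.etree.ElementTree as ET
from datetime import date

# Notation 1: the instance written in ISO 8601 notation.
iso_text = date(1998, 5, 31).isoformat()

# Notation 2: the ISO 8601 text written in Base64 notation.
b64_text = base64.b64encode(iso_text.encode("ascii")).decode("ascii")

# Notation 3: the Base64 text written in XML notation.
# The element name "date" is hypothetical.
elem = ET.Element("date")
elem.text = b64_text
xml_text = ET.tostring(elem, encoding="unicode")

# Notation 4: the XML text written in a character encoding (UTF-8 assumed).
xml_bytes = xml_text.encode("utf-8")

# Undoing the notations in reverse order recovers the same instance.
decoded_b64 = ET.fromstring(xml_bytes.decode("utf-8")).text
recovered = date.fromisoformat(base64.b64decode(decoded_b64).decode("ascii"))
print(iso_text, recovered)
```

Each step re-encodes the same instance; none of the notations is defined in terms of the one before it, which is the distinction from "type" drawn above.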

From greyno at mcs.com Fri Apr 24 06:37:38 1998
From: greyno at mcs.com (Gregg Reynolds)
Date: Mon Jun 7 17:00:45 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
References: <3.0.32.19980423074329.00b25c10@pop.intergate.bc.ca> <353F774F.69CB0F59@technologist.com>
Message-ID: <3540096A.3A97@mcs.com>

Paul Prescod wrote:
>
> If XML had no semantics, then XSL, XLL and the DOM would have to explicitly describe the mapping from syntactic features to the abstract nodes that they work on. But they do not, because XML has semantic concepts like "element", "element type", "notation" and "attribute" that are *described by* the syntax.
>

Paul - Some questions/comments. I find your posts wonderful food for thought, but I'm not always sure I understand them. And be advised I don't do smileys, preferring the challenge of conveying wit and humor in English (much harder and funner than coding), but I don't always win my battles with the language, so please be a generous reader.

So, regarding the above quote:

Replace "element", "element type", "notation", etc, with "foo", "bar", "baz", etc. Do you still have semantics? The English words used by the spec happen to have commonly understood "meanings" which, to my eye, color the discussion in unfortunate ways. And what exactly is the semantics of "semantic concepts like ... that are *described by* the syntax"? Isn't that begging the question a little bit?

> Here's what a language with no semantics looks like:
>
> a -> b"q"c
> a -> ca
> b -> "d"
> c -> "e"
>
> Even given a parse tree, you can't do anything interesting with this language, because it has no semantics.
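
The four quoted productions are small enough to run. The following is a minimal sketch of a recursive-descent recognizer for them, assuming `a` is the start symbol; the function names and the tuple layout of the tree are invented for illustration. It yields a parse tree and nothing more, since the grammar assigns no meaning to the node labels.

```python
def parse_a(s, i=0):
    """Parse nonterminal `a` starting at index i.

    Returns (tree, next_index) on success, or None on failure.
    """
    # a -> c a, where c -> "e"
    if s.startswith("e", i):
        rest = parse_a(s, i + 1)
        if rest is None:
            return None
        subtree, j = rest
        return ("a", ("c", "e"), subtree), j
    # a -> b "q" c, where b -> "d" and c -> "e"
    if s.startswith("dqe", i):
        return ("a", ("b", "d"), "q", ("c", "e")), i + 3
    return None


def parse(s):
    """Return the parse tree for s, or None if s is not in the language."""
    result = parse_a(s)
    if result is not None and result[1] == len(s):
        return result[0]
    return None


print(parse("edqe"))
print(parse("dq"))
```

The accepted strings are any number of "e" characters followed by "dqe"; a tree comes back for each of them, but nothing in the productions says what a `b` or a `c` stands for.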

"Colorless green ideas sleep furiously." Chomsky, late 50s or thereabouts. Adj adj n v adv.

English has semantics. The quoted sentence has syntax, not semantics. Syntactically it's unremarkable. Semantically, it's pretty cool, but meaningless (in the ordinary sense of meaningless; in context it's profoundly meaningful, and even out of context it's not bad: try it out on somebody at your next cocktail party.) The fact that we can describe the sentence with "parts of speech" tags doesn't render it "meaningful". (Linguists: I'm vaguely aware that Chomsky's assertions about the fundamental difference between syntax and semantics have been challenged. True? Relevant?)

Am I missing your point, Paul? The text of the spec may have stuff regarding *how* things are supposed to work, but the syntax of it looks to me to be completely devoid of implication. Unless, that is, one gets really abstract and declares, "Oh! your language has structure! Therefore it carries some sort of implicit meaning, reflecting your Weltanschauung, not mine! Down with the oppressor!" (Sorry, got carried away. But the similarity with late (political) Chomsky is striking all the same.)

So I'm left wondering why we don't have formal definition for all this stuff. The editors of the standard look like a pretty impressive bunch, which leaves me all the more mystified as to why prose instead of a formal language. (I have in front of me a copy of "The Definition of Standard ML", of which I can make neither heads nor tails - but it looks *very* rigorous and formal.) Look at DSSSL: a wonderful, even beautiful, work of the imagination, written in . . . legalese? Do we really need "shall" in a technical spec?

--

From bckman at ix.netcom.com Fri Apr 24 08:09:41 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:45 2004
Subject: Nesting XML based languages and scripting languages
Message-ID: <01bd6f5f$e8c765c0$d8addccf@uspppBckman>

>Hmm; bear in mind that XSL is a year away, probably, from being a W3C Recommendation. Thus anybody who charges into implementation of a moving target is incurring some risk as well as potential benefit<<

What about us poor authors!! We have to write "knowledgeably" about a subject that doesn't even exist. Our books usually appear at about the same time as a spec which invalidates every thing we have written!!

Frank

-----Original Message-----
From: Tim Bray
To: xml-dev@ic.ac.uk
Date: Thursday, April 23, 1998 6:38 PM
Subject: Re: Nesting XML based languages and scripting languages

>At 05:50 PM 4/23/98 -0700, Don Park wrote:
>>However, I have heard through the grapevine that there is a lot of XSL development activities at Microsoft with at least one project close to completion.
>
>Hmm; bear in mind that XSL is a year away, probably, from being a W3C Recommendation. Thus anybody who charges into implementation of a moving target is incurring some risk as well as potential benefit; at least one early-XML-adopter is right now feeling the pain of retrofitting case-sensitivity into running code with an installed base.
>
>Having said that, it's good that people charge ahead with implementation of pre-release specs because it tends to reveal problems that you just don't notice until you implement.
>And having said *that*, I think that anyone who implements a business-critical application based on an unfinished, unratified spec has shitferbrains.
>
>There have certainly been instances, inside W3C committees, where vendors have argued against changes on the grounds that they'd already implemented things; so far, such objections have generally failed to carry the day.
>
>On which subject, check out the "Status of this Document" language in the namespace working draft, elegantly authored by Ralph Swick and Ora Lassila for the RDF activity, and stolen by me for namespaces.
>
>Bottom line: yes, there are advantages to being inside the process, but early implementation is a two-edged sword. -T.

From bckman at ix.netcom.com Fri Apr 24 08:09:46 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:45 2004
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)
Message-ID: <01bd6f60$4e4db160$d8addccf@uspppBckman>

Bravo!! Well said!!!

Frank

-----Original Message-----
From: len bullard
To: Don Park
Cc: xml-dev@ic.ac.uk
Date: Thursday, April 23, 1998 7:19 PM
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)

>Don Park wrote:
>>
>> What I am saying is this: closed WG like XSL WG should release interim specs more frequently (i.e. 3 months). I believe I have some probability of competing with a small MS team even if they were given 3 months of lead time (after all I do have the strength of ten men and can leap over reasonably tall structures after a good meal ;-). I do not believe I can if they have one year of lead time.
>
>This is an old gripe of mine about the XML process as conducted by this effort. It is not open. That has troubled me from the beginning because it is an open effort to replace an open standard, SGML, with a closed standard, XML. It is a horrible precedent even if a successful one.
>
>Because as demonstrated amply by WWW projects, running code does indeed out-colonize standards efforts, no one can deny the need for standards bodies to work with consortiums. It is now established practice. Still:
>
>o Consortia are responsible to their members: companies.
>o ISO is responsible to its members: countries.
>
>IMO, ISO must be the party that insists on and ensures openness because I do not think consortia can to the degree which will satisfy your real and legitimate complaint.
>
>VRML has a consortium, but the language standard is ISO.
>
>o Consortia and list/volunteer labor. Ensures systemic applicability.
>
>o ISO processes for drafting and approving authoritative language. Ensures contractual stability.
>
>o All drafts posted to the web at all times. Anyone can read and anyone can contribute. Only a few people edit and ISO makes the rules for these people, not the consortia. Ensures openness and "a level playing field".
>
>If people can't work inside that open a system, they should not be empowered to draft language, chair committees, or vote.
>
>Len Bullard

From Jon.Bosak at eng.Sun.COM Fri Apr 24 08:39:50 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:45 2004
Subject: Nesting XML based languages and scripting languages
In-Reply-To: <003e01bd6f1b$06f17710$2ee044c6@donpark>
Message-ID: <199804240637.XAA20100@boethius.eng.sun.com>

[Don Park:]

| Let me first say that I do understand why details of XSL WG can not be disclosed.
|
| However, I have heard through the grapevine that there is a lot of XSL development activities at Microsoft with at least one project close to completion. My concern is this: are WG members allowed to share the details of WG activity with coworkers?

Of course. An industry consortium's work belongs to the companies that fund the work. This is not new.

| If so, Microsoft and other members of the WG will have had 1 full year of lead time when the rest of us can see the next version of XSL spec. It is true that I have not contributed a dime to W3C while Microsoft and others have contributed significantly.

Exactly.
But you sure are in luck this time, because anyone who has implemented against what the submission looked like before the XSL WG formed in January is in for a peck of trouble.

| What I am saying is this: closed WG like XSL WG should release interim specs more frequently (i.e. 3 months).

Tsk. You didn't read the release schedule at the URL I posted, did you?

http://www.w3.org/Style/XSL

Jon

From papresco at technologist.com Fri Apr 24 08:42:43 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:45 2004
Subject: test
References: <199804240311.VAA12570@coldsnap.guiworks.com>
Message-ID: <354033DD.93A0048@technologist.com>

That happens when you post from the wrong account.

Paul Prescod

Ray wrote:
>
> Sorry for this test message, but my posts appear to be bouncing.
> Anyone else getting Majordomo errors on line 340 in the digest code?
>
> :)
>
> -Ray

--
Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization."
- Stewart Brand, founder of the Whole Earth Catalog

http://www.wired.com/news/news/culture/story/10124.html

From papresco at technologist.com Fri Apr 24 09:13:44 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:45 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
References: <005b01bd6f34$ee1bbe20$820b4ccb@NT.JELLIFFE.COM.AU>
Message-ID: <35403BB6.3461050@technologist.com>

Rick Jelliffe wrote:
>
> I think people get confused between the "semantics" of the markup language and the "semantics" of the element [type]s and attributes being marked-up.

That's right. One paper I read referred to "semantics" and "meta-semantics." That helps to clarify things somewhat, but it strikes me as the same kind of distinction as "language" and "meta-language." It is useful at first, but then when you try to come up with a precise definition you realize that there is none. HTML is a meta-language and XML is a language and vice versa. Similarly, as far as I can tell, what we today call a semantic could be a meta-semantic tomorrow, if it gets adopted in XML (e.g. the namespaces proposal).

So perhaps the best thing is to talk about the semantics of the markup (as defined in the markup language) and the semantics of the marked up document (as defined in the documentation for the document type).

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization."
- Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From papresco at technologist.com Fri Apr 24 09:16:02 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:45 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
References: <3.0.32.19980423074329.00b25c10@pop.intergate.bc.ca> <353F774F.69CB0F59@technologist.com> <3540096A.3A97@mcs.com>
Message-ID: <35403BDD.2E8F85ED@technologist.com>

Gregg Reynolds wrote:
>
> Replace "element", "element type", "notation", etc, with "foo", "bar",
> "baz", etc. Do you still have semantics? The English words used by the
> spec happen to have commonly understood "meanings" which, to my eye,
> color the discussion in unfortunate ways.

Good question. If the words were replaced with arbitrary strings of letters, I believe that the XML REC would still have semantics.

> And what exactly is the
> semantics of "semantic concepts like ... that are *described by* the
> syntax"? Isn't that begging the question a little bit?

It would be, except that everybody *depends on* these semantics. The body that "owns" XML (W3C) is producing other specs. left, right and center based on the described abstractions. For example, the DOM couldn't give two farts about the syntax of a document. It moves seamlessly between HTML syntax, XML syntax and could easily handle SGML (and probably or VRML, or even PDF) syntax too. It cares about the abstract structure -- the tree of attributed-elements described by an XML document. If XML has no semantics, then how can it describe an abstract tree?
If it doesn't describe a tree, then what the heck is the DOM based on?

So I'm convinced that the XML WG believes (unknowingly!) that XML has semantics even as they deny it. The concrete step that they could take to prove that I am wrong is to require the DOM to be defined in terms of XML's syntax instead of the tree abstraction.

> "Colorless green ideas sleep furiously." Chomsky, late 50s or
> thereabouts. Adj adj n v adv.
> English has semantics. The quoted sentence has syntax, not semantics.

Right: it doesn't mean anything. But an XML document does mean something: it is a linearization of an attributed element tree. If it can be interpreted as NOT a linearization of this abstraction, then the DOM rather falls apart. And it isn't just the DOM: XLL, MathML, SAX etc. have the same problem. They are all defined in terms of the abstraction, not the syntax. You can't both depend on the abstraction and claim it doesn't exist.

> So I'm left wondering why we don't have formal definition for all this
> stuff. The editors of the standard look like a pretty impressive bunch,
> which leaves me all the more mystified as to why prose instead of a
> formal language.

It's a W3C standard. Look at HTML 4.0 and tell us about "prose." Unofficially, W3C standards are intended to be partially tutorials as well as specifications. On the other hand, Dan Connolly pushed harder for formality than anyone, so it is probably more the "web community" that drives this than the W3C staff.

I think we could have done better in this particular area without going completely over to formal notation. As David M. pointed out to me in an off-line conversation, the REC is very explicit that a processor must pass whitespace to an application, but doesn't say that it must pass other character data along! I attribute this to a half-hearted attempt to leave semantics out.
Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization."
- Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From James.Anderson at mecom.mixx.de Fri Apr 24 09:28:25 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:45 2004
Subject: Inheritance in XML [->CoHC90]
References: <199804230428.VAA19547@boethius.eng.sun.com> <353F4751.1E76D25F@mecom.mixx.de> <353F94A5.C9798BD1@technologist.com> <353FA88D.21E70F1D@finetuning.com> <353FB96D.96EF6E6@fiduciary.com>
Message-ID: <35403F62.E880CE00@mecom.mixx.de>

the referenced citation (in "Introduction-to-OO.html") is incomplete. the "ACM" should be taken to mean "17th ann acm symp on principles of programming languages, january 1990"

...

W. E. Perry wrote:
>
> Paul Prescod dealt with this accurately and conclusively in his 20 Apr post:
>
> ...
> http://progwww.vub.ac.be/prog/persons/kimmens/research/Introduction-to-OO.html
>
> The paper itself is called: "Inheritance is not subtyping" and is quite
> famous, but unfortunately predates the Web.

From James.Anderson at mecom.mixx.de Fri Apr 24 09:35:26 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:45 2004
Subject: Inheritance in XML [->CHC90]
References: <199804230428.VAA19547@boethius.eng.sun.com> <353F4751.1E76D25F@mecom.mixx.de> <353F94A5.C9798BD1@technologist.com> <353FA88D.21E70F1D@finetuning.com> <353FB96D.96EF6E6@fiduciary.com>
Message-ID: <354040A1.68A8D370@mecom.mixx.de>

the referenced citation is incomplete. the "ACM" should be "17th ann acm symp on principles of programming languages, january 1990"

(apologies, in advance, if this post is a duplicate)

W. E. Perry wrote:
>
> ...
> Paul Prescod dealt with this accurately and conclusively in his 20 Apr post:
> ...
> I've found a good reference to the 8 year old paper that made the
> distinction between inheritance and subtyping most explicit. The paper
> itself is not online, but this summary is quite good: ...
> http://progwww.vub.ac.be/prog/persons/kimmens/research/Introduction-to-OO.html
>
> The paper itself is called: "Inheritance is not subtyping" and is quite
> famous, but unfortunately predates the Web.

From digitome at iol.ie Fri Apr 24 12:50:12 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 17:00:45 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
Message-ID: <199804241049.LAA29091@GPO.iol.ie>

[David Megginson]
>
>It seems to me that semantics and syntax are fuzzy sets (like "tall"
>and "short") rather than crisp sets (like "greater than zero" or "less
>than zero").
>
>In the SGML/XML world, we somehow know what we mean when we talk about
>"syntax" and "semantics", but as this discussion has shown, it's hard
>to quantify _how_ we know what we know, and in the end, it turns out
>that we have simply set an arbitrary boundary and silently agreed to
>enforce it.

Yes. The same probably goes for every other field that has to give a syntactic form to an abstraction. Math for example. This comment is from section 1.4 of the MathML spec:-

"The relationship between a mathematical notation and a mathematical idea is subtle and deep. On a formal level, the results of mathematical logic raise profound and unsettling questions about the correspondence between symbolic logic systems and the phenomena they model."

I read this to mean that syntax and semantics are interwoven at a level of intricacy beyond most of us. Certainly beyond me:-)

From ak117 at freenet.carleton.ca Fri Apr 24 13:26:13 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:45 2004
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)
In-Reply-To: <01bd6f60$4e4db160$d8addccf@uspppBckman>
References: <01bd6f60$4e4db160$d8addccf@uspppBckman>
Message-ID: <199804241122.HAA00254@unready.microstar.com>

Len Bullard writes:
> >This is an old gripe of mine about the XML process as conducted by
> >this effort. It is not open. That has troubled me from the
> >beginning because it is an open effort to replace an open
> >standard, SGML, with a closed standard, XML. It is a horrible
> >precedent even if a successful one.

Frank Boumphrey writes:
> Bravo!! Well said!!!

Can I download a free copy of the ISO 8879:1986 spec?

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From papresco at technologist.com Fri Apr 24 13:39:21 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:46 2004
Subject: XML-Data, "&" and inheritance
Message-ID: <354079D3.DEB07DC9@technologist.com>

In reviewing XML-Data for another project, I note that the XML-Data "subclass" mechanism depends on the XML-Data equivalent of the ampersand operator that was removed from XML. I'm not convinced that putting that operator back in was a good idea. It was left out of XML because it complicates implementation.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization."
- Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From peter at ursus.demon.co.uk Fri Apr 24 14:00:35 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:46 2004
Subject: XML, MVC and Swing (was Re: better searching!!)
In-Reply-To: <000301bd6dc6$0d7c6410$0101a8c0@darthvader.mm>
References: <3.0.5.32.19980421091920.00b52bd0@scripting.com>
Message-ID: <3.0.1.16.19980424114341.45d78b78@pop3.demon.co.uk>

[crossposted to XML-DEV because I think the software aspects are relevant to current endeavours].

At 11:10 22/04/98 +0300, Alexander Evsukov wrote:
[...]
>I guess that this approach is also close to the Model/View/Controller
>architecture. With application to XML, it looks like XML plays a role of a
>Model (data to be displayed), XSL determines a View (how data should be
>rendered) and XML browser acts as a Controller. Right?

I think this is a useful analogy, though I'm not an expert on MVC. I have been cutting my teeth on SwingSet, Javasoft's classes for building graphical applications and these are based on MVC. [There is a lot to learn about!]. I have implemented JUMBO2 using SwingSet MVC and an example is the following:

An XML document is tree-structured and/or eventStream-structured. If we take the first, we can read an XML document via SAX and build a tree using the Swing classes. I subclass DefaultMutableTreeNode (not as bad as it sounds!) to create an XNode which holds data for any ELEMENT. Each child is added by something like:

    XNode fooNode = new XNode("FOO");
    XNode barNode = new XNode("BAR");
    fooNode.add(barNode);

and so on. This builds a tree (I also treat PIs and PCDATA as special subclasses of XNode). To display it I create a SAXTree (subclassed from JTree), with something like

    SAXTree tree = new SAXTree(fooNode);
    tree.display();    // display routine lifted from Swing examples

Note that JTree is a View of a treeModel (fooNode and children). There could be several displays of the treeModel, including eventStream-based ones (haven't hacked that yet :-). [BTW the SwingSet text model says it is SGML-oriented, so well worth a look].

the SAXTree displays...

If I click on "FOO" the standard behaviour is to display its children and the tree expands.
This does not affect the model. If I delete the barNode from the treeModel, a signal is passed to the JTree which automatically updates the display to reflect the disappearance of the node from the model. If I had an eventstream-based view (e.g. text) then it would be simultaneously updated. Very powerful.

So I think your analogy is very close. There is a very strong mapping from XML to Java, and the SwingSet appears to take this even further. As I said, I'm still exploring this, but it's much better than writing all my own code (as in JUMBO1).

Obviously I don't know how the stylesheets will be implemented (see Jon Bosak's announcement on XML-DEV that XSL will look very different from the last draft). What I do in JUMBO2 is allow some manually switchable heuristics for different ways of rendering. Thus an XML document can be rendered in a number of tree- and stream-based forms (although the full MVC power is not used yet).

All JUMBO2 code will be publicly released in a few days if anyone is interested. Any communal help with learning SwingSet would be appreciated :-) - there are 50 classes alone for managing Trees!

P.

>
>---
>Sincerely,
>Alexander Evsukov (a.k.a. Sanders)
>M&M Data Systems, Ukraine
>mailto:sanders@mmdata.kharkov.ua
>< Standard Disclaimer >
>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
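
The model/view behaviour described in the Swing message above can be sketched roughly as follows. This is only an illustration under stated assumptions: XNode here is a hypothetical stand-in for the class of the same name in the post (JUMBO2's own code is not shown in this archive), the current javax.swing package names are used rather than those of the 1998 SwingSet release, and a plain TreeModelListener takes the place of a JTree so that the example runs without a display. It shows that removing BAR through the tree model notifies registered listeners, which is the kind of signal a JTree view relies on to refresh itself.

```java
import javax.swing.event.TreeModelEvent;
import javax.swing.event.TreeModelListener;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;
import java.util.ArrayList;
import java.util.List;

public class TreeModelSketch {

    /** Hypothetical stand-in for the XNode class mentioned in the post. */
    static class XNode extends DefaultMutableTreeNode {
        XNode(String elementName) {
            super(elementName); // the element name is stored as the user object
        }
    }

    /** Builds FOO -> BAR, removes BAR via the model, returns what a listener saw. */
    static List<String> demo() {
        XNode fooNode = new XNode("FOO");
        XNode barNode = new XNode("BAR");
        fooNode.add(barNode);

        // The model wraps the node tree; views observe the model, not the nodes.
        DefaultTreeModel model = new DefaultTreeModel(fooNode);
        List<String> log = new ArrayList<>();

        // A view would register a listener like this one; here we only record events.
        model.addTreeModelListener(new TreeModelListener() {
            public void treeNodesChanged(TreeModelEvent e) { }
            public void treeNodesInserted(TreeModelEvent e) { }
            public void treeStructureChanged(TreeModelEvent e) { }
            public void treeNodesRemoved(TreeModelEvent e) {
                for (Object child : e.getChildren()) {
                    log.add("removed: " + child);
                }
            }
        });

        // Removing through the model (not directly on the node) fires the event.
        model.removeNodeFromParent(barNode);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Removing the node with fooNode.remove(barNode) instead would change the data but bypass the model, so no listener would hear about it; going through the model is what keeps several views in step.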

From papresco at technologist.com Fri Apr 24 14:01:38 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:46 2004
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)
References: <01bd6f60$4e4db160$d8addccf@uspppBckman> <199804241122.HAA00254@unready.microstar.com>
Message-ID: <35407EEB.C093BB6A@technologist.com>

David Megginson wrote:
>
> Can I download a free copy of the ISO 8879:1986 spec?

No, but remember that the last four digits of that spec's name indicate that it predates the web by almost 10 years. You can get HyTime, DSSSL and newer SGML TCs on the Web.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization."
- Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From peter at ursus.demon.co.uk Fri Apr 24 14:19:32 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:46 2004
Subject: Semantics
In-Reply-To: <35403BDD.2E8F85ED@technologist.com>
References: <3.0.32.19980423074329.00b25c10@pop.intergate.bc.ca> <353F774F.69CB0F59@technologist.com> <3540096A.3A97@mcs.com>
Message-ID: <3.0.1.16.19980424112003.38ef0fe2@pop3.demon.co.uk>

At 03:14 24/04/98 -0400, Paul Prescod wrote:
[...]
>based on the described abstractions. For example, the DOM couldn't give
>two farts about the syntax of a document. It moves seamlessly between HTML
     ^^^^^
I assume this word is semantically void.

>syntax, XML syntax and could easily handle SGML (and probably or VRML, or
>even PDF) syntax too. It cares about the abstract structure -- the tree of
>attributed-elements described by an XML document. If XML has no semantics,
>then how can it describe an abstract tree? If it doesn't describe a tree,
>then what the heck is the DOM based on?

There seem to me to be at least two types of 'semantics'. One is what meaning is attached to , for which the XML community is trying to work out mechanisms. These include:

- stylesheets for adding carbon to cellulose for humans. (I hope this is extended in the XSL process to include things other than paper and pseudo-paper rendering, such as interactive processes).
- mapping to algorithms (Java classes, ECMAScript, etc.)
- linking to human- or machine-readable resources such as glossaries and data dictionaries
- architectural forms

I think we all agree on the need to develop communal mechanism for this.
The other type of semantics is how the words and phrases in the XML specs should be implemented in software or precise documents. The less clearly defined the semantics are, the more variation is possible when humans try to do this implementation. This is one reason why I have constantly raised the problem of implied semantics in the spec and urged that we develop software which implements them.

Part of the reason why things like interfaces take so much effort is that it involves semantics as well as many other aspects. Whilst I suspect this has not been a major problem in SAX, it certainly is when we come to what an 'XML processor' and an 'XML application' do. For example, I doubt if we all agree on when a processor is required to validate a document, and what is done with the result. And I expect that we shall see an increasing number of commandline switches or menu items in XML software to allow humans to vary the semantics.

For example, JUMBO2 reads and displays all whitespace. If there is no DTD then no whitespace is 'ignorable'. If an element's content consists only of whitespace then it's often 'obvious' to a human that this whitespace is not required for humans. IMO it would be wrong to remove this automatically, so I have a menu switch 'Delete Whitespace', which finds all such occurrences and deletes them [either locally for selected nodes, or globally].

Wherever we can agree on and make available an algorithm that encodes our semantics (or offers a choice between acceptable views) this is worth highlighting and systematising for else we can descend into semantic incompatibility.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
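
As a rough illustration of the 'Delete Whitespace' switch described in the message above, here is a minimal Java sketch. It is not JUMBO2's code, which is not shown in this archive: the class and method names are hypothetical, it uses the standard JAXP and org.w3c.dom interfaces, and it removes every whitespace-only text node under a given node, which may be a broader heuristic than the one the post has in mind. Calling it on the document element corresponds to the "global" case, and calling it on a single element to the "local" one.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.StringReader;

public class WhitespaceSketch {

    /**
     * Removes text nodes below 'node' that contain only whitespace and
     * returns how many were removed. Note: String.trim() also strips
     * control characters, a slightly wider set than XML's whitespace.
     */
    static int deleteWhitespace(Node node) {
        int removed = 0;
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
                removed++;
            } else {
                removed += deleteWhitespace(child);
            }
            child = next;
        }
        return removed;
    }

    /** Parses a string without a DTD, so the parser keeps all whitespace. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<FOO>\n  <BAR>text</BAR>\n  <BAZ> </BAZ>\n</FOO>");
        System.out.println(deleteWhitespace(doc.getDocumentElement()));
    }
}
```

Keeping this as an explicit call rather than something done during parsing matches the point made in the post: the document is read with all its whitespace, and a human decides when to discard it.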

From SimonStL at classic.msn.com Fri Apr 24 15:22:19 1998
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 17:00:46 2004
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)
Message-ID:

Len Bullard suggested:

>o All drafts posted to the web at all times. Anyone can
> read and anyone can contribute. Only a few people edit
> and ISO makes the rules for these people, not the consortia.
> Ensures openness and "a level playing field".

Frank Boumphrey added:

>What about us poor authors!! We have to write "knowledgeably" about a
>subject that doesn't even exist. Our books usually appear at about the same
>time as a spec which invalidates every thing we have written!!

While I sympathize with everyone's impatience, and have lived Frank's 'poor authors' issue repeatedly, I would hesitate to change the XML process dramatically at this point. The discussions on this list in the past few days about 'semantics' alone have shown once again the kinds of rocks on which this kind of project may founder if it opens up too widely. XML-Dev would probably be a much louder list than it is if people felt their comments would have a direct impact on the standard, instead of the informal listening that (I think) does go on here. I'm not sure all of that loud would be useful or productive.

This is a significant turnaround in my thinking. When I was working on XML: A Primer, I even briefly contemplated joining the W3C (as a sole proprietor, if necessary) before choking on the cost.
That prior access to standards would have been very useful, and I'm surprised there aren't more publishers who are members. Waiting for new recommendations to arrive was something of an agony, especially as they were often barely announced or announced late. Reading the proposed recommendation and the W3C recommendation was especially scary, as the book was heading to the printer shortly after the PR arrived. Fortunately, it wasn't too bad.

Looking back on last year, for all the damage it may have done to my stomach lining, I think the WG did an excellent job of announcing significant changes and releasing specs in a timely manner. The occasional announcements to this list, such as the one on case-sensitivity, made it possible to keep the book (my 'implementation') in sync with the standard throughout the development process. It's not 100% perfect (an RMD slipped through, and I couldn't get xml:lang or xml:space into the book in time - see http://members.aol.com/simonstl/xmlupdate if you need the fixes so far), but it still feels really good. The changes to Linking and XPointers will make Chapter 10 come unstuck, but I'm glad to see new activity.

The other important tack I took was conservatism in my 'implementation'. With a title like 'A Primer', there was no need to cover every possibility. I gave XSL a sidebar and a general description, and focused more on CSS. The XML coverage in the book focuses on syntax, the core standard that is a lot more stable than the rest. I'd like very much to cover XSL in detail - but it'll have to wait for another edition, when the specification is much more stable. It was important to get the book out to the waiting crowds, but I think I managed to balance my publisher's promised ship dates (and mine) with the reality of XML's development.

The W3C has (finally) gotten itself ahead of the implementors again. I'd much rather have them in the lead than trailing behind, as was the case for a number of years on the HTML standard.
Staying ahead of the implementors is going to take a lot of discipline and (probably) a lot of closed-door discussion. For now at least, I think that's a reasonable price to pay for standards as useful as these. I don't think that process will last forever, but for now it seems sensible.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From siping.liu at gte.com Fri Apr 24 15:34:56 1998
From: siping.liu at gte.com (Siping Liu)
Date: Mon Jun 7 17:00:46 2004
Subject: Repost: DOM XML Level 1 question
References: <353F8863.EAD6E79A@gte.com>
Message-ID: <354094A6.C8C0A979@gte.com>

Hi there,

I posted this question yesterday and have yet to get any reply. Here it goes again. If you know the answer please help.

Siping Liu wrote:

> Hi,
>
> I'm trying to implement the DOM interface on the listener side of the
> SAX interface. The DOM Core Level 1 interface looks clear enough to me.
> But I get confused by the definition in DOM XML Level 1, especially
> "XMLNode" -- why isn't it derived from "Node"? Can I traverse a tree
> with this interface and how? Why does it have this
> "getEntityReference()" method, I don't see the concept of having an
> "entity reference" attached to each node (such as an element) in XML
> spec.
>
> Thanks for your help.
> Siping Liu
> siping.liu@gte.com

From matthew at praxis.cz Fri Apr 24 17:40:33 1998
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:00:46 2004
Subject: Open Standards Processes
Message-ID: <01bd6f97$541355a0$020b0ac0@xerius>

Simon,

Your argument is convincing, but doesn't explain why open access is not given to works-in-progress for consultation by interested parties (i.e. read-only access). I appreciate the need of the W3C to avoid involving too many chefs in cooking up its standards, for exactly the reasons you mention. I also appreciate the need of the organization to finance its activities. However, the pricing scheme is pretty unfair. A company with $49 million in revenue can join as an affiliate member for about 0.01% of revenues (and the fee for full membership is pretty insignificant for the Microsofts and IBMs of the world), whereas for, say, a small Web startup in Prague the affiliate membership fee represents a few months' salary for the average programmer (life is cheap out here...).

Anyone interested in setting up a corporation whose only purpose is to join the W3C and "hire" interested individuals for a reasonable fee? (evil :-).

Matthew

-----Original Message-----
From: Simon St.Laurent
To: xml-dev@ic.ac.uk
Date: Friday, April 24, 1998 3:26 PM
Subject: RE: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)

>Len Bullard suggested:
>
>>o All drafts posted to the web at all times. Anyone can
>> read and anyone can contribute. Only a few people edit
>> and ISO makes the rules for these people, not the consortia.
>> Ensures openness and "a level playing field".
>
>Frank Boumphrey added:
>
>>What about us poor authors!! We have to write "knowledgeably" about a
>>subject that doesn't even exist. Our books usually appear at about the same
>>time as a spec which invalidates every thing we have written!!
>
>While I sympathize with everyone's impatience, and have lived Frank's 'poor
>authors' issue repeatedly, I would hesitate to change the XML process
>dramatically at this point. The discussions on this list in the past few days
>about 'semantics' alone have shown once again the kinds of rocks on which this
>kind of project may founder if it opens up too widely. XML-Dev would probably
>be a much louder list than it is if people felt their comments would have a
>direct impact on the standard, instead of the informal listening that (I
>think) does go on here. I'm not sure all of that loud would be useful or
>productive.

From peter at ursus.demon.co.uk Fri Apr 24 19:27:12 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:46 2004
Subject: Open Standards Processes
In-Reply-To: <01bd6f97$541355a0$020b0ac0@xerius>
Message-ID: <3.0.1.16.19980424172648.5d879bee@pop3.demon.co.uk>

[... in reply to the perceived concerns about the XML process...]

I am not part of the W3C membership, though I am party to XML-SIG discussions, so hopefully can take a neutral stand on this.

XML-DEV has no formal standing in the W3C process. When we set it up the intention (which still holds) was to provide a forum complementary to the XML-SIG in which implementations could be discussed *as part of the process of developing the protocols*.
That has worked extremely well, IMO. There has been very high participation by the formal members or representatives of the W3C - it has been given freely without thought to commercial gain.

Organisations join the W3C for enlightened self-interest - i.e. the financial *** and the staff investment *** is repaid by the returns. I cannot speak for the members but I assume that the ability to shape the specs and to know when and what will be formally announced is well worth the investment.

I am afraid it's a fact of life that not everyone has the same opportunities. There are pluses and minuses to working in rich/powerful organisations. I sympathise that individuals may feel 'second-class' in the XML process and I hope that XML-DEV can go some way to reducing this feeling. The Internet is essentially my only connection with the real-life XML community. (I occasionally meet people who pass through London where I live). Like many of you I cannot afford the registration fee to go to Paris or the other XML meetings and so most of you are 'virtual friends'. But without the discussion lists we wouldn't have any contact.

I am an enthusiast, and an idealist for much of the time. The Internet fuels those and very occasionally something wonderful happens, without money, without formal organisation. I've occasionally been part of this in virtual education (e.g. the Globewide Network Academy). Some of what has happened on this list is similar. But I know that in reality 99% of progress requires formality and funding.

My own view is that the XML process is a very impressive and laudable activity in creativity and collaboration. What I value is that those who *are* part of the main XML community have given a great deal of their time on this list. I also feel that the views of individuals have almost always been listened to carefully and sensitively, in a way that is not very common in most 'standards' development processes.
Without XML-DEV I and others would be greatly disadvantaged and I would not like to see it used to criticise the XML process.

Cheers,

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Fri Apr 24 19:27:13 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:46 2004
Subject: LISTRIVIA (was Re: Repost: DOM XML Level 1 question)
In-Reply-To: <354094A6.C8C0A979@gte.com>
References: <353F8863.EAD6E79A@gte.com>
Message-ID: <3.0.1.16.19980424164318.5757c23a@pop3.demon.co.uk>

At 09:33 24/04/98 -0400, Siping Liu wrote:
>Hi there,
>
>I posted this question yesterday and have yet to get any reply. Here it goes again.
>If you know the answer please help.

Hi there,

Reposting questions to XML-DEV isn't going to help get answers quickly, I am afraid, and will annoy some people who might otherwise have replied. [Despite the rapidity of Web-based processes, a day isn't a very long time to wait.]

All postings to this list are made voluntarily. People do this because they think they can help. There are lots of reasons why they may not reply to your question, such as:

- they don't know the answer (that's why I haven't replied, and I suspect most others haven't).
- they are not sure how to answer it (this may also be a factor)
- their boss doesn't approve of them spending time on the Internet.
- their partner/kids don't approve either.
- they haven't had any sleep for the last 2 days...

etc.
If it helps you, people often don't reply to my questions either. This is probably because the Qs are too simple; they have been suggested/asked before; the Qs are too profound for the XML experts; I don't make them simple to understand; they think I'm crazy, etc.; and they have bosses/kids/partners/cash_flow_crises, etc. You have to learn to live with it. Try again in 2 months?

The percentage of postings on XML-DEV which receive useful replies is very high and we are all very grateful to everyone who takes time to reply. It often costs them time/social_life, etc.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From bckman at ix.netcom.com Fri Apr 24 20:02:45 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:46 2004
Subject: Open Standards Processes
Message-ID: <01bd6fba$406987e0$31afdccf@uspppBckman>

>Anyone interested in setting up a corporation whose only purpose is to join
>the W3C and "hire" interested individuals for a reasonable fee? (evil :-).

Now there's a thought. Anyone interested? (I'm serious)!!

Frank

-----Original Message-----
From: Matthew Gertner
To: xml-dev@ic.ac.uk
Date: Friday, April 24, 1998 8:43 AM
Subject: Re: Open Standards Processes

>Simon,
>
>Your argument is convincing, but doesn't explain why open access is not
>given to works-in-progress for consultation by interested parties (i.e.
>read-only access).
I appreciate the need of the W3C to avoid involving too >many chefs in cooking up its standards, for exactly the reasons you mention. >I also appreciate the need of the organization to finance its activities. >However, the pricing scheme is pretty unfair. A company with $49 million in >revenue can join as an affiliate member for about 0.01% of revenues (and the >fee for full membership is pretty insignificant for the Microsofts and IBMs >of the world), whereas for, say, a small Web startup in Prague the affiliate >membership fee represents a few month's salary for the average programmer >(life is cheap out here...). > >Anyone interested in setting up a corporation whose only purpose is to join >the W3C and "hire" interested individuals for a reasonable fee? (evil :-). > >Matthew > >-----Original Message----- >From: Simon St.Laurent >To: xml-dev@ic.ac.uk >Date: Friday, April 24, 1998 3:26 PM >Subject: RE: Open Standards Processes (WAS Re: Nesting XML based languages >and scripting languages) > > >>Len Bullard suggested: >> >>>o All drafts posted to the web at all times. Anyone can >>> read and anyone can contribute. Only a few people edit >>> and ISO makes the rules for these people, not the consortia. >>> Ensures openness and "a level playing field". >> >>Frank Boumphrey added: >> >>>What about us poor authors!! We have to write "knowledgeably" about a >>>subject that doesn't even exist. Our books usually appear at about the >same >>>time as a spec which invalidates every thing we have written!! >> >>While I sympathize with everyone's impatience, and have lived Frank's 'poor >>authors' issue repeatedly, I would hesitate to change the XML process >>dramatically at this point. The discussions on this list in the past few >days >>about 'semantics' alone have shown once again the kinds of rocks on which >this >>kind of project may founder if it opens up too widely. 
>>XML-Dev would probably
>>be a much louder list than it is if people felt their comments would have a
>>direct impact on the standard, instead of the informal listening that (I
>>think) does go on here. I'm not sure all of that loud would be useful or
>>productive.

From pax at webeasy.com Fri Apr 24 20:09:43 1998
From: pax at webeasy.com (Pax Prakarsa)
Date: Mon Jun 7 17:00:46 2004
Subject: DOM implementation
References: <3.0.32.19980422223744.007207d0@postoffice.swbell.net>
Message-ID: <3540D55A.2539C161@webeasy.com>

Could anybody tell me what other DOM implementations are available out there, other than Don Park's SAXDOM and Data Channel's DXP?

Does Data Channel's DXP actually use Don Park's SAXDOM?

Thanks
Pax

From Jon.Bosak at eng.Sun.COM Fri Apr 24 20:18:21 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:46 2004
Subject: Open Standards Processes
In-Reply-To: <01bd6fba$406987e0$31afdccf@uspppBckman>
Message-ID: <199804241815.LAA20220@boethius.eng.sun.com>

Speaking of standards and industry consortia...

A lot of people subscribing to xml-dev might not be aware that there is already an industry consortium for people who develop tools for XML and related technologies. That organization is called OASIS (Organization for the Advancement of Structured Information Standards).

OASIS has been active since 1993 under the name SGML Open. It recently changed its name in recognition of the much wider role that XML is going to be playing in the markup arena. (In fact, the name "XML Open" came in a close second to "OASIS" when alternative names were being considered. Sun was among the majority of members that felt that the "XML Open" label is too limiting for an organization devoted to product-independent document and data interchange in general.)

The former SGML Open web site is being overhauled right now to rearrange material and incorporate the name change; you can check it out at http://oasis-open.org, but be prepared for a lot of broken links and a lot of stale references to SGML (just read "XML" wherever you see "SGML" and you will get the new thrust of the organization).

OASIS is an established, well-organized industry consortium with a proven track record. It has done a lot of solid technical, educational, and marketing work over the years in service of interoperable open standards.
It hosts Robin Cover's well-known SGML/XML web page, for example, and in the past has developed technical recommendations for table interoperability, document catalogs, and structured fragment interchange. It is now gearing up to establish conformance testing for XML applications. OASIS members include Adobe, ArborText, Chrystal, Ericsson, Folio, Fujitsu, Fuji Xerox, GCA, IBM, Inso, Novell, O'Reilly, SoftQuad, Sun, Texcel, Xyvision, and dozens of other companies and individuals. It holds regular technical meetings to discuss interoperability issues and supports its members with joint marketing events at major trade shows. The best part about OASIS for xml-dev subscribers is that individuals and small companies can participate for as little as $400 a year and can start to get some marketing support from the consortium for as little as $800 a year. If you consider yourself to be in the XML tools business, you should be aware that there is an existing industry consortium that can provide the marketing infrastructure needed to promote your commercial interests. The most important OASIS technical contributions have been in exactly the space addressed by some of the activities in this list -- defining protocols like the SAX interface that are necessary to complete and implement the basic standards. Impressive as it's been, I think that a lot of the technical work people are engaged in here could be carried out more effectively in the context of an established organization. And the resulting output from OASIS in the form of submissions to W3C would have a vastly greater effect than the suggestions of individuals in a mail list, no matter how well thought out those suggestions might be. 
In short, I think that the people who are distressed about their inability to participate in a consortium dominated by big companies should look into an already existing consortium that is designed specifically to develop interoperability protocols and would be happy to have their participation.

Jon

From bckman at ix.netcom.com Fri Apr 24 20:21:22 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:46 2004
Subject: Open Standards Processes
Message-ID: <01bd6fc7$4cda6c80$31afdccf@uspppBckman>

>> Without XML-DEV I and others would be greatly disadvantaged and I would not like to see it used to criticise the XML process.<<

I don't think any of us were criticizing the XML process; I think it works well. It would, however, for a variety of reasons be nice to know something of what is going on in the W3 ivory halls! For example, we know that an XSL proposal is going to come out in July, and we know that they are probably going to do away with HTML flow objects (too confusing), but what are they going to replace them with?

I have a book coming out in June and I really have to write in some depth about XSL. I also have to use the msxsl processor for any examples, but by the time the book hits the market, it is probably going to be redundant and I will have to post numerous addenda to a web site!! I have "guessed" what direction XSL is going to take, but I could be (and probably am) wrong. Now someone from Microsoft is a technical reviewer of the book, and if he is nice he will tell me if I am way off, but then on the other hand he may not because officially he is not allowed to.
It would certainly be advantageous to authors, and probably also to members of W3, if we were made privy at least in a partial way to the thought processes of the committees.

Frank

From James.Anderson at mecom.mixx.de Fri Apr 24 20:25:07 1998
From: James.Anderson at mecom.mixx.de (james anderson)
Date: Mon Jun 7 17:00:47 2004
Subject: Semantics [^*]
References: <3.0.32.19980423074329.00b25c10@pop.intergate.bc.ca> <353F774F.69CB0F59@technologist.com> <3540096A.3A97@mcs.com> <35403BDD.2E8F85ED@technologist.com>
Message-ID: <3540D90E.1BB7A5CB@mecom.mixx.de>

i'd be happy with an operational/denotational definition for xml notation in terms of the dom tree abstraction. maybe, while i'm at it, i'd even hope that it encompassed the concept 'valid'.

(as a note to the proofreaders, although i've left the 's' word out here, i, as a rule, prefer unapproved etymological license to licensed grammatical transgression. despite what my oed's editors may have read.)

Paul Prescod wrote:
> ...
>
> So I'm convinced that the XML WG believes (unknowingly!) that XML has
> semantics even as they deny it. The concrete step that they could take to
> prove that I am wrong is to require the DOM to be defined in terms of
> XML's syntax instead of the tree abstraction.
>

From SimonStL at classic.msn.com Fri Apr 24 20:34:14 1998
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 17:00:47 2004
Subject: Open Standards Processes
Message-ID:

The W3C isn't always beautiful, but I think we'd better get back to XML discussion instead of W3C discussion before the list masters have to come out and whack us with small sticks. One of XML-Dev's best features is its ability to stay focused on _XML_.

Unfortunately, I don't know of any good forums for discussing/pondering/plotting against/praising the W3C.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From bckman at ix.netcom.com Fri Apr 24 20:57:49 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:47 2004
Subject: W3 membership was Re: Open Standards Processes
Message-ID: <01bd6fcc$03ce38a0$31afdccf@uspppBckman>

I am today setting up a not-for-profit legal entity in Ohio, which will be called (name to be decided pending a name search). The books will be audited by Berwick Perlman and Mills, a highly respected Cleveland legal firm. (Now there's an oxymoron). Membership will be limited to those who have a bona fide interest in writing about Internet-related topics.
If there is enough interest we will apply for membership to the W3 organization as an affiliate. Affiliate membership is $5000/yr., and 3 yr. up front is required, so do the sums!! About 20 interested people would make it worthwhile, and cost up front about $800 each. (Unfortunately there are costs to setting up the legal entity, which I have paid myself, but I would like to be refunded!!)

Frank

From donpark at quake.net Fri Apr 24 23:03:42 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:47 2004
Subject: DOM implementation
Message-ID: <002701bd6fc3$79cb89d0$2ee044c6@arcot-main>

Pax wrote:
>Could anybody tell me what other DOM implementations are available out there,
>other than:
>Don Park's SAXDOM and Data Channel's DXP.
>
>Does Data Channel's DXP actually use Don Park's SAXDOM ?

IBM's XML Parser package has some DOM support.

Data Channel started with SAXDOM but they couldn't wait for SAX2 to get DTD support. They probably rewrote most if not all of the code to fuse directly with their own parser and to support the XML part of the DOM spec.

As I have told DataChannel, you are free to take SAXDOM code, change it and rename it any way you like, include it in your product, give it out as Christmas presents, whatever. You do not have to mention my name nor SAXDOM. Perhaps I should have named it FREEDOM!;-p.

BTW, it looks like the next release of SAXDOM will be slightly delayed until SAX is finalized.

Sorry if this affects your plans,

Don Park
http://www.docuverse.com/personal/index.html

From donpark at quake.net Fri Apr 24 23:36:56 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:47 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
Message-ID: <007001bd6fc7$5d1f4f70$2ee044c6@arcot-main>

>>Anyone interested in setting up a corporation whose only purpose is to join
>>the W3C and "hire" interested individuals for a reasonable fee? (evil :-).
>
>
>Now there's a thought. Anyone interested?(I'm serious)!!

I was thinking along the same lines last night and here we are.

I am all for it if it is acceptable to W3C. We could call it XML-DEVIL (XML-DEV Independent Logheads) which could be either a non-profit corporation or an association. I am sure we can put our piggy banks together and come up with the necessary fee.

What do you say, XML-Devils?;-)

Don Park
http://www.docuverse.com/personal/index.html

PS: Everyone please note that this is not an attempt to undermine the W3C process at all.

From donpark at quake.net Fri Apr 24 23:58:20 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:47 2004
Subject: Repost: DOM XML Level 1 question
Message-ID: <00a001bd6fcb$20481100$2ee044c6@arcot-main>

Siping,

>> I'm trying to implement the DOM interface on the listener side of the
>> SAX interface.
>> The DOM Core Level 1 interface looks clear enough to me. But I get confused by the
>> definition in DOM XML Level 1, especially "XMLNode" -- why isn't it derived from
>> "Node"? Can I traverse a tree with this interface and how? Why does it have
>> this "getEntityReference()" method? I don't see the concept of having
>> an "entity reference" attached to each node (such as an element) in the XML spec.

XMLNode is probably not derived from Node because there was no need to. You traverse via its methods, which allow you to get its parent, siblings, and children.

EntityReference, when expanded, could result in multiple XMLNodes, so each of the XMLNodes in the expanded view will return the original EntityReference if getEntityReference is invoked. Likewise, getEntityDeclaration will return the definition of the entity.

If I am vague it is because I have few details myself.

Good luck,

Don Park
http://www.docuverse.com/personal/index.html

From RMcDouga at JetForm.com Sat Apr 25 00:31:39 1998
From: RMcDouga at JetForm.com (Rob McDougall)
Date: Mon Jun 7 17:00:47 2004
Subject: Case usage in element names, attribute names & attribute values
Message-ID:

>I've been looking through many of the established XML grammars, and am left
>with a question. Why do all the grammars appear to be in lower-case? Since
>xml has become case-sensitive, it would seem to me that people would adopt
>one of the popular conventions used in other case-sensitive applications like
>programming languages.
>Why haven't I seen any grammars that look like this:
>
>
>
>If someone were to create a grammar that mixed upper-case and lower-case
>would they have trouble getting people to adopt it? Is there a compelling
>reason to reject a mixed case grammar? To my eye, mixed case is more
>attractive. It's more like natural English.
>
>So, what are the opinions on each of:
>- tag names (lowercase, uppercase, mixed-case)
>- attribute names (lowercase, uppercase, mixed-case)
>- enumerated attribute values (lowercase, uppercase, mixed-case)
>
>And with each option, are there opinions on any punctuation? Hyphens,
>underscores, nothing?
>
>Thanks in advance,
>
>Rob
>

From Jon.Bosak at eng.Sun.COM Sat Apr 25 01:09:37 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:47 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
In-Reply-To: <007001bd6fc7$5d1f4f70$2ee044c6@arcot-main> (donpark@quake.net)
Message-ID: <199804242307.QAA20324@boethius.eng.sun.com>

[Don Park:]

| I am all for it if it is acceptable to W3C. We could call it XML-DEVIL
| (XML-DEV Independent Logheads) which could be either a non-profit
| corporation or an association. I am sure we can put our piggy banks
| together and come up with the necessary fee.
|
| What do you say, XML-Devils?;-)

Cute idea, but it won't accomplish what you want.

You would have to incorporate and set up legal employment contracts with all your "employees", and then all the employees of the corporation would be legally bound to respect the confidentiality of the W3C work in progress just like the employees of the current W3C members are.
So there goes your free and open public discussion. And you could say goodbye to the right to speculate in newsgroups or public lists like xml-dev. Such a group might be useful for lobbying purposes, since it would give its members a voice on the W3C Advisory Council. But the AC is exactly what its name says -- advisory. In the W3C, the Director is ultimately responsible for all decisions made by the consortium, regardless of the input of its member organizations. And for the right to be one of over 200 organizations giving the Director advice, you would have to come up with the W3C money, manage the corporation as a corporation, set up and maintain a mailing list just like xml-dev (only restricted), and monitor the activities of each and every one of the "employees" to make sure that they followed W3C confidentiality guidelines. I can fairly well guarantee that anyone who took this on would never have time to write another line of code. I'm going to bet that the majority of people interested enough to pursue this idea are planning to make and sell XML tools. In other words, planning to compete with each other. If that's the case, what you really want is either to join the W3C on an individual basis (I haven't checked lately, but last I heard this was a minimum of $5000 up front), or join OASIS and in effect form your own consortium. Jon xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Sat Apr 25 01:20:00 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:00:47 2004 Subject: Terminological correction Message-ID: <199804242317.QAA20343@boethius.eng.sun.com> I slipped (as I often do) and used the term "W3C Advisory Council". It's actually "Advisory Committee". It was called "Advisory Council" very early in its existence, and that unfortunately is the name that has stuck in my mind ever since. Jon xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Sat Apr 25 01:44:26 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:47 2004 Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages) References: <01bd6f60$4e4db160$d8addccf@uspppBckman> <199804241122.HAA00254@unready.microstar.com> Message-ID: <3541236C.FDF@hiwaay.net> David Megginson wrote: > > Len Bullard writer: > > > >This is an old gripe of mine about the XML process as conducted by > > >this effort. It is not open. That has troubled me from the > > >beginning because it is an open effort to replace an open > > >standard, SGML, with a closed standard, XML. It is a horrible > > >precedent even if a successful one. > > Frank Boumphrey writes: > > > Bravo!! Well said!!! 
> > Can I download a free copy of the ISO 8879:1986 spec? > > All the best,

Good question and valid. It has also been a subject of complaints for years. OTOH, up to the point of publication, most of the work which contributes to it is maintained on a freely accessible site.

len

From bckman at ix.netcom.com Sat Apr 25 02:01:06 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:47 2004
Subject: Case usage in element names, attribute names & attribute values
Message-ID: <01bd6ff6$f90a9460$31afdccf@uspppBckman>

Personally I prefer all upper or all lower case. I forever seem to be forgetting to capitalize a letter in camelback notation!!

Frank

-----Original Message-----
From: Rob McDougall
To: 'xsl-list@mulberrytech.com'
Date: Friday, April 24, 1998 3:27 PM
Subject: Case usage in element names, attribute names & attribute values

>I've been looking through many of the established XML grammars, and am
>left with a question. Why do all the grammars appear to be in
>lower-case? Since xml has become case-sensitive, it would seem to me
>that people would adopt one of the popular conventions used in other
>case-sensitive applications like programming languages. Why haven't I
>seen any grammars that look like this:
>
>
>
>If someone were to create a grammar that mixed upper-case and lower-case
>would they have trouble getting people to adopt it? Is there a
>compelling reason to reject a mixed case grammar. To my eye, mixed case
>is more attractive. It's more like natural English.
>
>So, what are the opinions on each of:
>- tag names (lowercase, uppercase, mixed-case)
>- attribute names (lowercase, uppercase, mixed-case)
>- enumerated attribute values (lowercase, uppercase, mixed-case)
>
>And with each option, are there opinions on any punctuation? Hyphens,
>underscores, nothing?
>
>Thanks in advance,
>
>Rob
>
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
>

From cbullard at hiwaay.net Sat Apr 25 02:04:40 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:00:47 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
References: <199804242307.QAA20324@boethius.eng.sun.com>
Message-ID: <35412803.5328@hiwaay.net>

Jon Bosak wrote:
> Cute idea, but it won't accomplish what you want.
> Such a group might be useful for lobbying purposes, since it would
> give its members a voice on the W3C Advisory Council. But the AC is
> exactly what its name says -- advisory. In the W3C, the Director is
> ultimately responsible for all decisions made by the consortium,
> regardless of the input of its member organizations.

Wow! The information technologies of the free world are subject to the whims of one individual. Fidel and Bill relax. You are no longer the only dictators left in the Western hemisphere. How strange an outcome for those who plead so long and so loud for freedom of information, freedom of products, open markets and all of that. I guess Orwell is right again. Go ISO. Go SGML.

len

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From donpark at quake.net Sat Apr 25 02:07:21 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:47 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
Message-ID: <001801bd6fdc$cc8d1c10$2ee044c6@arcot-main>

>Cute idea, but it won't accomplish what you want. You would have to
>incorporate and set up legal employment contracts with all your
>"employees", and then all the employees of the corporation would be
>legally bound to respect the confidentiality of the W3C work in
>progress just like the employees of the current W3C members are. So
>there goes your free and open public discussion. And you could say
>goodbye to the right to speculate in newsgroups or public lists like
>xml-dev.

If I understand the requirements correctly, all types of organizations are allowed to be a member of W3C. It includes commercial, educational, government agencies. Nothing indicates that it excludes industry association. Having to become an employee of XML-Devil Inc. seems like a subversion and I do not feel right doing that. I would like to be able to do this without having to sell myself out.

I was not planning to have open public discussions; XML-Devils will have our own closed mailing list where we will discuss the issues and submit comments and proposals to W3C. Anyone breaking the confidentiality agreement will lose the membership. One of us will be elected to represent us in the W3C Advisory Committee.

What we, the independent XML developers, offer could be what W3C needs to remain truly vendor-neutral.
Netscape, JavaSoft, and others can probably balance Microsoft within W3C but there is currently nothing except good judgement to balance the needs of large corporations and small developers. I do not like being dependent on other people's judgement. I do not want to be a bystander while big boys play politics on the supposedly level ground.

What we are trying to do will not have negative impact on W3C activities. We are all professional developers and we will strive to be an integral part of W3C. It would be best if W3C offered special membership to a self-governing group of independent developers.

Meanwhile, I will take a look at what OASIS offers as you suggested. Thank you for taking the time to respond at length.

Regards,

Don Park
http://www.docuverse.com/personal/index.html

PS: I would like to change the proposed name to XML-DEV Independent Lobby rather than LogHead which is similar in concept but less professional :-).

From donpark at quake.net Sat Apr 25 02:08:32 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:47 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
Message-ID: <001b01bd6fdc$f0d7ba80$2ee044c6@arcot-main>

>Cute idea, but it won't accomplish what you want. You would have to
>incorporate and set up legal employment contracts with all your
>"employees", and then all the employees of the corporation would be
>legally bound to respect the confidentiality of the W3C work in
>progress just like the employees of the current W3C members are. So
>there goes your free and open public discussion.
And you could say
>goodbye to the right to speculate in newsgroups or public lists like
>xml-dev.

If I understand the requirements correctly, all types of organizations are allowed to be a member of W3C. It includes commercial, educational, government agencies. Nothing indicates that it excludes industry association. Having to become an employee of XML-Devil Inc. seems like a subversion and I do not feel right doing that. I would like to be able to do this without having to sell myself out.

I was not planning to have open public discussions; XML-Devils will have our own closed mailing list where we will discuss the issues and submit comments and proposals to W3C. Anyone breaking the confidentiality agreement will lose the membership. One of us will be elected to represent us in the W3C Advisory Committee.

What we, the independent XML developers, offer could be what W3C needs to remain truly vendor-neutral. Netscape, JavaSoft, and others can probably balance Microsoft within W3C but there is currently nothing except good judgement to balance the needs of large corporations and small developers. I do not like being dependent on other people's judgement. I do not want to be a bystander while big boys play politics on the supposedly level ground.

What we are trying to do will not have negative impact on W3C activities. We are all professional developers and we will strive to be an integral part of W3C. It would be best if W3C offered special membership to a self-governing group of independent developers.

Meanwhile, I will take a look at what OASIS offers as you suggested. Thank you for taking the time to respond at length.

To XML-DEV list members: I apologize that I have accidentally touched a raw nerve and ended up starting a flame/crusade of sorts. We will be winding down the discussions here soon so please have patience. Sorry, Peter.
Regards,

Don Park
http://www.docuverse.com/personal/index.html

PS: I would like to change the proposed name to XML-DEV Independent Lobby rather than LogHead which is similar in concept but less professional :-).

From bckman at ix.netcom.com Sat Apr 25 02:53:37 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:47 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
Message-ID: <01bd6ffd$fe021400$31afdccf@uspppBckman>

Unless the Membership eligibility has changed, not-for-profit organisations are allowed to apply for membership. Although the rights and privileges of membership are limited to paid employees, according to the Affiliate member agreement and Appendix 1, there is nothing to prevent a paid employee being a spokesman of the group. Certainly no members of the group should air their confidential knowledge on a public forum, but there are several W3 members who contribute regularly to the W3 forums, Jon Bosak, Dave Raggett, Chris Wilson to name a few. It is clear that members or sponsors of a consortium cannot attend meetings etc. but they can send paid employees to meetings on their behalf. The idea behind a process such as XML-Devil is not to take over the organization, but to have input into the W3 decision making process.

>> set up and maintain a mailing list just like xml-dev (only restricted), and monitor the activities of each and every one of the "employees" to make sure that they followed W3C confidentiality guidelines<<

Many of us do this already and have ample time for other activities!! We are not talking about a 10,000-member organization here!!
Frank

-----Original Message-----
From: Don Park
To: xml-dev@ic.ac.uk
Date: Friday, April 24, 1998 5:10 PM
Subject: Re: XML-DEVIL Proposal - was Open Standards Processes

>>Cute idea, but it won't accomplish what you want. You would have to
>>incorporate and set up legal employment contracts with all your
>>"employees", and then all the employees of the corporation would be
>>legally bound to respect the confidentiality of the W3C work in
>>progress just like the employees of the current W3C members are. So
>>there goes your free and open public discussion. And you could say
>>goodbye to the right to speculate in newsgroups or public lists like
>>xml-dev.
>
>If I understand the requirements correctly, all types of organizations are
>allowed to be a member of W3C. It includes commercial, educational,
>government agencies. Nothing indicates that it excludes industry
>association. Having to become an employee of XML-Devil Inc. seems like a
>subversion and I do not feel right doing that. I would like to be able to
>do this without having to sell myself out.
>
>I was not planning to have open public discussions, XML-Devils will have our
>own closed mailing list where we will discuss the issues and submit comments
>and proposals to W3C. Anyone breaking the confidentiality agreement will
>lose the membership. One of us will be elected to represent us in the W3C
>Advisory Committee.
>
>What we, the independent XML developers, offer could be what W3C needs to
>remain truely vendor-neutral. Netscape, JavaSoft, and others can probably
>balance Microsoft within W3C but there is currently nothing except good
>judgement to balance the needs of large corporations and small developers.
>I do not like being dependent on other people's judgement. I do not want to
>be a bystander while big boys play politics on the supposedly level ground.
>
>What we are trying to do will not have negative impact on W3C activities.
>We are all professional developers and we will strive to be an integral part
>of W3C. It would be best if W3C offered special membership to a
>self-governing group of independent developers.
>
>Meanwhile, I will take a look at what OASIS offers as you suggested. Thank
>you for taking the time to respond in length.
>
>Regards,
>
>Don Park
>http://www.docuverse.com/personal/index.html
>
>PS: I would like to change the proposed name to XML-DEV Independent Lobby
>rather than LogHead which is similar in concept but less professional :-).

From Jon.Bosak at eng.Sun.COM Sat Apr 25 04:51:36 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:48 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
In-Reply-To: <001801bd6fdc$cc8d1c10$2ee044c6@arcot-main> (donpark@quake.net)
Message-ID: <199804250249.TAA20386@boethius.eng.sun.com>

[Don Park:]

| If I understand the requirements correctly, all types of organizations
| are allowed to be a member of W3C. It includes commercial,
| educational, government agencies. Nothing indicates that it excludes
| industry association. Having to become an employee of XML-Devil
| Inc. seems like a subversion and I do not feel right doing that.
I | would like to be able to do this without having to sell myself out. Sorry, I wasn't being terribly careful about my wording. All I meant was that participation in W3C entails a contract of confidentiality that is legally propagated somehow down to the members of the organization. I'm not a lawyer and I don't know exactly how this works in detail for non-profits, etc. All I can tell you from experience is that (a) somebody at the top of the member organization is somehow going to end up being legally responsible for the activities of the people making up the organization, (b) someone at the top therefore has to keep an eye on those people to make sure that they don't do something wrong, and (c) one of the effects of the need to maintain confidentiality is that you can't comment in public forums on anything that happens to be under discussion in any of the multiple W3C activities that you *might* know about (i.e., all of them). Please understand that I'm not arguing against your idea, which has occurred to others before you and undeniably has a certain amount of charm. I'm just trying to tell you from personal experience that maintaining the confidentiality requirement is an enormous pain in the ass and will most definitely crimp your style. | I was not planning to have open public discussions, XML-Devils will | have our own closed mailing list where we will discuss the issues and | submit comments and proposals to W3C. Anyone breaking the | confidentiality agreement will lose the membership. One of us will be | elected to represent us in the W3C Advisory Committee. Right. And one of the things I was trying to say more subtly before but apparently need to be more direct about is that you are imagining a group of potential competitors (rather than the employees of a single corporation) deciding which *one* of them is going to represent the group to the W3C Director. What fun you'll have with that part! 
And what a wonderful time you'll have specifying in detail the procedures you will follow to determine what position that one person represents on each issue! Not to mention the procedures for deciding who gets to be on which working group...

| What we, the independent XML developers, offer could be what W3C needs
| to remain truely vendor-neutral. Netscape, JavaSoft, and others can
| probably balance Microsoft within W3C but there is currently nothing
| except good judgement to balance the needs of large corporations and
| small developers. I do not like being dependent on other people's
| judgement. I do not want to be a bystander while big boys play
| politics on the supposedly level ground.

I completely understand and sympathize with this view. All I'm trying to say is that I believe you can accomplish more with less effort by utilizing an existing corporate infrastructure (including web site, mailing lists, board of directors, marketing staff, industry contacts, conference schedule, and the rest of the apparatus that you will eventually need) that is capable right now of serving exactly the function that you describe.

| Meanwhile, I will take a look at what OASIS offers as you suggested.

You should talk to the Executive Director, Robin Tomlin: tomlin@oasis-open.org

Jon

From papresco at technologist.com Sat Apr 25 14:32:18 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:48 2004
Subject: The Web
Message-ID: <3541D7D6.DA3EC24D@technologist.com>

A couple of people have asked me about my counting of years between the SGML standard being published and the Web.
According to the Internet timeline,[1] the Web was released in 1992, which is only six years after SGML was published, but in my recollection, it did not even begin to replace Gopher as a popular information system (even among hackers) until Mosaic was released which IIRC was in 1994. I certainly spent most of my time on the Internet in 1993 without hearing much, if anything, about the Web. Anyhow, the relevant point is that ISO standards in the SGML family that have been developed since the Web became popular are available online. The world changed and ISO changed with it.

[1] http://www-personal.umd.umich.edu/~nhughes/htmldocs/timeline.html

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From SimonStL at classic.msn.com Sat Apr 25 16:50:35 1998
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 17:00:48 2004
Subject: Case usage in element names, attribute names & attribute values
Message-ID:

>If someone were to create a grammar that mixed upper-case and lower-case
>would they have trouble getting people to adopt it? Is there a compelling
>reason to reject a mixed case grammar. To my eye, mixed case is more
>attractive. It's more like natural English.
I used all upper-case element and attribute names in XML: A Primer (and do so generally) because it's a lot easier to separate markup from content visually that way. Mixed case is more attractive if your document is really about the markup, but I tend to be more interested in the content. Since I'm one of those neanderthals who still does HTML and XML development by hand in a text editor (or in source code), using uppercase is much handier. It won't matter nearly as much to me when I find a program that I actually _like_ to use for markup creation. Some programs use all lower-case; I find that it doesn't stand out nearly as well for me. I don't think anyone would have a problem with your proposal; just make sure you make clear to all participants that _case matters_ when starting out on your DTD design process.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From tbray at textuality.com Sat Apr 25 18:08:43 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:48 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
Message-ID: <3.0.32.19980425090554.00af2af0@pop.intergate.bc.ca>

At 07:02 PM 4/24/98 -0500, len bullard wrote:
>Jon Bosak wrote:
>>But the AC is
>> exactly what its name says -- advisory. In the W3C, the Director is
>> ultimately responsible for all decisions made by the consortium,
>> regardless of the input of its member organizations.
>
>Wow! The information technologies of the free world are subject to the
>whims of one individual.

To be fair, while the structure is a bit weird on the face of it, there's a good reason.
Namely, a consortium such as W3C is highly vulnerable to litigation on antitrust and restraint-of-trade; anyone whose business plan goes up in smoke because the W3C blessed something incompatible might be inclined to sue, whether or not this is reasonable, simply hoping to be bought off. If you look closely, legally, the W3C hardly exists at all - there is a structure of contracts with MIT, Inria, and Keio, and the decision-making is de jure done by one individual whom it would be worth no-one's while to sue.

In practice, the W3C management and staff do pay careful attention to the views of its members. -Tim

From matthew at praxis.cz Sat Apr 25 18:42:20 1998
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:00:48 2004
Subject: Open Standards Processes
Message-ID: <01bd7069$2cb5a260$020b0ac0@xerius>

Agreed. When you get right down to it, the same probably goes for unfocused rambling about inheritance, semantics, etc. (although personally I enjoy it thoroughly). Anyone interested in setting up xml-pontification? :-)

Matthew

-----Original Message-----
From: Simon St.Laurent
To: Xml-Dev (E-mail)
Date: Friday, April 24, 1998 8:34 PM
Subject: Re: Open Standards Processes

>The W3C isn't always beautiful, but I think we'd better get back to XML
>discussion instead of W3C discussion before the list masters have to come out
>and whack us with small sticks. One of XML-Dev's best features is its ability
>to stay focused on _XML_.
>
>Unfortunately, I don't know of any good forums for
>discussing/pondering/plotting against/praising the W3C.
>
>Simon St.Laurent
>Dynamic HTML: A Primer / XML: A Primer / Cookies

From peter at ursus.demon.co.uk Sat Apr 25 19:32:11 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:48 2004
Subject: Open Standards Processes
In-Reply-To: <01bd7069$2cb5a260$020b0ac0@xerius>
Message-ID: <3.0.1.16.19980425172651.abe7d020@pop3.demon.co.uk>

At 18:42 25/04/98 +0200, Matthew Gertner wrote:
>Agreed. When you get right down to it, the same probably goes for unfocused
>rambling about inheritance, semantics, etc. (although personally I enjoy it
>thoroughly). Anyone interested in setting up xml-pontification? :-)

I don't think the discussion on here has been unfocussed and - although I have been too busy to read it all - it seems to have been valuable.

> [...]
>
>>The W3C isn't always beautiful, but I think we'd better get back to XML
>>discussion instead of W3C discussion before the list masters have to come
>out
>>and whack us with small sticks. One of XML-Dev's best features is its
>ability
>>to stay focused on _XML_.
>>
>>Unfortunately, I don't know of any good forums for
>>discussing/pondering/plotting against/praising the W3C.
>>
>>Simon St.Laurent

I try not to intervene in discussions. [I'm conscious that about 6 months ago I probably was too narrow and suggested that a discussion on inheritance, etc. (or something similarly difficult) was not moving towards something of concrete value.] I have been involved in flame wars - one of such Shakespearean bizarreness that I dare not repeat it and am glad it's wiped from human e-history - and know how easily they get out of hand.
So in the present case I wanted to avoid a slanging match. The subsequent discussion has been valuable - such as JonB reminding us of SGML-OPEN/OASIS. I accept his view that OASIS is more suitable for some activities than XML-DEV, but think that XML-DEV provides a different way of doing things that is valuable and probably complementary. I would simply ask that if XML-DEV is unaware of something that's already in progress, intended to become publicly available and likely to lead to a better outcome than XML-DEV would lead to, please let us know.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From jtauber at jtauber.com Sat Apr 25 19:38:32 1998
From: jtauber at jtauber.com (James K. Tauber)
Date: Mon Jun 7 17:00:48 2004
Subject: Inheritance in XML
In-Reply-To: <199804230428.VAA19547@boethius.eng.sun.com>
Message-ID: <001501bd7070$e5a84460$e06118cb@caleb>

> Here are some examples of things that can provide semantics for XML
> documents:
[...]
> * Stylesheeets.

Sorry to bring it up again, but I still think we should distinguish semantics from behaviour. Stylesheets *do not* provide semantics, they provide behaviour. Sure, how something behaves ought to be linked to what it means but it ain't the stylesheet that's providing the meaning. Meaning/semantics is surely given by schemata that associate element types, etc with ontologies.

[...]
> XSL will provide what is intended to be the most
> powerful standardized high-level way to associate presentational
> semantics with XML documents in publishing environments.

"Presentational semantics" is mixing two distinct things, IMHO.

James
--
James Tauber / jtauber@jtauber.com
Perth, Western Australia
XML Pages: http://www.jtauber.com/xml/

From matthew at praxis.cz Sat Apr 25 19:51:50 1998
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:00:48 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
Message-ID: <01bd7072$dc754ee0$020b0ac0@xerius>

(In light of my previous posting I should probably keep my mouth shut, but the discussion has now become concrete enough that it surely can no longer be labelled "pontification". Ah well...)

Jon,

>Please understand that I'm not arguing against your idea, which has
>occurred to others before you and undeniably has a certain amount of
>charm. I'm just trying to tell you from personal experience that
>maintaining the confidentiality requirement is an enormous pain in the
>ass and will most definitely crimp your style.

Until a few weeks ago I was the W3C representative for my previous employer (POET Software), and my personal experience was that the burden of maintaining the confidentiality requirement was far outweighed by the advantages of being able to do your work on the basis of current specs and discussion. Joining the proposed XML-DEVIL would be associated with a certain cost, so everyone and his brother is not going to do so.
Besides this, I suspect that most people with whom we would be interested in sharing confidential information (i.e. other XML developers) would join up anyway if this idea ends up flying. Finally, my experience with the XML developer community is that it is a remarkably intelligent and disciplined (if somewhat opinionated :-) group who would be very rigorous about not spilling the beans. >Right. And one of the things I was trying to say more subtly before >but apparently need to be more direct about is that you are imagining >a group of potential competitors (rather than the employees of a >single corporation) deciding which *one* of them is going to represent >the group to the W3C Director. What fun you'll have with that part! >And what a wonderful time you'll have specifying in detail the >procedures you will follow to determine what position that one person >represents on each issue! Not to mention the procedures for deciding >who gets to be on which working group... I dare say the same applies to the OMG (a current member). I honestly don't see a problem here, for several reasons: 1) We are talking about a group of people who are extraordinarily open about sharing their work with others (witness SAX and the myriad of implementations, extensions, etc. that are now freely available). 2) Even the interests of competitors are likely to be aligned on most issues. 3) Participating in a working group implies a very significant time investment, so the crown will in most cases go to whomever (if anyone) is willing to make this investment. 4) We could agree to provide concrete input only on issues where we have a clear consensus. In other cases, we would revert to a pure observer role, which is of great value in itself. >I completely understand and sympathize with this view. 
>All I'm trying to say is that I believe you can accomplish more with
>less effort by utilizing an existing corporate infrastructure
>(including web site, mailing lists, board of directors, marketing
>staff, industry contacts, conference schedule, and the rest of the
>apparatus that you will eventually need) that is capable right now of
>serving exactly the function that you describe.

Is OASIS a member of the W3C? If so, you are absolutely right. Why reinvent the wheel? On the other hand, I checked the member list and couldn't find them. If OASIS is not a member, joining it would certainly interest many on this list, but it doesn't address the issue of gaining advance access to W3C works-in-progress.

Cheers,

Matthew

From jtauber at jtauber.com Sat Apr 25 20:05:39 1998
From: jtauber at jtauber.com (James K. Tauber)
Date: Mon Jun 7 17:00:48 2004
Subject: Inheritance in XML [^*]
In-Reply-To: <353FCC47.54122231@technologist.com>
Message-ID: <001601bd7074$b0eed000$e06118cb@caleb>

> Well, here I am off being the theorist again.

Mind if I join you :-)

> Just like you, I use semantics and meaning interchangeably
> also (at least I think I do). If you've taught an XML course
> recently, I'll bet you find yourself saying: "this syntax
> *means* that..." (e.g. "The start tag means that an element
> has begun.")

That's right. XML, as a language, has semantics, syntax and also presentation. I would suggest all languages do.

But you are right about different levels. We need to be careful whether we are talking about XML as a language or XML documents.
The latter express principally syntax which is then associated with semantics (in schemata, which can also say something about syntax) and with presentation and behaviour (in stylesheets, or as I prefer to call them, behaviour sheets).

James
--
James Tauber / jtauber@jtauber.com
Perth, Western Australia
XML Pages: http://www.jtauber.com/xml/

From cbullard at hiwaay.net Sat Apr 25 20:09:07 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:00:48 2004
Subject: Open Standards Processes
References: <3.0.32.19980425090554.00af2af0@pop.intergate.bc.ca>
Message-ID: <354225F0.4632@hiwaay.net>

Tim Bray wrote:

> To be fair, while the structure is a bit weird on the face of it, there's
> a good reason. Namely, a consortium such as W3C is highly vulnerable to
> litigation on antitrust and restraint-of-trade; anyone whose business
> plan goes up in smoke because the W3C blessed something incompatible might
> be inclined to sue, whether or not this is reasonable, simply hoping to
> be bought off. If you look closely, legally, the W3C hardly exists
> at all - there is a structure of contracts with MIT, Inria, and Keio,
> and the decision-making is de jure done by one individual whom it would
> be worth no-one's while to sue.
>
> In practice, the W3C management and staff do pay careful attention to
> the views of its members. -Tim

It is understood. I don't attribute the problems to maliciousness or conspiracy. Immaturity and inexperience are the likely culprits.
I remember saying to Connolly et al some years ago that to make the HTML/HTTP systems work and remain open (the subject was the predatory practices of MS at the time), they would have to become monks vowing poverty. Sounds like that was the solution, but it hasn't worked. (BTW: not pounding on MS. They have played by the rules in XML. It is the rules of the W3C that are flawed. They suit the egos of some and the needs of the owners, but ultimately, are destructive.)

The issue is the openness of the process, the lack of which paralyzes competition. Too often the excuse "Internet time" disallows careful consideration in open forum. Yet, Internet time also makes it too easy for any company to get the kind of market lead that eliminates competition. Only a very experienced and resourceful company such as MS was able to withstand and overcome the lead of NS in the browser market. As the twig is bent...

Results and means are both important. As was feared originally, the closure of the XML process enabled it to achieve the necessary technical standard, but at a cost of the openness which characterized the goals, the means, and the individuals whose staunch ethics with regard to their work made SGML one of the truly independent standards. One price of this was SGML's disregard for the implementation/systemic requirements. The consequence of this has been discussed elsewhere.

Yet the results of XML may be more limited, for while it has a perceived new quality, it is still essentially the work of Dr. Goldfarb, et al with systemic extensions. Granted, the tactic of renaming and representing it as an emerging technology freed it from the hobbling effects of years of standards wrangling followed by the well-promoted misconceptions of those who designed HTML/HTTP, but this has been at the cost of surrendering it to the kind of domination which Dr. Goldfarb and the ISO working groups resisted even at great personal cost.

Markup, by design, frees the information.
XML, by accident or design, may result in a monopoly on the means of production. This is at the heart of the DOJ actions. The consortium members would be well served by scrupulous records and public view of the XML process should these actions ever be cited in well-financed and well-executed legal actions. That such records may not exist can be laid at the feet of the chair, the process rules, and unwillingness or inability to function in the open.

So, perhaps monkdom is the only solution. It probably takes practice to be a good monk and The Director et al may not have mastered the practice. Lawsuits have no regard for the ability to recoup damages. They are often used simply to limit actions. That is the learning curve that Bill Gates refers to.

Observing the DOJ actions of late, a friend and coworker observes that we may be entering a time when the societal impact of complex technologies which overlap the demands of law and contractual obligation will spawn a judiciary specialized in the adjudication of technical suits. In other words, lawyers with CS degrees, essentially, Dr. Goldfarb's background. I don't think the computer scientists are ready for the rigor of the juris disciplines.

If the standards process is not reopened after this highly bizarre and somewhat unethical heisting of the international process, I do think that the W3C will become entangled in the suits emerging from the DOJ. The VRML Consortium has a much better model and it will behoove the Director to learn from it. The recommendations I made in the earlier post are sound. How odd that some will say "The HTML experience taught us the futility of the open list process" when apparently that process worked and produced a winning solution.

len bullard

From jtauber at jtauber.com Sat Apr 25 20:12:35 1998
From: jtauber at jtauber.com (James K. Tauber)
Date: Mon Jun 7 17:00:48 2004
Subject: Semantics (was Re: Inheritance in XML [^*])
In-Reply-To: <005b01bd6f34$ee1bbe20$820b4ccb@NT.JELLIFFE.COM.AU>
Message-ID: <001701bd7075$6a216560$e06118cb@caleb>

> I think people get confused between the "semantics" of the
> markup language
> and the "semantics" of the element [type]s and attributes
> being marked-up.

I just sent off mail trying to say the same thing, but not nearly as well as you've said it, Rick.

> They are "syntactic" in relation to the document being marked-up,
> but they are "semantic" in relation to the XML spec. It is because
> people make the transition between the two contexts that the
> term "semantic" becomes confusing.

Exactly. The XML *Language* has semantics, syntax and presentation. They are all expressed in the XML spec.

Element types and attributes also have semantics, syntax and presentation. They are defined in DTDs, schemata and behaviour sheets (*not* respectively, though).

James
--
James Tauber / jtauber@jtauber.com
Perth, Western Australia
XML Pages: http://www.jtauber.com/xml/

From matthew at praxis.cz Sat Apr 25 20:17:00 1998
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun 7 17:00:48 2004
Subject: LISTRIVIA (was Re: Open Standards Processes)
Message-ID: <01bd7076$540caa90$020b0ac0@xerius>

Peter,

>>Agreed. When you get right down to it, the same probably goes for unfocused
>>rambling about inheritance, semantics, etc. (although personally I enjoy it
>>thoroughly). Anyone interested in setting up xml-pontification? :-)
>
>I don't think the discussion on here has been unfocussed and - although I
>have been too busy to read it all - seems to have been valuable.
>I try not to intervene in discussions. [I'm conscious that about 6 months
>ago I probably was too narrow and suggested that a discussion on
>inheritance, etc. (or something similarly difficult) was not moving towards
>something of concrete value.]

I only dared make this remark since I like to pontificate as much as anyone. Certainly, both of the topics I mentioned have been the subject of some very stimulating discussion. I'm actually surprised that you have such a relaxed attitude about this (probably because I read the message six months ago that you refer to).

To be honest, I'd love to have an xml-pontification list (or xml-rambles or whatever), but not because I am annoyed by the lack of focus on this list. Quite the contrary: I have been too busy to read all the mails and I have done so anyway. The real reason is that I am reluctant to post in a lot of cases because I feel that maybe others would object to the discussion straying too far from real XML development issues. If this isn't the case, hallelujah!
Matthew

From tbray at textuality.com Sat Apr 25 21:39:07 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:48 2004
Subject: Open Standards Processes
Message-ID: <3.0.32.19980425123740.00b0ea80@pop.intergate.bc.ca>

1. The supposition that the XML process was in any material way less open than the SGML process is simply wrong. XML was aggressive about seeking out invited experts to serve on the SIG mailing list, which had very substantial influence on the shape of the spec. In particular, compare, in the XML process versus any other, the number of people and organizations who were actively on top of the spec, really understood the issues, and provided serious input. On that basis, XML's input head count is exceeded only by a few of the bigger IETF efforts.

2. The supposition that the HTML standardization process can be said, in any meaningful sense, to have worked, is simply wrong. Anybody who says this obviously has not tried to implement code that processes what the marketplace perceives to be HTML. This is defined not by any spec, but by a basis of functionality that was in Netscape 2, and an unholy mess of accretions, with only two companies really allowed to play. I think a standard should be something that should serve as the basis for implementation. XML is. HTML isn't.

3. It *is* the case that the W3C process is, by default, less open than some others, in particular IETF.
The hypothesis is that in web-space, where there are lots of $N*10^7 bets on the table and attack-trained marketing groups behind every bush, there are going to have to be some closed doors to get anything useful done. I think the jury's still out on that, and I'm not sure that XML, which made an aggressive effort to be more open than the W3C default, really serves as evidence either way.

4. A couple of people made the excellent point that it's tough to produce a book on one of these specs in a timely and accurate fashion if you're not inside the process. It seems to me that it would be of huge benefit for the W3C if such books were easier to produce. It might make all sorts of sense for the W3C to have "writers' memberships" - non-speaking access to the materials of one activity or another. Such memberships wouldn't be free, a cost of perhaps $500 or so would bring it well within the bounds of a book-publishing budget while discouraging frivolity.

And once again, I regret that the XML process has failed to meet Len Bullard's exquisitely high standards.

Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592

From Jon.Bosak at eng.Sun.COM Sat Apr 25 21:51:45 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:48 2004
Subject: Open Standards Processes
In-Reply-To: <3.0.32.19980425123740.00b0ea80@pop.intergate.bc.ca> (message from Tim Bray on Sat, 25 Apr 1998 12:37:41 -0700)
Message-ID: <199804251949.MAA20585@boethius.eng.sun.com>

[Tim Bray:]

[... good stuff that I completely agree with ...]

| 4.
| A couple of people made the excellent point that it's tough to
| produce a book on one of these specs in a timely and accurate fashion
| if you're not inside the process. It seems to me that it would be of
| huge benefit for the W3C if such books were easier to produce. It
| might make all sorts of sense for the W3C to have "writers'
| memberships" - non-speaking access to the materials of one activity or
| another. Such memberships wouldn't be free, a cost of perhaps $500 or
| so would bring it well within the bounds of a book-publishing budget
| while discouraging frivolity.

I think that's a really excellent suggestion. And the cost should come out of the publisher, up front.

Jon

From mintert at irb.informatik.uni-dortmund.de Sat Apr 25 22:01:46 1998
From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert)
Date: Mon Jun 7 17:00:48 2004
Subject: "writers' memberships" Re: Open Standards Processes
In-Reply-To: Your message of Sat, 25 Apr 1998 12:49:14 -0700. <199804251949.MAA20585@boethius.eng.sun.com>
Message-ID: <199804252001.WAA20255@brown.informatik.uni-dortmund.de>

Hi!

---------
> [Tim Bray:]
>
> [... good stuff that I completely agree with ...]
>
> | 4. A couple of people made the excellent point that it's tough to
> | produce a book on one of these specs in a timely and accurate fashion
> | if you're not inside the process. It seems to me that it would be of
> | huge benefit for the W3C if such books were easier to produce. It
> | might make all sorts of sense for the W3C to have "writers'
> | memberships" - non-speaking access to the materials of one activity or
> | another.
> | Such memberships wouldn't be free, a cost of perhaps $500 or
> | so would bring it well within the bounds of a book-publishing budget
> | while discouraging frivolity.
>
> I think that's a really excellent suggestion. And the cost should
> come out of the publisher, up front.
>
> Jon

Since I'm currently writing a book about XML I have to break my silence ;-)

I would appreciate such a "writers' membership" VERY much. Where's the right place to discuss that? (I don't think XML-Dev is)

Bye,
Stefan.

+-----------------------------------------------------------+
Stefan Mintert
UniDo: mintert@irb.informatik.uni-dortmund.de
private: stefan@mintert.com
WWW: http://www.informatik.uni-dortmund.de/~sm/
+-----------------------------------------------------------+
"let the music keep our spirits high..." (Jackson Browne)

From SimonStL at classic.msn.com Sat Apr 25 22:13:36 1998
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 17:00:49 2004
Subject: Open Standards Processes
Message-ID:

It sounds like the 'small sticks' are quite firmly sheathed, which may have something to do with the fact that this discussion seems to be producing some positive results.

After taking a quick look at the membership list, I noticed that at least two key players aren't members: Opera Software and the Apache Group. Opera (http://www.operasoftware.com), in Norway, is producing a much smaller Web browser than either Netscape or Microsoft, and has been getting a lot more mention lately everywhere I look.
The Apache Group (http://www.apache.org) is a looser organization of the people keeping up Apache, the freeware HTTP server that continues to outpace MS and NS in Internet market share.

Opera is still tiny, while Apache is amorphous. Still, it seems like the W3C would do well to actively live up to its claim that:

>In its quest for universality, W3C seeks a diverse membership from around the world.

Opera may well join on its own - I have no contacts there and can't speculate; they just seem like an odd company to be missing - but I don't know if the Apache folks could join if they wanted to. As their model of software development is spreading rapidly, with Linux, DNS, Perl, and now Netscape joining the fray, the W3C's membership policies seem more and more of an anachronism every day.

Accommodating these kinds of members may ruffle some feathers among the W3C's current members, but it seems like it would create a fuller, more capable organization. It might also allow the XML-DEVIL organization, comprised of XML developers working toward common goals, to join the W3C without seeming like a weird attempt to 'subvert' the organization.

I remain pleased with how the XML process has worked, though I too might join XML-DEVIL (or something of the sort) for the heads-up and opportunity to participate that a W3C affiliation would bring.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From SimonStL at classic.msn.com Sat Apr 25 22:19:07 1998
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 17:00:49 2004
Subject: Open Standards Processes
Message-ID:

Tim Bray:
| It
| might make all sorts of sense for the W3C to have "writers'
| memberships" - non-speaking access to the materials of one activity or
| another. Such memberships wouldn't be free, a cost of perhaps $500 or
| so would bring it well within the bounds of a book-publishing budget
| while discouraging frivolity.

Jon Bosak:
>I think that's a really excellent suggestion. And the cost should
>come out of the publisher, up front.

It sounds like there are going to be a lot of these sold, especially for XML, if they come to pass. That kind of access would certainly settle most of my needs, but I wouldn't count on it to pacify those with 'exquisitely high standards' who want to participate more actively. Still, it's a great start!

(Does O'Reilly get a discount? They're already a member, though I think it's their software side.)

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From Jon.Bosak at eng.Sun.COM Sat Apr 25 23:02:24 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:49 2004
Subject: Open Standards Processes
In-Reply-To: <01bd6f97$541355a0$020b0ac0@xerius> (matthew@praxis.cz)
Message-ID: <199804252100.OAA20682@boethius.eng.sun.com>

[Matthew Gertner:]

| Your argument is convincing, but doesn't explain why open access is not
| given to works-in-progress for consultation by interested parties (i.e.
| read-only access).

Sorry I missed this earlier; there's a very simple answer. The specification work unavoidably reveals details about product plans. That's why there are confidentiality restrictions. Make access open to everyone and the big companies won't play.

(Please don't tell me why they should think differently or whether this position makes sense. I'm just reporting the facts.)

Jon

From Jon.Bosak at eng.Sun.COM Sun Apr 26 00:34:28 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:49 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
In-Reply-To: <3.0.32.19980425090554.00af2af0@pop.intergate.bc.ca> (message from Tim Bray on Sat, 25 Apr 1998 09:06:38 -0700)
Message-ID: <199804252232.PAA20733@boethius.eng.sun.com>

[Tim Bray:]

| In practice, the W3C management and staff do pay careful attention to
| the views of its members. -Tim

Yes, and I certainly didn't intend my informational comment about the organizational structure of the W3C to suggest otherwise.

Jon

From Jon.Bosak at eng.Sun.COM Sun Apr 26 00:55:31 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:49 2004
Subject: Inheritance in XML
In-Reply-To: <001501bd7070$e5a84460$e06118cb@caleb> (jtauber@jtauber.com)
Message-ID: <199804252253.PAA20735@boethius.eng.sun.com>

| > Here are some examples of things that can provide semantics for XML
| > documents:
| [...]
| > * Stylesheets.
|
| Sorry to bring it up again, but I still think we should distinguish
| semantics from behaviour. Stylesheets *do not* provide semantics, they
| provide behaviour.
| Sure, how something behaves ought to be linked to
| what it means but it ain't the stylesheet that's providing the
| meaning.

Oh dear oh dear. Repeat this saying ten times before you go to bed every night:

   The meaning of a word is its use. (L. Wittgenstein)

This is where arguments about the meaning of meaning finally end up. I think he was right, but no wonder the guy killed himself.

| Meaning/semantics is surely given by schemata that associate element
| types, etc with ontologies.

I agree in principle, but those ontologies at this point are, as far as I know, completely dependent on natural languages for their own meaning. It seems to me that the understanding of meaning by machines can come only after we have agreed on a universal taxonomy of ideas that transcends all the different assumptions that underlie natural language -- something like Roget's original system, but consisting of language-independent primitives. I'm not holding my breath until we see that. And in the absence of this semantic substrate, I think that the only meaning we get is behavior and appearance, which the machines can handle, and good old prose, which they can't.

Jon

From Jon.Bosak at eng.Sun.COM Sun Apr 26 01:08:53 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:49 2004
Subject: XML-DEVIL Proposal - was Open Standards Processes
In-Reply-To: <01bd7072$dc754ee0$020b0ac0@xerius> (matthew@praxis.cz)
Message-ID: <199804252306.QAA20739@boethius.eng.sun.com>

| Is OASIS a member of the W3C? If so, you are absolutely right. Why
| reinvent the wheel?
| On the other hand, I checked the member list and
| couldn't find them. If OASIS is not a member, joining it would
| certainly interest many on this list, but it doesn't address the issue
| of gaining advance access to W3C works-in-progress.

Good question. It seems to me that OASIS is in the process of becoming a W3C member organization; it certainly should be. I've put in a request for an update on this and will let you know as soon as I hear anything.

I have way too much sympathy for the feelings being expressed here to argue the point any further (which is a point about tactics, not goals). A number of people have told me over the last year how much they wished they had the money to buy a W3C membership. In their position, I would feel exactly the same way. So go for it, and good luck to you.

And if you folks do succeed in developing a model for a sustainable W3C member organization that doesn't kill you in the process, I don't see why you should stop with just one. How about regional chapters?

Jon

From Jon.Bosak at eng.Sun.COM Sun Apr 26 01:19:24 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:49 2004
Subject: "writers' memberships" Re: Open Standards Processes
In-Reply-To: <199804252001.WAA20255@brown.informatik.uni-dortmund.de> (message from Stefan Mintert on Sat, 25 Apr 1998 22:01:31 +0200)
Message-ID: <199804252317.QAA20741@boethius.eng.sun.com>

[Stefan Mintert:]

| Since I'm currently writing a book about XML I have to break my
| silence ;-)
|
| I would appreciate such a "writers' membership" VERY much. Where's the
| right place to discuss that?
| (I don't think XML-Dev is)

The right place is the W3C Advisory Committee, which next meets in June.

Hmmm... I would volunteer to push this, but I honestly can't commit the energy. (I'm behind deadline this very moment and cannot believe that I've even allowed myself to get sucked into this round of correspondence.) About the best I can do is try to bring this to the attention of people who have the bandwidth to lobby for it.

If any of you work for organizations that are W3C members, bring this up with your AC rep. I'll do the same.

Jon

From papresco at technologist.com Sun Apr 26 01:26:50 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:49 2004
Subject: Open Standards Processes
References: <199804252100.OAA20682@boethius.eng.sun.com>
Message-ID: <3542714B.83224C4F@technologist.com>

Does anyone have a theory as to why some standards still "work" in the IETF process?

Some ramblings:

I think that big companies will work in an open environment when they are forced to. If the W3C hasn't got the staff to take on a particular task, then the IETF continues to do it and the big companies grit their teeth and "play ball."

If I'm right on that (and I don't know if I am), then if the W3C did NOT exist, then vendors (who must wrap themselves in open standards) might be forced to play ball in open organizations. Or they might create a W3C-like consortium themselves. I'm not sure if Netscape and Microsoft would actually get together and do anything without TimBL's lead. Perhaps Netscape and allies *vs.* Microsoft.
But then Microsoft could wrap themselves in the open standards banner by aligning with the truly open organization.

I think that open processes do not necessarily have to use the IETF model: "everyone is equal, everyone yells, nobody gets their way". Rather, an open standards process could be organized as we organize governments: hierarchically and representatively. In other words, an IETF-like organization could set up tightly organized, bureaucratic, working groups just as the W3C does. (although I doubt that the IETF themselves would do that...) Big corporations could muscle their way into the "inner circle" by having employees vote (as they should!), but there could be an upper bound per big company (or even quotas set aside for various groups: small companies, users, etc.).

It was not openness that made HTML impossible to standardize within the IETF. It was a poorly designed/defined standardization process. The reason SAX worked was because there was a benevolent dictator, an "inner circle" of implementors and deadlines. In other words, there was hierarchy and process.

It might be interesting to see what would happen if a completely open organization were to submit a spec. to the W3C (i.e. SAX). Then we might have the best of both worlds. TimBL's blessing would encourage vendors to implement it, but the process would be open.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From donpark at quake.net Sun Apr 26 03:13:21 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:00:49 2004
Subject: XML Developer Independent Lobby - A Call for Membership
Message-ID: <00bd01bd70af$8830a900$2ee044c6@arcot-main>

Hello,

It is my opinion that further discussion on the topic is unnecessary and even harmful to the sanity of Peter the Great, Czar of Listmania. If you are interested in joining a non-profit organization of independent XML developers and writers, NOW is the time to step forward. We have about 10 willing to join, we need 10 more. The person to contact is Frank Boumphrey ( bckman@ix.netcom.com ).

Expected immediate cost is a few hundred dollars. Expected long term cost is absolute confidentiality and willingness to put the interest of the community ahead of your own.

Thank you very much,

Don Park
http://www.docuverse.com/personal/index.html

From bckman at ix.netcom.com Sun Apr 26 03:21:47 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:49 2004
Subject: Open Standards Processes
Message-ID: <01bd70c5$7250ad40$02addccf@uspppBckman>

Actually most writers wouldn't mind paying the $500 out of their own pocket just to relieve the wear and tear on their stomach lining!!
Frank -----Original Message----- From: Jon Bosak To: xml-dev@ic.ac.uk Date: Saturday, April 25, 1998 12:53 PM Subject: Re: Open Standards Processes >[Tim Bray:] > >[... good stuff that I completely agree with ...] > >| 4. A couple of people made the excellent point that it's tough to >| produce a book on one of these specs in a timely and accurate fashion >| if you're not inside the process. It seems to me that it would be of >| huge benefit for the W3C if such books were easier to produce. It >| might make all sorts of sense for the W3C to have "writers' >| memberships" - non-speaking access to the materials of one activity or >| another. Such memberships wouldn't be free, a cost of perhaps $500 or >| so would bring it well within the bounds of a book-publishing budget >| while discouraging frivolity. > >I think that's a really excellent suggestion. And the cost should >come out of the publisher, up front. > >Jon xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Apr 26 04:07:45 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:49 2004 Subject: Open Standards Processes Message-ID: <3.0.32.19980425190614.00ac5100@pop.intergate.bc.ca> At 08:43 PM 4/25/98 -0700, Frank Boumphrey wrote: >Actually most writers wouldnt mind paying the $500 out of their own pocket >just to relieve the wear and tear on their stomach lining!! I'll plug the idea to a few W3C folk. Software companies let writers in under the covers so that there will be books when the product ships, why shouldn't W3C do the same thing? Having said that, the publisher should pay, not the writer. -T. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Sun Apr 26 05:21:22 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:00:49 2004 Subject: Open Standards Processes References: <01bd70c5$7250ad40$02addccf@uspppBckman> Message-ID: <3542ADBF.3065794E@finetuning.com> speak for yourself! Technically-accurate coverage takes twice as long and pays half! Plus the spec editors get a free copy edit, while we're at it.... Maybe the W3C should pay us! :-) lisa Frank Boumphrey wrote: > > Actually most writers wouldnt mind paying the $500 out of their own pocket > just to relieve the wear and tear on their stomach lining!! 
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Sun Apr 26 06:35:47 1998 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:00:49 2004 Subject: Inheritance in XML Message-ID: <01bd70e6$a81e7800$efacdccf@uspppBckman> The meaning of a word is its use. (L. Wittgenstein) "When I use a word it means exactly what I mean it to mean" (Tweedledum) Here's what the OED (Oxford English Dictionary) has to say on the subject: semantics:- relating to meaning behaviour:- Conduct or course of action ...handling, disposition of.. schemata:- Any one of certain forms or rules of the 'productive imagination' through which the understanding is able to apply its 'categories' to the manifold of sense-perception in the process of realizing knowledge or experience. (Now that really clarifies the meaning of that word!!!) ontologies:- Studies pertaining to the nature of being or the essence of things, or to being in the abstract. taxonomy:- That particular branch of science that relates to classification. (Now that one I knew!!) So according to the OED, style sheets handle both semantics and behaviour xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Sun Apr 26 07:16:51 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:00:49 2004 Subject: Inheritance in XML References: <01bd70e6$a81e7800$efacdccf@uspppBckman> Message-ID: <3542C8D1.ED447D6@finetuning.com> Yes, but they do so declaratively, right? Rather than programmatically. lisa Frank Boumphrey wrote: > So according to the OED style sheets handle both semantics and behaviour From cbullard at hiwaay.net Sun Apr 26 07:33:19 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:49 2004 Subject: Open Standards Processes References: <199804252100.OAA20682@boethius.eng.sun.com> <3542714B.83224C4F@technologist.com> Message-ID: <3542C6B5.58C0@hiwaay.net> Paul Prescod wrote: > > Does anyone have a theory as to why some standards still "work" in the > IETF process? > > Some ramblings: > > I think that big companies will work in an open environment when they are > forced to. If the W3C hasn't got the staff to take on a particular task, > then the IETF continues to do it and the big companies grit their teeth > and "play ball." Not necessarily. ISO has been persuasive in some areas of standardization and not others. It might be interesting to know which have and have not produced standard *technology*. > It was not openness that made HTML impossible to standardize within the > IETF. It was a poorly designed/defined standardization process. It was inexperience. I can't think of a precedent for it. Online list work is challenging. That does not mean it should be closed. It takes practice and patience to make it work.
> The reason > SAX worked was because there was a benevolent dicator, an "inner circle" > of implementors and deadlines. In other words, there was hierarchy and > process. Yes. We did IrishSpace the same way. Question: did you design a standard or a technology? HTML/HTTP worked because it was a *standard technology* aka, a freebie. Short focused application efforts work well. Recruit a team, do the design, etc. Jon says the big companies won't play. Well, if you are trying to write enforceable law, I guess it helps to have an army somewhere to enforce that. But if you want to develop and sell technology, sorry, all you have to do is build to a standard technological base. Right now, for most bets, that means WinTel. > It might be interesting to see what would happen if a completely open > organization were to submit a spec. to the W3C (i.e. SAX). Then we might > have the best of both worlds. TimBL's blessing would encourage vendors to > implement it, but the process would be open. Tim BL's blessing? To create technology? Huh? VRML simply formed their own consortium and sent the spec to ISO. Works fine. The working group lists are all open. Small groups, lead reasonably, work out the details. Frankly, I'm not sure why XML has to be done differently. We are told because big companies won't play otherwise. I don't believe that is generally true. Big companies play in the VRML Consortium and that is a successful effort. No, I don't accept that. Still, that is Jon's assertion. Ok. Where are the big companies that want to sit with the director on stage at a press conference and explain why work they can do in one consortium under open rules they cannot do in another? len bullard xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Sun Apr 26 08:26:57 1998 From: jtauber at jtauber.com (James K. Tauber) Date: Mon Jun 7 17:00:50 2004 Subject: Inheritance in XML In-Reply-To: <199804252253.PAA20735@boethius.eng.sun.com> Message-ID: <000e01bd70dc$411122c0$dd6118cb@caleb> [James Tauber] > | Sorry to bring it up again, but I still think we should distinguish > | semantics from behaviour. Stylesheets *do not* provide > semantics, they > | provide behaviour. Sure, how something behaves ought to be linked to > | what it means but it ain't the stylesheet that's providing the > | meaning. [Jon Bosak] > Oh dear oh dear. Repeat this saying ten times before you go to bed > every night: > > The meaning of a word is its use. (L. Wittgenstein) [James Tauber] Maybe, but not in how the word is pronounced or how it appears on the page. That is my point. The sound and meaning of a word might be inseparable but they are not the same thing. Perhaps you misunderstand what I mean by "behaviour". I am referring to how the content will be display and how it will respond to events like clicking. All I am trying to say is that the specification that PatientNames are to be displayed in blue and are to bring up that patient's record when clicked on is not the same thing as semantics and we shouldn't call it semantics. [Jon Bosak] > It seems to me that the understanding of meaning by machines > can come only after we have agreed on a universal taxonomy of ideas > that transcends all the different assumptions that underly natural > language -- something like Roget's original system, but consisting of > language-independent primitives. 
I'm not holding my breath until we > see that. And in the absence of this semantic substrate, I think that > the only meaning we get is behavior and appearance, which the machines > can handle, and good old prose, which they can't. [James Tauber] I'm not holding my breath either. I just don't want to use the term "semantics" to describe what stylesheets specify. James -- James Tauber / jtauber@jtauber.com Perth, Western Australia XML Pages: http://www.jtauber.com/xml/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Sun Apr 26 08:37:22 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:00:50 2004 Subject: Open Standards Processes References: <3.0.32.19980425123740.00b0ea80@pop.intergate.bc.ca> Message-ID: <3542D5BC.7C60@hiwaay.net> Tim Bray wrote: > > 1. The supposition that the XML process was in any material way less > open than the SGML process is simply wrong. I disagree. Compare the selection rules for membership in working groups. Who chooses the members of the working group for XML? > XML was aggressive about > seeking out invited experts to serve on the SIG mailing list, which > had very substantial influence on the shape of the spec. Yes. > In particular, > compare, in the XML process versus any other, the number of people and > organizations who were actively on top of the spec, really understood the > issues, and provided serious input. On that basis, XML's input head count > is exceeded only by a few of the bigger IETF efforts. This is true. The SIG is well staffed. The best SGML experts in the business are there. 
Point of history: When SGML was originally created, there was little use of the Internet for list activity of the kind that is now possible. That meant travel and financial support for standards efforts that only companies could afford. So, from that perspective, I concede. As one who encourages lists, I do so because I have seen the inherent limitations of airlines and hotels as the medium of communication for this work. > 2. The supposition that the HTML standardization process can be said, > in any meaningful sense, to have worked, is simply wrong. Anybody who > says this obviously has not tried to implement code that processes > what the marketplace perceives to be HTML. Point of difference: the HTML process produced a technology, not a standard. But to be more truthful, the Mosaic group implemented a technology being argued about by a large list. Considering the average age on that list and the lack of practice, I'm sure it was raucous. BTW: I was part of the team of Lockheed Martin that did implement an SGML and an HTML browser. I am aware of the design's limitations. So, yes, it wasn't perfect technology. Considering the results (The Web), that didn't matter. When the issue of choosing a text design for VRML was discussed, some thought that ONLY HTML should be the basis for that. Some still do. > This is defined not by any spec, > but by a basis of functionality that was in Netscape 2, and an unholy mess > of accretions, with only two companies really allowed to play. Not true. Several companies played. The W3C source was implemented several times. However, Netscape moved fastest and had the freshest, and for that design, most experienced team. So, they extended HTML quickly and cleverly. Extending an SGML application by adding to the DTD is the way its done. To the lasting chagrin of the originators of HTML, they insisted on making a standard of it rather than defining it as a tool, which is what it really is. 
> I think a > standard should be something that should serve as the basis for > implementation. XML is. HTML isn't. HTML is a DTD. Implementing a DTD IS what you do with it: SGML application. XML is syntax unification. I absolutely agree that this should be a standard. But it isn't. It's the property of a consortium, to paraphrase, "big companies that won't play unless they get their way" and that includes insuring a one year lead time on development. That is anti-competitive as it gets. Say what you want about the SGML process, Charles Goldfarb is a stickler for insuring that this does not happen: ISO rules backed by a man of incomparable commitment to the letter and spirit of the law. Point conceded: W3C makes the rules for W3C processes. The chair and all official members must abide by those rules. It is the rules I question. Given ISO rules, the XML processes would be different. > 3. It *is* the case that the W3C process is, by default, less open > than some others, in particular IETF. The hypothesis is that in > web-space, where there are lots of $N*10^7 bets on the table and > attack-trained marketing groups behind every bush, there are going to > have to be some closed doors to get anything useful done. That is demeaning FUD. I doubt there are professionals on this list that cannot be handled by the other professionals. Offlist is another issue. > 4.... Such memberships wouldn't > be free, a cost of perhaps $500 or so would bring it well within the bounds > of a book-publishing budget while discouraging frivolity. Umm. Why discourage it? It seems odd to me that the right to information which determines the direction of technology and technical markets should be sold as if it were a poker ante. Don't sell cheaper indulgences. The W3C should change its rules. > And once again, I regret that the XML process has failed to meet > Len Bullard's exquisitely high standards. Well, by any standards, your reply, Tim, is very civil. I respect that and thank you for it. 
len From lisarein at finetuning.com Sun Apr 26 09:40:38 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:00:50 2004 Subject: Open Standards Processes References: <3.0.32.19980425123740.00b0ea80@pop.intergate.bc.ca> <3542D5BC.7C60@hiwaay.net> Message-ID: <3542EA80.C157FC8D@finetuning.com> They can pay their money and get to say they're in the club, ok fine. But the XML Working Group is pretty full. And I'd hate to see it fill up and get bogged down. I wouldn't want people to think all you have to do is join and -- pick your working group. Also I really don't feel like XML was held up. On the contrary...looks like Namespaces got rushed... and what about people like Henry Thompson and James Clark (insert the people I should put here) -- they played an enormous part in XML's development (and they did it for peanuts :-) And what about many of the W3C employees -- they don't profit from software sales - I'm not gonna name names -- but some of these people -- they're working 80 hours a week -- and it's not the most glamorous work they're doing. But there's a sense of commitment to this Web interoperability thing -- a genuine responsibility to finish what was started and make sure it gets done right -- and even if no one appreciates ....now.... $500 memberships are just going to cloud the issues. And as far as most of the press goes. Most of them don't read the specs NOW....even AFTER they're done...so I wouldn't want to see a $500 W3C site-access membership just get written in to CNET's budget every year or anything like that!
(shudder) Let's get back to more interesting....and beautiful...and magical things.... like.... architectural forms...:-) lisa xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Sun Apr 26 15:27:40 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:50 2004 Subject: LISTMANIA (sic) In-Reply-To: <00bd01bd70af$8830a900$2ee044c6@arcot-main> Message-ID: <3.0.1.16.19980426132406.a3c75d30@pop3.demon.co.uk> At 18:06 25/04/98 -0700, Don Park wrote: >Hello, > >It is my opinion that further discussion on the topic is unnecessary and >even harmful to the sanity of Peter the Great, Czar of Listmania. [the sanity] disappeared a long time ago. "grey-hairs ... a jester and a fool". I have (as I said) only once suggested a discussion was off-course - a decision from which I have acquired wisdom. The current discussions are focussed and valuable. [I will write further about the original motives of XML-DEV, but like all web institutions they evolve.]. In general I only post LISTRIVIA to comment on the heat of some postings (only twice) and on redundant byte-count (frequently). [For any newcomers and those who may have forgotten the communal discipline: - use quoting carefully, and review it to make sure that there is nothing extraneous. - always avoid copying: - the whole of the last message(s) - the XML-DEV sig - always avoid replying to the original poster as well as to the list. ] Both the process and the discussion on ontologies/inheritance represent the problems we are currently facing and IMO are fully appropriate. When XML-DEV started there were essentially no implementations of XML and it was my concern that it would be very easy to discuss complex issues (like the current ones) while neglecting the immediate need to make sure that simple XML actually worked. 
I have always been keen to see 'semantics' addressed here, since I have felt that it would be a major challenge for the XML community. One aspect (which some of us use 'semantics' for) was how the spec should be interpreted in software. SAX has helped us avoid the first level of problems here, and I hope we can do the same with the next generation of issues (e.g. XLink, where I would very much like to see a generic approach.) PaulP's comment - about the spec only formally requiring whitespace to be passed - shows the sort of hidden assumptions that are inevitable when writing specs (I do not believe that a spec could usefully be written without prose). P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From peter at ursus.demon.co.uk Sun Apr 26 17:15:13 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:50 2004 Subject: XML-DEV (was Re: Open Standards Processes) [longish] In-Reply-To: <199804251949.MAA20585@boethius.eng.sun.com> References: <3.0.32.19980425123740.00b0ea80@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19980426150638.a79f9356@pop3.demon.co.uk> Since XML-DEV is a bit over a year old and because of thought-provoking and constructive discussion at present and because of the success of SAX, here are some ramblings. A year is a long time and many XML-DEV members were not subscribed when we started, so my rosy-coloured recollections are given. [A lot of the specifics can be generalised to the issues we are addressing now.]
We all owe a great deal to Henry Rzepa. [BTW Henry *does* read your mail, and does unsubscribe people, etc. He is just very busy. While I can get a thrill out of writing stuff like this, Henry is getting less of a thrill trying to sort out exactly what mail address someone thinks they are subscribed to; why sending 'unsubscribe' to xml-dev doesn't work, etc.]

Henry and I are molecular scientists - chemist and crystallographer respectively. He had the vision - perhaps about 7-10 years ago - of the *power of the electronic medium to communicate molecular ideas*. [The phrase is mine - he may disagree :-)]. We formed a symbiotic relationship - we undertake parts of an activity which are complementary. The fundamental aspect of most of what we do is to use the power of the information revolution to create a new way of communicating molecular science. XML-DEV is part of that.

Molecular science (like many other disciplines that XML-DEVers will be familiar with) is well-established, with a large information industry and a fragmented approach. We have few formal standards for the communication of chemistry and syntactic mismatch is extremely common. Moreover the semantics of chemistry are very wide - Linus Pauling was probably the last person ever to be able to formulate them.

Henry pioneered the use of the WWW for chemistry in many ways - development of interactive tools for molecular and spectral display - and has run 3 major e-conferences. Some years ago he had the idea of using MIME to convey chemical information. The two of us spent $10 in a Greenford pub and - with Ben Whittaker - drafted a submission to the IETF for chemical/x-*. This was immediately successful in that the molecular community adopted this almost overnight (ca. 2 months). This - I think - is the ideal that some of us are searching for in current discussions - can we develop something for $10 in an open process that does exactly what the community wants?
I use the word 'meme' (from Richard Dawkins - the Selfish Gene) to describe this; it's a self-reproducing idea. Since the WWW is a marvellous incubator for memes, they are very attractive to develop and I believe XML-DEV has and will continue to create them. A meme must have the properties:
- it must be rapidly (i.e. minutes) obvious what its point is.
- a sufficient percentage of the population must be infectable
- the energy required to transmit it must be low compared with its value.

A good example of a meme on the WWW is the 'Home Page'. No committee decided that there should be home pages - but they are self-evidently valuable to a large percentage of us. FAQs are another.

I have been searching for many years for the means to convey my ideas (in molecular software). It became clear that with SGML(sic) and the WWW these had arrived and I developed costwish to this end (i.e. a general SGML tool with the specific intention of promoting Chemical Markup Language). But it was very clear that *ML ideas were not going to spread rapidly without portable software, and SGML was not an effective meme (expensive and cumbersome). So when XML arrived I knew that the 'right solution' was there - it was a question of how to make sure it got to a stage where the molecular community saw the value and the need to adopt it.

I did not expect the molecular community to be in the vanguard of those developing XML solutions and on the whole this is true. The exceptions come from those areas which are already involved with large-scale *document* management such as (a few) publishers, patents and regulatory. A key problem in many sciences is the separation of 'documents' from 'data'. The data industry is not used to using *ML approaches, while the document industry usually regards 'data' as no different from any other pixel-based rendering (i.e. semantic content is discarded).
XML offers the enormous excitement of creating unified documents and data - and could revolutionise the process of scientific publishing if the communities have the courage. [My own - crystallography - has; it has developed e-publications in which documents and data are combined, but this is very rare.]

So the motivation for XML-DEV was to help the generic XML effort succeed. This was by no means certain when XML started. If XML were to have any meme-like qualities it would need simple, believable software, besides a convincing spec. In my experience it is far easier to write specs than to implement them and I have seen many which have never been effectively implemented. I didn't want this to happen with XML and so H and I offered this list as a means of helping the communal software development process. We have, of course, been very gratified by the large number of people who have announced freely or easily available software here. There are times when the software has had an important effect in highlighting aspects of the spec - for example the difficulty of implementing some of the original PE syntax.

Another concern that we had was the likelihood of incompatible implementations. This could have been through misinterpretation of the spec or simply the lack of suitable test apparatus. [BTW I hope that JUMBO2 can act as a simple test apparatus as it can run any SAX-compliant parser]. I have always suggested that XML-DEV should take a lead in aiming for re-usability, compatibility, etc. I am delighted, of course, with DavidM's achievement with SAX - certainly much larger than it seemed at the start.

It is clear that XML is now a believable approach, and it's also clear that - possibly for the first time - difficult aspects of semantics, namespaces, etc. are having to be addressed in public. If XML-DEV can help in this - splendid. If some of these require different organisations, fine.
I am now much less concerned than I was a year ago about XML being all talk and no implementation, but I think we always have to remember implementation in most of what we post. Seemingly obvious things can be very difficult to implement - PEs, namespaces have shown that. I suspect that the parser-application relationship will still need a lot of work.

Finally. Henry and I heard TimBL talk at WWW1 (CERN) and the formation of (what is now) the W3C was first floated there. It seemed very idealistic, free, new-frontier-like, with talk of 'an electronic bill of rights', etc. Remember at that stage that very little commercial activity had hit the WWW - most pages were from scientists, a few orgs and IT coms. We didn't know what the W3C would look like. Things then went quiet for some while until we had the W3C as we know it.

I sympathise with all the views expressed. I'm an idealist, and when it has been possible to create a $10 meme it's marvellous. Henry and I are planning a follow-up. I've had the same experience with the Globewide Network Academy - a group of enthusiast volunteers (many are graduate students) who see the power of the electronic age with a clarity and confidence some of us miss. They have done extraordinary things - just with a few server-side resources and boundless energy.

However, these successes are rare and most real-world creations require time, money and a lot of paid dedication. Compared to most 'standards' processes I have found the W3C process for XML and related matters extremely impressive and refreshing. OK, I am partially involved as an XML-SIG member but I understand the frustration of those who do not know what is being discussed. But I do value the way that the XML-SIG has chosen people for their expertise rather than their formal institutional standing [I was unemployed]. I also think that - in XML - there has been relatively little so far that has been a major shock when published in draft.
My understanding is that drafts are available internally 1 month before being made publicly available, though obviously discussions can be confidential for somewhat longer. IMO the speed and rigour of the process are very impressive and if this is offset against delay in pre-publication I think it's worth it. If as an individual you want to become involved in shaping the W3C discussions, it probably helps if you have shown expertise publicly, and created tools of benefit to the community. FWIW Henry and I spent a lot of time presenting our case on the MIME discussion list, without formal success :-).

I appreciate the problem that XML-writers find themselves in (I may, or may not, be writing an XML book). It's actually a commentary on the outdated nature of the paper-publication process that people have to plan for publication before they know what they are writing about (in detail). E-publication need not suffer from the same constraints - thus (if you regard it as an e-publication) the JUMBO2 CDROM will have the latest, final version of SAX 'hot off the press'.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From peter at ursus.demon.co.uk Sun Apr 26 17:19:13 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:50 2004
Subject: Inheritance in XML
In-Reply-To: <000e01bd70dc$411122c0$dd6118cb@caleb>
References: <199804252253.PAA20735@boethius.eng.sun.com>
Message-ID: <3.0.1.16.19980426135051.097f1df0@pop3.demon.co.uk>

It is clear that we are at a key stage in developing these areas, and I think we have a shortage of good software to help us. I'd like to see if there are simple communal ways forward. Concentrating on the semantics of DTDs and document instances (i.e. not the interpretation of the spec), it seems that we have (at least) the following toolset:

- adding semantic information to the DTD. This would give per-element and per-attribute prose. Unfortunately this is not defined in SGML or XML DTDs - only comments are allowed, and there is no standard for their creation. This makes it extremely difficult (say) even to generate toolTips for the semantics of ELEMENTs. I find this frustrating and potentially quite easily avoidable. For example the idea of recasting the DTD in XML (which I have re-posted recently) allows (at least) Xlink-annotation and possibly formal extensions through XML.

- Allowing for semantic information to be added at document authoring time (or later). Thus CML/TecML uses constructs such as ... to categorise quantities and allow full semantic resolution through distributed terminology. [We shall announce the hyperglossary concept and its DTD very shortly.]

- linking ELEMENTs to software (i.e. behaviour). This can either be done on an implicit basis (e.g.
CML:MOL links to jumbo.cml.MOLNode probably through xml:namespace pointers) or through the stylesheet syntax (which we await - I very much hope it has the ability to embed Java methods). Note that *I* would often classify this as 'semantics' because it is often easier to define technical operations in terms of machine-based rules or specifications rather than prose. As an example, 'electronegativity' might be defined by an algorithm rather than prose.

Whether rendering (by stylesheets) adds semantic information will depend to some extent on the culture and experience of the readers. This has so far appeared to be the most prominent way of 'adding semantics'; I'd like to see some support for the others mentioned here.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From bckman at ix.netcom.com Sun Apr 26 22:36:59 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:00:50 2004
Subject: 'Heads up' note
Message-ID: <01bd716c$d73dcc40$0cacdccf@uspppBckman>

Robin,

I today put up two simple XML teaching tools at http://www.hypermedic.com/style/tools/tools.htm. They are written in VB, and thus are extremely small, but will only run on Windows (3.* or 95) platforms.

entity.exe (9K) will expand the entities in an XML DTD. The expanded file can be saved as a new XML file.

XMLparse.exe (22K) is a simple parsing tool that checks for well-formedness, and allows one to associate or generate a style sheet.
It will convert an XML document and CSS/XML style sheet into HTML and a CSS style sheet. The XML semantic information is retained as a class attribute in the HTML tag. It is explained fully in the Readme.txt.

The necessary DLLs are also at the above site if needed. I would emphasise that these are teaching tools which have been designed for simplicity of use, and can't handle files larger than 32 K.

Frank

From ht at cogsci.ed.ac.uk Mon Apr 27 00:43:56 1998
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 17:00:50 2004
Subject: Final alpha release of XED: lots of new features
Message-ID: <14547.199804262243@naomi.cogsci.ed.ac.uk>

The final (I hope) alpha release (0.3.1) of XED is now available for evaluation: http://www.ltg.ed.ac.uk/~ht/xed.html

There are lots of new features, and some changes to the UI, so I hope those of you who have made helpful comments and tried things out will take the time to re-download this version. Changes since the last announcement include:

>Changes from v0.2.1.4 as released:

Features:
Menubar instead of buttons, with operations grouped in pull-downs under 'File', 'Edit', 'Options' (all Alt/Meta key-bindings still work)
Preferences settings control key-binding mode, attr quotes, autofill, autoeol, autoindent, autosave
Font size menu
'Insert File' command added
Can flush tag memory for a clean start on demand
Autosaving support
C-/ toggles an empty element between
and versions.
Cut and Paste under WIN32 use clipboard for import/export
Better support for large element type inventories: consistent accelerators based on shortest unique prefix, mixed case handled properly
Accelerators for built-in menu entries changed to non-alphameric to avoid clashing with user-defined tags/attrs/entities
Double-click selects word/enclosing element/...
C-space sets anchor in Unix key-binding mode
Cross-modal pasting now allowed (e.g. from attr to content and vice versa)
Renaming supported for PIs, unwrapping/wrapping for PIs, CDATA sections and comments
Undo possible after save
C-g clears selection in Unix key-binding mode
File dirty is indicated by File* in menubar under WIN32, italic 'File' under UNIX

From srn at techno.com Mon Apr 27 03:05:47 1998
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun 7 17:00:50 2004
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)
In-Reply-To: <3541236C.FDF@hiwaay.net> (message from len bullard on Fri, 24 Apr 1998 18:42:36 -0500)
References: <01bd6f60$4e4db160$d8addccf@uspppBckman> <199804241122.HAA00254@unready.microstar.com> <3541236C.FDF@hiwaay.net>
Message-ID: <199804270103.UAA02407@bruno.techno.com>

This note is about the ISO republic vs. the W3C principality, so just skip it unless you want to know how happy I am to be at peace with both of them.

Tim Berners-Lee, bless his heart, made some very good early guesses on behalf of all of us, and, as far as I can tell, he continues to do so today.
It looks to me like he has (inadvertently or deliberately, I don't know which) created a position for himself not unlike that of Machiavelli's _Prince_, or perhaps the Philosopher King in Plato's _Republic_. This is a crushingly demanding role, and, while I admire his accomplishments, I do not envy him. Let's enjoy him while we can, and thank our lucky stars while he remains both willing and competent to fulfill that role.

In spite of my deep and well-founded suspicions about vendor consortia, I've been forced to conclude that the W3C has been a great thing; by their fruits we know them.

Even though the ISO did not and could not accomplish what Tim BL and his cohort did with HTML, I still urge my clients to rely on the ISO as the stablest, most broadly based and representative organization of all the economic interests on this planet. The ISO (or some eventual successor organization) will always necessarily exist. Therefore, it will always be there to inherit whatever drops from the nerveless grasp of shorter-lived, less broadly-based organizations.

It's important to remember that, even if the Web touches everything, the Web isn't everything. A happy victim of its own success, the Web is in the process of being assimilated by everything. The Web's leadership will not always be so influential. The Web is just one natural step in a networking process, one earlier step of which was the invention of the postage stamp. Today, nobody expects the U.S. Postmaster General to be a world-beater; only the title still reflects the former grandeur of the position.

As for XML, I am inclined to withhold all comment except praise for the way the XML thing has been handled. It's really quite an extraordinary example of "recruited public cooperation", and it reflects very well on both the recruiters and the participants. If all we can see is the political structure that allowed this XML crystallization to occur, we can't see what really happened.
It's so much more than that, the political structure pales to insignificance.

In the XML matter, I find particularly praiseworthy the care that has been taken, by many on all sides, to keep XML and its ISO bases in harmony with one another. You Know Who You Are, and regardless of how things turn out, may future generations bless you for your hard work and thankless commitment to the longterm success of humanity's Civilization Experiment.

I'm pleasantly astonished to discover that, as near as I can tell, we're pretty much all fighting on the same side here. We might as well recognize this fact for the miracle that it is, and take maximum advantage of it.

Thanks, Len, for making me pinch myself hard enough to realize that, after our years in the wilderness together, things are really pretty damn good and getting rapidly better.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com http://www.techno.com ftp.techno.com
voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152)
3615 Tanner Lane
Richardson, Texas 75082-2618 USA

From ak117 at freenet.carleton.ca Mon Apr 27 04:30:18 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:50 2004
Subject: SAX 1.0beta
Message-ID: <199804270226.WAA05250@unready.microstar.com>

I have just uploaded a beta version of SAX 1.0, which you can download through the following URL (there is also a link to online JavaDoc documentation):

http://www.microstar.com/XML/SAX/New/

The interface is now frozen except for bug fixes -- I promise not to tinker any more.
I'll wait a couple of days to collect bug reports, then will fix any bugs or documentation typos reported and will announce SAX 1.0 to the rest of the world. PLEASE send in any bug reports as soon as possible, or the bugs may live on for years in the final SAX.

There are a few significant changes from last week's final pre-release, mostly because I have finally allowed myself to be worn down by good arguments from patient people:

- everything that used to throw java.lang.Exception now throws SAXException; SAXException can embed any other kind of exception
- use Java-specific InputStream and Reader classes wherever ByteStream and CharacterStream were used before
- removed custom stream classes: ByteStream and CharacterStream in core; and ByteStreamAdapter, CharacterStreamAdapter, InputStreamAdapter, ReaderAdapter, and IOExceptionWrapper in helpers
- added support for specified byte-stream encoding in InputSource
- added helper classes AttributeListImpl and LocatorImpl
- updated much documentation

When I put together more documentation (beyond the verbose JavaDoc comments), I will provide an OMG-IDL definition of ByteStream and CharacterStream, which people can use as a guide for creating bindings across different programming languages. The java.io.InputStream and java.io.Reader classes can be mapped to these definitions.

Thanks and congratulations to the members of XML-DEV.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
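Two of the changes listed above affect application code most directly: exceptions that can embed another exception, and input sources that carry a declared byte-stream encoding. A minimal sketch of both follows. It uses Python's standard xml.sax module, a later binding of the SAX design, and not the Java beta announced in this message; the ElementCollector class and the parse_bytes and wrap_failure helpers are hypothetical names invented for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class ElementCollector(ContentHandler):
    """Hypothetical handler: records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_bytes(data, encoding):
    """Parse raw bytes, passing the caller's declared encoding to the parser."""
    source = InputSource()
    source.setByteStream(io.BytesIO(data))
    source.setEncoding(encoding)  # encoding declared by the caller
    handler = ElementCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(source)
    return handler.names


def wrap_failure(original):
    """Embed an arbitrary exception inside a SAXException."""
    return xml.sax.SAXException("application callback failed", original)


names = parse_bytes("<doc><item/><item/></doc>".encode("ISO-8859-1"), "ISO-8859-1")
wrapped = wrap_failure(ValueError("bad value"))
print(names, type(wrapped.getException()).__name__)
```

The embedded exception stays reachable through getException(), so a caller that only catches the SAX exception type can still recover the original cause.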
From peter at ursus.demon.co.uk Mon Apr 27 10:28:46 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:50 2004
Subject: SAX 1.0beta
In-Reply-To: <199804270226.WAA05250@unready.microstar.com>
Message-ID: <3.0.1.16.19980427075241.3d4fcc5e@pop3.demon.co.uk>

At 22:26 26/04/98 -0400, David Megginson wrote:
>I have just uploaded a beta version of SAX 1.0, which you can download

Great. Downloading it will be my first act today :-)

[...]
>
>Thanks and congratulations to the members of XML-DEV.

Let me be the first to congratulate David. It was a much larger task than it looked at the beginning. It also lays a wide foundation of experience for other activities in XML. Among the things I think we've learnt are:
- the value of the open process
- the need to have somebody (or some body) at the centre.
- the determination to go round the loop many times.
- the value of clear goals, even though they may shift
- the strength and communal spirit of the XML community

Some people have said that SAX is past the 80/20 that we set ourselves at the beginning - and perhaps achieved at the end of David's first month. It seems that expansion has mainly come in non-XML things that are required in a range of interoperable modern programs. Hopefully the strategy and/or details of these can be re-used in other interfaces.

From the point of view of an *application-writer* I think that SAX has stuck fairly closely to where we started, and with David's demo examples and documentation is very easy and quick to get started with. I agree completely with no tweaking (except for bug fixes).
My classpath is now confused between the various versions I have been playing with. [David - is there a driver for AElfred in this SAX release? I think the problems I raised with you privately were spurious.]

David and others have mentioned 'SAX 2'. It might be useful to review what this concept is and what its role would be. I have been very impressed by the community's contribution here and the will to develop open solutions. There is no shortage of other areas where such an effort would be valuable. We've shown we can do it - second time is often easier.

I am sure that in this period of rapid XML growth David and his employers have had many other calls on their time and - in general - I would like to thank all those companies which have been able to contribute so publicly to the XML-DEV effort. I hope that is seen to be a worthwhile investment.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg
From peter at ursus.demon.co.uk Mon Apr 27 10:28:48 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:50 2004
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)
In-Reply-To: <199804270103.UAA02407@bruno.techno.com>
References: <3541236C.FDF@hiwaay.net> <01bd6f60$4e4db160$d8addccf@uspppBckman> <199804241122.HAA00254@unready.microstar.com> <3541236C.FDF@hiwaay.net>
Message-ID: <3.0.1.16.19980427080932.223f2b74@pop3.demon.co.uk>

At 20:03 26/04/98 -0500, you wrote:
>
>This note is about the ISO republic vs. the W3C principality, so just
>skip it unless you want to know how happy I am to be at peace with
>both of them.

Thanks Steve - I think this sums up a good deal of what I feel (though I haven't had many dealings with ISO). I have been fulsome in my praise for the XML/W3C process - admittedly I hadn't labelled it as a vendor consortium partly because of the size of its membership list and partly because a number of members are not vendors. When I compare it with badly-driven secretive monopolistic 'standards' in both horizontal and vertical arenas there is no contest. I think we are always wise to be aware of Faustian bargains, but we are not in that position.

The official standards processes are often not fast. I believe it has taken 10 years to agree the chemical symbol for element 104 (Henry will correct me). Luckily this has not been of great technical importance during that time. Much more serious is that in transferring a Chlorine (Cl) from one program to another the atom gets 'converted' to a Carbon (C).
Reason - the author of the first program didn't know the 'format' was FORTRAN based and got it one column wrong. So XML will be a marvel if we can get the chemical community to realise it.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From m.mower at unl.ac.uk Mon Apr 27 12:20:44 1998
From: m.mower at unl.ac.uk (Matt Mower)
Date: Mon Jun 7 17:00:51 2004
Subject: Advice on XML editor development
Message-ID: <35465362.581216906@tara.unl.ac.uk>

Hi.

I am looking for some advice on possible approaches, useful tools, and helpful resources on developing an XML based editor. Some general requirements for the editor are:

0. It be simple enough to just "sit down and use"
1. It require no knowledge of XML to use
2. It be based entirely on pre-prepared DTDs
3. It only allow creation of documents conforming to a DTD
4. It provides help to users converting existing documents*
5. It be written in Java

(* this might be as simple as allowing them to copy&paste from another window.)

What I generally envisage is a kind of toolbar/wizard oriented editor where the user selects the type of document they want to edit by choosing a DTD. Then at each stage of editing they are presented with only the elements appropriate as children of the currently selected element. A wizard should be able to help them build complex elements & attributes. Basically at every stage of the editing process the document should be valid.
To give an idea of the concept I have in mind, here is a "sample editing session". Outline notes appear inside [].

---
"I want to write a module outline so I select module.dtd. Fill out the code, title and lecturer attributes. [Now we have a tree with just a root object: module]. Right now I need some learning objectives so I click on learning objectives in the task bar [this appears dynamically in the task bar because it can be added to a module]. Click and type in the text. [Learning objectives have no sub-elements so the task bar is empty, it knows it contains CDATA so it auto-magically allows the user to type text]. Now I want to add a lecture so I click on the module, then select lecture from the task bar. Add the week number and lecturer attributes. Now I add a learning objective to the lecture [Learning objective appears under the task bar for a lecture element], type in the text. Now I ...."
---

I would be grateful for any and all help that anyone in the XML-Dev community can provide. In particular I would be interested in :-

1. helpful resources and/or technologies
2. estimates of how hard this might be to develop
3. existing projects or code
4. people willing to collaborate on such a development

Obviously if anyone has a strong opinion that this particular kind of tool is going to be commercially available soon (e.g. Front page, NetObjects Fusion, ...) I would be interested in those as well.

Best regards.

Matt.

--
Matt Mower, Information Systems Team, University of North London
T: +44-(0)171-753-3288 F: +44-(0)171-753-5120 E: m.mower@unl.ac.uk
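The task-bar behaviour in the sample session can be sketched as a lookup from the selected element to the children its content model allows. The following is a minimal Python sketch under stated assumptions: the CONTENT_MODEL table, the element names, and the Node, task_bar and add_child names are all hypothetical, and the model ignores the ordering and repetition rules that a real DTD content model would also impose.

```python
# Hypothetical, simplified content model: element name -> allowed child elements.
# A real DTD also constrains order and repetition, which this sketch ignores.
CONTENT_MODEL = {
    "module": ["objective", "lecture"],
    "lecture": ["objective"],
    "objective": [],  # text only, so the task bar is empty
}


class Node:
    """One element in the document tree being edited."""

    def __init__(self, name):
        self.name = name
        self.children = []


def task_bar(node):
    """Element names the editor would offer for the selected node."""
    return list(CONTENT_MODEL[node.name])


def add_child(parent, name):
    """Insert a child only if the model permits it, so the tree never violates the model."""
    if name not in CONTENT_MODEL[parent.name]:
        raise ValueError(f"<{name}> is not allowed inside <{parent.name}>")
    child = Node(name)
    parent.children.append(child)
    return child


module = Node("module")
lecture = add_child(module, "lecture")
add_child(lecture, "objective")
print(task_bar(module), task_bar(lecture))
```

Because every insertion goes through add_child, an invalid structure is rejected at the moment of editing rather than discovered by a later validation pass, which is the property the requirements above ask for.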
From jamesr at steptwo.com.au Mon Apr 27 12:37:29 1998
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun 7 17:00:51 2004
Subject: Advice on XML editor development
In-Reply-To: <35465362.581216906@tara.unl.ac.uk>
Message-ID: <199804271037.UAA27494@magna.com.au>

At 20:05 27/04/1998, you wrote:
| 0. It be simple enough to just "sit down and use"
| 1. It require no knowledge of XML to use
| 2. It be based entirely pre-prepared DTD's
| 3. It only allow creation of documents conforming to a DTD
| 4. It provides help to users converting existing documents*

Sounds great.

| 5. It be written in Java

Revised point 5:

5. It is not written in Java.

New point 6:

6. It is easy to integrate with RAD development environments like Delphi, C++ builder, and yes, even Visual Basic.

No disrespect to all those in the XML community, but the bulk of the world uses tools other than Java. And it would be nice to think that we, too, can make use of XML.

Just my $0.02.

Regards,
James

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au
"Beyond the Idea"
ACN 081 019 623
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Mon Apr 27 12:48:57 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:51 2004 Subject: SAX 1.0beta In-Reply-To: <199804270226.WAA05250@unready.microstar.com> Message-ID: <3.0.1.16.19980427104814.3c8f16c4@pop3.demon.co.uk> Having wasted some of DavidM's valuable time I have just (I hope) debugged a problem due to clashing versions of SAX. This may just be a transitional problem (i.e. gone tomorrow when everyone has converted to SAX1.0beta), but it may also need some formal addressing by parser writers. (Nothing here is judgmental!) I have recently been trying to get a distribution for JUMBO2 which interfaces with several parsers (at present AElfred, DXP and Lark, though the limit is simply my time to test others). Having constructed these I discovered weird bugs which bit me late at night and did not give up. I have finally tracked the reason down to the fact that: - DXP distributes a version of org.xml.sax, which is not compatible with the final release SAX10beta. - DXP was ahead of SAX in my classpath. The loader therefore picked up this incompatible version. The question, therefore, is how should parser developers distribute SAX? The DXP strategy is self-contained and therefore useful if someone is just using DXP, but it has this possibility of tricky interactions. Should we encourage packaging of SAX or parallel downloading? Should we have versioning within SAX that is machine-checkable? I am sure there are experts in this field. P.
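A clash of the kind described above (two copies of org.xml.sax on the classpath, with the wrong one found first) can be diagnosed by asking the JVM where a class was actually loaded from. The following is only a sketch written against a present-day JDK, not something taken from the thread: the class name SaxOrigin and the helper locationOf are hypothetical, and org.xml.sax.Parser is used merely as a representative SAX 1.0 interface. Classes supplied by the runtime itself report no code source, which the helper turns into a fixed note.

```java
import java.security.CodeSource;

final class SaxOrigin {
    /**
     * Returns the URL (as text) of the JAR or directory a class was loaded
     * from, or a fixed note when the JVM does not record one (for example,
     * classes supplied by the runtime itself).
     */
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown location)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumption: the class to inspect is passed as the first argument;
        // org.xml.sax.Parser is only a default for illustration.
        String name = args.length > 0 ? args[0] : "org.xml.sax.Parser";
        Class<?> c = Class.forName(name);
        System.out.println(name + " loaded from " + locationOf(c));
    }
}
```

Running this once with each candidate ordering of the classpath shows which copy of the package wins, which is usually enough to confirm or rule out a version clash.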
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From marc at kleurbeeld.be Mon Apr 27 13:14:11 1998 From: marc at kleurbeeld.be (Marc Veldman) Date: Mon Jun 7 17:00:51 2004 Subject: Advice on XML editor development Message-ID: <199805010425.NAA04925@kleurbeeld.be> [James Robertson] > 6. It is easy to integrate with RAD development environments > like Delphi, C++ builder, and yes, even Visual Basic. > No disrespect to all those in the XML community, but the bulk > of the world uses tools other than Java. And it would be > nice to think that we, too, can make use of XML. For those of you who might be interested, I am currently doing some work on a tool written in Delphi. It includes a number of XML classes as well as some XML editing tools. Is anyone else doing XML development in Delphi? Marc Veldman kleur & beeld b.v. - digitaal denken en drukken e-mail: marc@kleurbeeld.be xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tug at wilson.co.uk Mon Apr 27 13:29:23 1998 From: tug at wilson.co.uk (John Wilson) Date: Mon Jun 7 17:00:51 2004 Subject: SAX 1.0beta Message-ID: <002401bd71cf$99616db0$0a01d30a@bach.wilson.co.uk> >The question, therefore, is how should parser developers distribute SAX? >The DXP strategy is self-contained and therefore useful if someone is just >using DXP, but it has this possibility of tricky interactions. Should we >encourage packaging of SAX or parallel downloading? Should we have >versioning within SAX that is machine-checkable? I am sure there are >experts in this field. JDK 1.2 will have features which should help with this: The JAR file manifest has been extended to provide version information and to let packages define the extensions and libraries that are needed and to provide links to JAR files containing the extensions if they are not available locally. So: SAX should be packaged as an extension (this doesn't require any change to the code, just the manifest in the JAR file). Parsers should be packaged so that their manifest shows that SAX is required and, optionally, provide their own version. SAX should have a well-defined convention for version numbering and this should be implemented in the JAR file manifest. Parsers should check the version of SAX loaded using java.lang.Package.isCompatibleWith() and complain if the check fails. JDK 1.2 pretty much removes the need for CLASSPATH (and about time). [It's a shame that manifests aren't marked up in XML!]
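The version check proposed above can be sketched as follows. This is a minimal illustration under stated assumptions rather than anything SAX itself ships: SaxVersionCheck and isAtLeast are hypothetical names, the required version "1.0" is chosen only for the example, and the check can succeed only if the JAR that supplies the package carries a Specification-Version entry in its manifest. java.lang.Package.isCompatibleWith() throws NumberFormatException when the version is unknown or not in dotted-number form, so the sketch treats both cases as "not compatible".

```java
final class SaxVersionCheck {
    /**
     * Returns true only if the package declaring {@code anchor} publishes a
     * specification version in its manifest and that version is at least
     * {@code required} (dotted decimal, e.g. "1.0").
     */
    static boolean isAtLeast(Class<?> anchor, String required) {
        Package pkg = anchor.getPackage();
        if (pkg == null || pkg.getSpecificationVersion() == null) {
            return false; // no version information in the manifest
        }
        try {
            return pkg.isCompatibleWith(required);
        } catch (NumberFormatException e) {
            return false; // version string is not in dotted-number form
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumption: org.xml.sax.Parser stands in for "the SAX package".
        Class<?> sax = Class.forName("org.xml.sax.Parser");
        if (!isAtLeast(sax, "1.0")) {
            System.err.println(
                "SAX 1.0 or later not confirmed (older copy, or no manifest version)");
        }
    }
}
```

Returning false when no version is published is a deliberately conservative choice: a parser that cannot confirm the SAX version complains rather than failing later in some obscure way.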
John Wilson The Wilson Partnership 5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK +44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax) Mailto: tug@wilson.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Mon Apr 27 14:39:39 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:00:51 2004 Subject: Advice on XML editor development In-Reply-To: <35465362.581216906@tara.unl.ac.uk> Message-ID: <3.0.1.16.19980427123854.0d5f0ccc@pop3.demon.co.uk> Thanks very much Matt, At 10:05 27/04/98 +0000, Matt Mower wrote: >Hi. > >I am looking for some advice on possible approaches, useful tools, and >helpful resources on developing an XML based editor. > >Some general requirements for the editor are: > >0. It be simple enough to just "sit down and use" >1. It require no knowledge of XML to use >2. It be based entirely pre-prepared DTD's I take this to mean "you can only use it if you have a DTD, but it will accept any DTD", rather than "a set of DTDs must be hardwired in (as in HTML editors)" >3. It only allow creation of documents conforming to a DTD >4. It provides help to users converting existing documents* >5. It be written in Java There are some additional requirements that I expect you will want very shortly :-). Think of (say) Netscape Composer (which is similar to an SGML/XML editor with a fixed DTD) and ask what additional possibilities would be useful. 
You may (rightly) reject them as bloating your ideas but you will certainly find other people wanting them :-) It should support tree-based and stream-based editing (my guess is that you were primarily thinking of streams) It should support ID/IDREF and xml:link in a graphical manner (e.g. dragon-drop) It should be possible to embed images (and possibly other non-textual material) It should allow searching ('Find') It should support entity management. [I could add lots more, but I suspect that this gives you an idea of the increasing range.] Note that there is a lot of experience in the SGML (sic) community about creating editors and it's worth talking to experienced people and getting demos to find out exactly what the range of things you really want is. > >(* this might be as simple as allowing them to copy&paste from another >window.) > >What I generally envisage is a kind of toolbar/wizard oriented editor >where the user selects the type of document they want to edit by >choosing a DTD. Then at each stage of editing they are presented with >only the elements appropriate as children of the currently selected >element. A wizard should be able to help them build complex elements & >attributes. Basically at every stage of the editing process the document >should be valid. 'every stage' means that every keystroke in a text window must be checked for validity. It would be impossible to type a start tag without the end tag. So I suspect you will allow some relaxation of this :-) > [...] > >I would be grateful for any and all help that anyone in the XML-Dev >community can provide. In particular I would be interested in :- > >1. helpful resources and/or technologies >2. estimates of how hard this might be to develop >3. existing projects or code >4. people willing to collaborate on such a development > >Obviously if anyone has a strong opinion that this particular kind of >tool is going to be commercially available soon (e.g. Front page, >NetObjects Fusion, ...) 
I would be interested in those as well. If you are simply interested in getting a usable editor (i.e. your documents are more important than the tools to create them), it may be worth waiting. If you are impatient, or have an application (like I do) that requires a bespoke editor, read on. 1. The com.sun.java.swing class library is undoubtedly the single most powerful Java resource that I know about. It's freely available, the source is open and it is being used by thousands. The only downside is that it is so comprehensive that it takes a lot of effort to get to grips with. This is mainly because developing editors is much tougher than you might think :-) As an example there are nearly 50 classes in the swing.tree package and a similar number in swing.text. I am almost certain that everything you need to create a high-quality XML browser/editor is there (except, of course, the parser/API :-). I don't know whether it's a spinoff from hotjava, but the text classes pay homage to SGML as their inspiration. 2. To do a full job, starting from scratch is hard. To do a partial job ... depends how partial. BUT... 3/4. I have rewritten JUMBO2 to use Swing wherever possible. I am making the source public (in the next few days). I would be delighted for this to act as a seed in a communal project. JUMBO2 uses SAX to choose parsers, uses JTree/DefaultMutableTreeNode to give a Tree-based display, uses JTable for attribute display and editing, uses DefaultStyledDocument for the text, etc. The main things I am concentrating on now are: how to apply namespaces; how to apply simple style interactively; how to validate partially formed documents (I will do this with a SAX1.0-compliant validating parser when one appears). There is a lot to be gained by doing this collaboratively. For example, I have encountered a few buglets I can't track down (e.g. icons don't display reliably). 
I'd really like to concentrate on the XML stuff, while I'm sure there are Swing experts out there who would take a fraction of the time. P. >Matt Mower, Information Systems Team, University of North London >T: +44-(0)171-753-3288 F: +44-(0)171-753-5120 E: m.mower@unl.ac.uk If you are interested in meeting in a pub, I live in London. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Apr 27 15:11:18 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:51 2004 Subject: SAX Version Conflicts (was Re: SAX 1.0beta) In-Reply-To: <3.0.1.16.19980427104814.3c8f16c4@pop3.demon.co.uk> References: <199804270226.WAA05250@unready.microstar.com> <3.0.1.16.19980427104814.3c8f16c4@pop3.demon.co.uk> Message-ID: <199804271306.JAA00333@unready.microstar.com> Peter Murray-Rust writes: > I have recently been trying to get a distribution for JUMBO2 which > interfaces with several parsers (at present AElfred, DXP and Lark, though > the limit is simple my time to test others). Having constructed these I > discovered weird bugs which bit me late and night and did not give up. I > have finally tracked the reason down to the fact that: > - DXP distributes a version of org.xml.sax, which is not compatible with > the final release SAX10beta. > - DXP was ahead of SAX in my classpath. > The loader therefore picked up this incompatible version. 
This is the problem that I had planned to avoid by prefixing all of the class names with "SAX...", but which, as people rightly pointed out, is a very temporary one that shouldn't leave a permanent mark on SAX. The people at DataChannel have been very supportive of and helpful with SAX, and I expect that this problem will vanish with a new DXP release not too long after SAX moves out of beta (the latter event will happen sometime this week, once everyone has had a chance to submit bug reports or documentation corrections). For the next couple of weeks, until we have new versions of AElfred, DXP, XP, and IBM's XML for Java with SAX 1.0 drivers, it might make sense to maintain two separate CLASSPATH's. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at step.de Mon Apr 27 15:12:56 1998 From: larsga at step.de (Lars Marius Garshol) Date: Mon Jun 7 17:00:51 2004 Subject: SAX 1.0beta References: <002401bd71cf$99616db0$0a01d30a@bach.wilson.co.uk> Message-ID: <354482EB.AA560A6E@step.de> John Wilson wrote: > > SAX should have a well defined convention for version numbering Agreed. > and this should be implemented in the JAR file manifest. That's perhaps not the best way, since it is so Java-specific. I've already proposed a different solution[1], which added two methods to SAX and two to the parser interface, but there were no responses (except one from David which never reached the list) and it hasn't appeared in SAX 1.0, so I guess that means it won't be in 1.0. --Lars M. 
[1] http://www.lists.ic.ac.uk/hypermail/xml-dev/9803/0168.html (The proposal has some ParserFactory stuff as well, but that's really a separate issue. Also, the names are bad and should be changed.) From jmulet at datalab.es Mon Apr 27 15:19:15 1998 From: jmulet at datalab.es (Jordi Mulet) Date: Mon Jun 7 17:00:51 2004 Subject: Advice on XML editor development Message-ID: <01bd71de$62eb2d20$67ac92c0@jordi.praxis.es> Hello, We have also planned to develop a Delphi client/server application to manage SGML/XML documents and the links between different SGML/XML documents. Any information about this development would be much appreciated! Thanks, -----Original Message----- From: Marc Veldman To: xml-dev@ic.ac.uk Date: Monday, 27 April 1998 13:05 Subject: Re: Advice on XML editor development >[James Robertson] > >> 6. It is easy to integrate with RAD development environments >> like Delphi, C++ builder, and yes, even Visual Basic. > >> No disrespect to all those in the XML community, but the bulk >> of the world uses tools other than Java. And it would be >> nice to think that we, too, can make use of XML. > >For those of you who might be interested, I am currently doing some work on >a tool written in Delphi. It includes a number of XML classes as well as some >XML editing tools. Is anyone else doing XML development in Delphi? > >Marc Veldman > >kleur & beeld b.v. - digitaal denken en drukken >e-mail: marc@kleurbeeld.be > > >xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tug at wilson.co.uk Mon Apr 27 15:27:57 1998 From: tug at wilson.co.uk (John Wilson) Date: Mon Jun 7 17:00:51 2004 Subject: SAX 1.0beta Message-ID: <004b01bd71e0$263f2f50$0a01d30a@bach.wilson.co.uk> Lars Marius Garshol wrote: >> and this should be implemented in the JAR file manifest. > >That's perhaps not the best way, since it is so Java-specific. Yes, though none of this would preclude adding explicit version enquiry methods to the SAX interface. Consider this as a proposal for how the Java version might be packaged. Java parsers could safely use java.lang.Package.isCompatibleWith() as they _know_ that they are written in Java and are talking to a Java package. John Wilson The Wilson Partnership 5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK +44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax) Mailto: tug@wilson.co.uk xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Mon Apr 27 15:42:03 1998 From: jtauber at jtauber.com (James K. Tauber) Date: Mon Jun 7 17:00:51 2004 Subject: Advice on XML editor development In-Reply-To: <3.0.1.16.19980427123854.0d5f0ccc@pop3.demon.co.uk> Message-ID: <002001bd71e1$23a2e420$d46118cb@caleb> On this issue of XML editor development, what work has been done on style (behaviour) sheets for editing? I don't just mean display information for use in near-WYSIWYG editors but a generic editing framework that reads in an "editing sheet" that turns it into an application-specific editor. What I am envisaging is a generic editor that given an editing sheet for MathML would become an equation editor or given an editing sheet CML would enable editing of molecular structure diagrams. Of course, such editing sheets would include code but that code would presumable have a lot of overlap with the code attached to XSL stylesheets for display. I would imagine Java Classes, for example with both display and editing interfaces for this purpose. Is anyone working on this sort of thing? James -- James Tauber / jtauber@jtauber.com Perth, Western Australia XML Pages: http://www.jtauber.com/xml/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at dvs1.informatik.tu-darmstadt.de Mon Apr 27 15:54:21 1998 From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret) Date: Mon Jun 7 17:00:51 2004 Subject: Inheritance in XML Message-ID: <199804271251.OAA05859@berlin.dvs1.tu-darmstadt.de> My thanks to everyone who participated in the semantics and XML-DEVIL posts for a most amusing Monday morning. I get the list in digest form, so you can only imagine the cumulative impact of reading this all at once... I'd like to answer a question Matthew Gertner asked last Monday. (That was a week ago. Remember? Before Noam Chomsky and the lawyers from Chicago?) > Having given this some more thought, I don't see any practical way to insert > new content in the middle of an existing content model. Maybe someone > cleverer than I has an idea about how this might be done (and whether it is > really useful). In the meantime, one useful approach might be to at least > enable new content to be added at the beginning of the base content model by > adding a #BASECONTENT keyword which is replaced by the base content model in > the derived element type description: Extending an existing DTD by inserting new content in the middle is definitely useful. Imagine that I had a simple DTD for books: ... Now I want to add a list of figures after the table of contents and before the chapters: I can't easily do this without insertion. Note that a very simple way to do this is for my derived document to simply redefine the Book element. 
In effect, the two (or more) DTDs are simply merged before normal processing, with the precedence for multiply defined terms going to definitions in the derived document. Although this probably doesn't satisfy a lot of requirements for polymorphism, etc., it does satisfy Matthew's requirement in a later post, which I agree would greatly increase the applicability of XML. > All I want to be able to do is scoot over to the DTD repository site, > check for a standard DTD for invoices, grab it, extend it with the two or > three extra attributes and/or contained element types that I need and use it, > while still being able to use any tools that are designed to work with the > original invoice DTD. I truly believe that this is where XML will really > start to fulfill its promise. It is easy to imagine a large number of simple, useful DTDs: date/time, measurement, word definitions, travel itineraries, bibliographies, etc. Has anybody working on the namespaces spec considered this sort of functionality? It strikes me as being very close. -- Ron Bourret From bckman at ix.netcom.com Mon Apr 27 16:04:16 1998 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:00:51 2004 Subject: xml-devil group. was open standards Message-ID: <01bd71fe$c5fbfe60$ccaddccf@uspppBckman> I promise that this will be the last mailing to the list from me on this subject!! I recently sent out an e-mail to all those who I thought had an interest in the xml-devil proposal. If you did not receive an e-mail from me and are interested, please let me know. Frank Boumphrey. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marc at kleurbeeld.be Mon Apr 27 16:07:05 1998 From: marc at kleurbeeld.be (Marc Veldman) Date: Mon Jun 7 17:00:51 2004 Subject: Advice on XML editor development Message-ID: <199805010716.QAA06120@kleurbeeld.be> [Jordi Mulet wrote:] > We have also planned to develop a Delphi Client/server application to > manage > SGML/XML documents and the links between different SGML/XML documents. > Any information about this development will be very appreciated! I'm writing a number of classes to parse XML documents as well as a number of classes to get data from a document given an element type and attribute values. The idea is to produce a front-end for XML documents so that you can access them much like a database. A second thing I'm working on is a simple XML editor. I don't feel ready yet to show the source code to anyone (I will have to do some cleaning up & commenting, and some serious debugging), but if you'd like to exchange ideas and comments, we'll keep in touch. I will mail you as soon as I get some source ready. Have a nice day, Marc Veldman kleur & beeld b.v. - digitaal denken en drukken e-mail: marc@kleurbeeld.be xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at opengroup.org Mon Apr 27 16:12:54 1998 From: b.laforge at opengroup.org (Bill la Forge) Date: Mon Jun 7 17:00:52 2004 Subject: Advice on XML editor development In-Reply-To: <01bd71de$62eb2d20$67ac92c0@jordi.praxis.es> Message-ID: <3.0.3.32.19980427101154.00982be0@postman.osf.org> My needs are somewhat different. I've written an extensible XML processor, coins, and would like to turn it into a useful tool. The current parser is quite brain dead, but then I wrote the silly thing in one weekend: http://www.camb.opengroup.org/~laforge/coins/src/ORG/opengroup/coins/CoinLoader.java I'm looking for a parser I can adapt to my needs or code I can crib. Specifically, a non-validating Java parser. (Speed is primary here, and validation is done by the generated marshalling code anyway.) But a real complicating factor is that I need to be able to parse the XML, modify the parse tree, and then recreate XML for output. To date I've ignored entities, as they could really complicate this process. All the work I've done so far is pretty much unencumbered, requiring only that the copyright notice be retained, even for commercial applications. So any code I use needs to be similarly unencumbered. Alternatively, I'd be glad to work with someone who has an interest in parsing. For me, it is more of an obstacle than an opportunity. Thanks! Bill la Forge Sr. Research Engineer Research Institute of The Open Group xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From m.mower at unl.ac.uk Mon Apr 27 16:31:57 1998 From: m.mower at unl.ac.uk (Matt Mower) Date: Mon Jun 7 17:00:52 2004 Subject: Advice on XML editor development In-Reply-To: <3.0.1.16.19980427123854.0d5f0ccc@pop3.demon.co.uk> References: <3.0.1.16.19980427123854.0d5f0ccc@pop3.demon.co.uk> Message-ID: <354c7f30.592430000@tara.unl.ac.uk> On Mon, 27 Apr 1998 12:38:54 +0000, Peter Murray-Rust wrote: >>2. It be based entirely pre-prepared DTD's > >I take this to mean "you can only use it if you have a DTD, but it will >accept any DTD", rather than "a set of DTDs must be hardwired in (as in >HTML editors)" > Yes, that is exactly what I meant. > It should support tree-based and stream-based editing (my guess is that >you were primarily thinking of streams) > I'm not familiar enough with the terminology here. I'll tell you what I think the difference is. A tree-based editor is something like XMLPro, which displays a document as a hierarchy of objects; a stream-based editor is something like Composer, where you type free text (albeit in a structured manner). If the above is accurate then my intention is for this to be a tree-based editor, with DTD elements acting as "objects" that can be manipulated sensibly. > It should support ID/IDREF and xml:link in a graphical manner (e.g. >dragon-drop) > I'm not sure what this is. > It should be possible to embed images (and possibly other non-textual >material) > Yes, in some form or other. > It should allow searching ('Find') > Certainly. > It should support entity management. > That's probably less of an issue for us (in our specific intention at the moment) but I would accept it in general.
>Note that there is a lot of experience in the SGML (sic) community about >creating editors and it's worth talking to experienced people and getting >demos to find out exactly what the range of things you really want is. >> Can you suggest a forum in which I could raise this issue? >'every stage' means that every keystroke in a text window must be checked >for validity. It would be impossible to type a start tag without the end >tag. So I suspect you will allow some relaxation of this :-) > No tag typing. The idea is that when an element is selected a tool bar will offer all valid sub-elements. Clicking the button creates the element and offers a wizard for filling in attributes and so forth. I really don't want users to know that XML is going on in the background. >1. The com.sun.java.swing class library is undoubtedly the single most >powerful Java resource that I know about. It's freely available, the source > I have reliability concerns about swing. In principle though I do agree with you. Best regards. Matt. -- Matt Mower, Information Systems Team, University of North London T: +44-(0)171-753-3288 F: +44-(0)171-753-5120 E: m.mower@unl.ac.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matthew at praxis.cz Mon Apr 27 17:15:49 1998 From: matthew at praxis.cz (Matthew Gertner) Date: Mon Jun 7 17:00:52 2004 Subject: Fw: And now what? (was Re: Inheritance in XML) Message-ID: <01bd71ef$5916c9b0$020b0ac0@xerius> Oops, meant to post this to the list... -----Original Message----- From: Matthew Gertner To: Peter Murray-Rust Date: Monday, April 27, 1998 3:33 PM Subject: And now what? 
(was Re: Inheritance in XML) >Peter, > >The direction you are suggesting seems very promising. One other area worth >considering would be a system for the classification and retrieval of >existing DTDs. This might include some kind of repository technology >(anything from the file system to an OODBMS), a DTD for DTDs (related to >your first point), an indexing engine and, my favorite, a taxonomy for DTDs >a la Yahoo. The latter would enable a user to "drill down" in a hierarchy to >find the DTD s/he needs. > >I remember reading about something along these lines on the list a while >back. (I have the name "TagNet" in the back of my mind.) Maybe someone can >refresh my memory. Anyway, since you are talking about actually sitting down >and developing something of communal use, I would certainly be interested in >becoming involved in this sort of effort. > >>- linking ELEMENTs to software (i.e. behaviour). This can either be done on >>an implicit basis (e.g. CML:MOL links to jumbo.cml.MOLNode probably through >>xml:namespace pointers) or through the stylesheet syntax (which we await - >>I very much hope it has the ability to embed Java methods). >> Note that *I* would often classify this as 'semantics' because it is often >>easier to define technical operations in terms of machine-based rules or >>specifications rather than prose. As an example, 'electronegativity' might >>be defined by an algorithm rather than prose. > >This is very cool. A JavaBean/COM object/whatever wrapper that exports XML >attributes as getter/setter methods would also be quite neat, since it would >provide plug and play integration into visual RAD environments for XML data >sources. This could be implemented as a base interface (XMLElementType) from >which a specific class could be derived for a given element type. Does >anything like this exist? > >Cheers, > >Matthew > > xml-dev: A list for W3C XML Developers.

From jtauber at jtauber.com Mon Apr 27 17:45:17 1998
From: jtauber at jtauber.com (James K. Tauber)
Date: Mon Jun 7 17:00:52 2004
Subject: And now what? (was Re: Inheritance in XML)
In-Reply-To: <01bd71ef$5916c9b0$020b0ac0@xerius>
Message-ID: <003001bd71f2$bb39e5c0$d46118cb@caleb>

[Matthew Gertner]
> The direction you are suggesting seems very promising. One
> other area worth considering would be a system for the classification and retrieval
> of existing DTDs. This might include some kind of repository technology
> (anything from the file system to an OODBMS), a DTD for DTDs
> (related to your first point), an indexing engine and, my favorite, a
> taxonomy for DTDs a la Yahoo. The latter would enable a user to "drill down"
> in a hierarchy to find the DTD s/he needs.

I have been wanting to do something like this for a while as a follow-on from the document type list on my XML site. I think I might have said so in a previous post a while back; I certainly mentioned it to PMR.

There is an appallingly simple taxonomy at my site and no direct links to the DTDs themselves (although I plan to change that this week). I'd like to spin off that part of my XML site and have registered a domain name for that purpose (although there isn't a site there yet).

Let's do this!

James

--
James Tauber / jtauber@jtauber.com
Perth, Western Australia
XML Pages: http://www.jtauber.com/xml/

From tbray at textuality.com Mon Apr 27 18:16:08 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:52 2004
Subject: Advice on XML editor development
Message-ID: <3.0.32.19980427090911.00b0a920@pop.intergate.bc.ca>

At 10:11 AM 4/27/98 -0400, Bill la Forge wrote:
>I'm looking for a parser I can adapt to my needs or code I can crib. Specifically, a non-validating Java parser. (Speed is primary here, and
>validation is done by the generated marshalling code anyway.) But a real
>complicating factor is that I need to be able to parse the XML, modify the
>parse tree, and then recreate XML for output. To date I've ignored entities,
>as they could really complicate this process.

Several of the Java parsers either build a tree directly or make it easy to do so. Given a tree, the amount of code necessary to construct an XML output instance is like 25 lines, on a bad day. -T.

From M.H.Kay at eng.icl.co.uk Mon Apr 27 19:09:24 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:00:52 2004
Subject: Final alpha release of XED: lots of new features
Message-ID: <002a01bd71ff$3bf2a9c0$1e09e391@mhklaptop.bra01.icl.co.uk>

>The final (I hope) alpha release (0.3.1) of XED is now
>available for evaluation:

1. I downloaded OK.
2. I tried to find the installation instructions and failed.
3. I tried to guess the installation instructions and failed again.
4. I gave up.

Mike Kay
(using Win95, and probably being very stupid...)

From b.laforge at opengroup.org Mon Apr 27 19:22:03 1998
From: b.laforge at opengroup.org (Bill la Forge)
Date: Mon Jun 7 17:00:52 2004
Subject: Advice on XML editor development
In-Reply-To: <3.0.32.19980427090911.00b0a920@pop.intergate.bc.ca>
Message-ID: <3.0.3.32.19980427132306.009a1650@postman.osf.org>

At 09:09 AM 4/27/98 -0700, Tim Bray wrote:
>Several of the Java parsers either build a tree directly or make it
>easy to do so. Given a tree, the amount of code necessary to
>construct an XML output instance is like 25 lines, on a bad day.
> -T.

But what about entities? If they are expanded when the XML is parsed, things get a little complicated when the parse tree is converted back to XML, unless there is no attempt at restoring the original entities.

Now I currently track which nodes have been modified (dirty), where a node is counted as modified if any of its children have been modified. --I use the original input for output when a node is unmodified. I would need to track the pre-expanded entity input for each node.

But let's say an entity expands to XML which becomes two of the children of a common node. The problem now is if either child is modified, the subsequent XML would not be able to reference the original entity.

(Yes, this probably turns into the answer I want, but I suspect there is a lot more work here.
Particularly since I currently have no code for expanding the entities in the first place.)

Please feel free to look at what I've done. The zip file can be found at http://www.camb.opengroup.org/~laforge/coins/#related_links

But as I said before, the bulk of that has to do with the extensible processor, and only the small CoinLoader class actually does any parsing.

Bill

From ht at cogsci.ed.ac.uk Mon Apr 27 19:33:21 1998
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 17:00:52 2004
Subject: Final alpha release of XED: lots of new features
In-Reply-To: "Michael Kay"'s message of Mon, 27 Apr 1998 18:09:26 +0100
References: <002a01bd71ff$3bf2a9c0$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID:

"Michael Kay" writes:

> 1. I downloaded OK.
> 2. I tried to find the installation instructions and failed.
> 3. I tried to guess the installation instructions and failed
> again.

There are no installation instructions because all you have to do is unpack the zip file and run xed.exe. If you tried that and failed, I need to know the nature of the failure. If you guessed something else, let me know so I can forestall the misunderstanding for others.

Sorry to have wasted your time, hope this helps,

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/

From jarle.stabell at dokpro.uio.no Mon Apr 27 19:37:16 1998
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun 7 17:00:52 2004
Subject: XML and Delphi (was: Advice on XML editor development)
Message-ID: <3.0.32.19980427193700.009a4860@hedvig.uio.no>

Marc Veldman wrote:
>For those of you who might be interested, I am currently doing some work on
>a tool written in Delphi. It includes a number of XML classes as well as some
>XML editing tools. Is anyone else doing XML development in Delphi?

My company has used a *very* simple subset of SGML for some Delphi written applications. We will migrate to XML in the future, or at least as close/much as we practically can get.

(Due to the specific nature of one of our applications, 99% of the users of this app would probably hate case-sensitivity, so we will probably implement a flag to indicate whether to enforce this or not. This app may also benefit quite a lot from the illegal short end tag (</>), so we will implement a flag allowing this if our users want it. (We will of course implement an XML exporter/"cleaner", which will "clean up"/remove our non-XML compliant stuff))

We need something like Sax Level 2 for Delphi, as we need to update/edit XML documents programmatically. (I haven't read the latest/final XML 1.0 spec yet, but I suspect that entity references represent the most "hairy" aspect about programmatic updating of XML documents?)

Cheers,
Jarle Stabell
Digital Logikk AS

From andrew at epiphanysoftware.com Mon Apr 27 19:43:39 1998
From: andrew at epiphanysoftware.com (Andrew Cogan)
Date: Mon Jun 7 17:00:52 2004
Subject: Advice on XML editor development
References: <199804271037.UAA27494@magna.com.au>
Message-ID: <3544C383.F29878D3@epiphanysoftware.com>

James Robertson wrote:
[snip]
> New point 6:
>
> 6. It is easy to integrate with RAD development environments
> like Delphi, C++ builder, and yes, even Visual Basic.
>
> No disrespect to all those in the XML community, but the bulk
> of the world uses tools other than Java. And it would be
> nice to think that we, too, can make use of XML.

I agree. At this point, Java is very high profile but makes up only a tiny percentage of commercial projects.

One thing that occurred to me as a possible basis for creating an XML editor is the new Microsoft DHTML editor ActiveX control. It's a prerelease right now, but as it has the full HTML, DHTML, and scripting capabilities of Internet Explorer, plus (I think) access to Microsoft's XML parser, it seems like it might be a good foundation. I haven't used it yet myself, so I can't say from personal experience if it's suitable. See for yourself at http://www.microsoft.com/workshop/author/dhtml/edit/.

--
Andrew Cogan, Epiphany Software

From David.Brownell at Eng.Sun.COM Mon Apr 27 19:47:19 1998
From: David.Brownell at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:00:52 2004
Subject: SAX Version Conflicts (was Re: SAX 1.0beta)
Message-ID: <199804271745.KAA17177@argon.eng.sun.com>

Versioning of APIs is in general a problem that few platforms claim to have solved. I think SAX is basically OK, since the January version wasn't to be the "1.0" release. The classic strategy is:

- APIs in development can have incompatible changes. (SAX to date has been in development.)
- Once they are "final", all changes must be compatible. (From now on ... )

For Java, see chapter 13 of the Java Language Specification:

http://java.sun.com/docs/books/jls/html/13.doc.html

That doesn't address the JDK 1.2 package versioning work, but it does give some background that may be useful for non-Java systems.

It also doesn't address the compatibility with respect to serialized objects; a parser API should have no need to worry about that stuff though! I like the idea of never using serialization; instead, it's safer to use Externalization, since developers must explicitly think about such data interfaces. (Hmm, data interfaces ... "XML"? ;-)

- Dave

From lauren at sqwest.bc.ca Mon Apr 27 19:58:04 1998
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:00:52 2004
Subject: 1998-04-20 Pre-Release, with Road Map and Demos
In-Reply-To: <014b01bd6e89$a52badc0$2ee044c6@donpark>
Message-ID:

At 23/04/98 12:30 AM , Don Park wrote:
>>I think that the chair of the DOM WG, Lauren Wood, reads this list;
>>perhaps she can comment or correct any mistake that I might have made
>>here.

Yes I do read this list; however it's probably best to not rely on my doing so. If you have comments on the DOM, please send them to the mailing list at www-dom@w3.org. You do have to subscribe first as a spam-prevention measure; send email to www-dom-request@w3.org with the subject line "subscribe".

Since I try to be as agnostic as I can with respect to decisions made by the group, sending email to www-dom is the best thing to do. Almost all of the WG members are on that list, and everything is read, even if not immediately answered.

We'd welcome your comments - that's why we write public drafts, so we can get comments as to feasibility, relevant experience, etc. If you think some of the decisions made won't work, please let us know. And not only in xml-dev, since not everyone on the DOM WG will necessarily read it.

cheers,

Lauren

From michael at textscience.com Mon Apr 27 20:27:34 1998
From: michael at textscience.com (Michael Leventhal)
Date: Mon Jun 7 17:00:52 2004
Subject: Final alpha release of XED: lots of new features
References: <002a01bd71ff$3bf2a9c0$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <3544CB4E.234364E@shell1.aimnet.com>

Henry S. Thompson wrote:

> There are no installation instructions because all you have to
> do is unpack the zip file and run xed.exe. If you tried that and

Well I got past this part o.k. although a five word README would be a worthwhile addition. If you feel ambitious a five sentence README on how the editor itself works would help. In particular I haven't been able to figure out what 'doesn't read the DTD in detail' might mean.

My working hypothesis at the moment is that not reading it in detail means that it is smart enough to skip the internal subset and that the context-sensitive insertion help is gleaned entirely from the instance structure. It also appears that instance-gleaned hierarchy is kept in memory until explicitly cleared. If so one should be able to get full context-sensitive insertion help equivalent to the parsing of a DTD by reading in an instance or instances which 'exercise' the entire DTD before opening a new instance.

Am I close here?

Cheers,

Michael Leventhal
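The hypothesis in the message above, that insertion help could be gleaned from instance structure alone, can be sketched roughly as follows. This is not XED's actual mechanism (the thread does not describe it); the function name glean_children and the sample documents are hypothetical, and Python's standard xml.etree.ElementTree stands in for whatever parser an editor would really use.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import DefaultDict, Iterable, Set


def glean_children(documents: Iterable[str]) -> DefaultDict[str, Set[str]]:
    """Map each element name to the child element names seen beneath it."""
    seen: DefaultDict[str, Set[str]] = defaultdict(set)
    for text in documents:
        root = ET.fromstring(text)
        for parent in root.iter():
            seen[parent.tag]  # register elements that never have children
            for child in parent:
                seen[parent.tag].add(child.tag)
    return seen


# Two hypothetical instances that together "exercise" more of the structure.
samples = [
    "<order><item><sku/></item></order>",
    "<order><note/><item><sku/><qty/></item></order>",
]
model = glean_children(samples)
print(sorted(model["item"]))   # candidates an editor could offer inside <item>
```

Such a table only records which children have been observed; it says nothing about order or repetition, so it is weaker than a parsed DTD unless the instances really do cover every case.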

From ricko at allette.com.au Mon Apr 27 20:51:50 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:52 2004
Subject: XML-Data, "&" and inheritance
Message-ID: <005201bd720d$c5d93650$840b4ccb@NT.JELLIFFE.COM.AU>

From: Paul Prescod

>In reviewing XML Data for another project, I note that the XML Data
>"subclass" mechanism depends on the XML-Data equivalent of the ampersand
>operator that was removed from XML. I'm not convinced that putting that
>operator back in was a good idea. It was left out of XML because it
>complicates implementation.

The ampersand operator also complicates processing, if you use some stream processing language like Perl. It means that all the contents of the element have to be read into memory, in the worst case, before they can be processed. The programmer cannot rely on the sequence of the input elements.

So developers who are serializing their databases are probably much better off to use fixed sequences (i.e., "," in XML content models) anyway, if they want their data to be processed by text processing applications.

But I think XML-data (or any successor) should be free to have any extensions to XML or SGML markup declarations. The more extras that XML-data (or whatever it becomes) can provide, the more reason to justify it.

But I would hope that any W3C XML-schema proposal, perhaps reconciling XML-data, RDF-schema and XML markup declarations, would include definite rules for translating between element type definitions using

* current markup declaration syntax,
* element syntax, for hypertexted declarations, and
* PIs, for arbitrary inline declarations.
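The stream-processing cost of the ampersand operator described earlier in this message can be sketched as follows. This is only an illustration under stated assumptions: child elements are reduced to bare names, the two function names are hypothetical, and the application is assumed to want its children handled in the order the content model lists them.

```python
from typing import Dict, Iterable, Iterator, List


def process_sequence(children: Iterable[str], model: List[str]) -> Iterator[str]:
    """Content model (a, b, c): handle each child the moment it arrives."""
    expected = iter(model)
    for name in children:
        if name != next(expected, None):
            raise ValueError(f"unexpected element {name!r}")
        yield name  # nothing is buffered
    if next(expected, None) is not None:
        raise ValueError("element ended before the sequence was complete")


def process_any_order(children: Iterable[str], model: List[str]) -> Iterator[str]:
    """Content model (a & b & c): buffer every child, then handle in model order."""
    buffered: Dict[str, str] = {}
    for name in children:
        if name not in model or name in buffered:
            raise ValueError(f"unexpected element {name!r}")
        buffered[name] = name  # a real processor would hold the whole subtree here
    missing = [name for name in model if name not in buffered]
    if missing:
        raise ValueError(f"missing elements: {missing}")
    for name in model:
        yield buffered[name]


MODEL = ["title", "author", "price"]  # hypothetical element names
print(list(process_sequence(["title", "author", "price"], MODEL)))
print(list(process_any_order(["price", "title", "author"], MODEL)))
```

With the fixed sequence the processor needs only a pointer into the model; with "&" it cannot emit anything until the last child has been seen, which is the worst case the message describes.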

Also, I cannot see why XML-data does not use regular expression syntax for specifying content models. I can see many good reasons for using elements for defining types, but not for using elements for each part of a content model. I note that the OmniMark program SGML2DTD, which converts DTDs into a form quite similar to XML-data converts a certain company's (XXX) version of DOCBOOK DTD from 178K to about 600K. This seems an enormous overhead. If this is so, then XML-data schemas (as they are now) are really only suitable for smaller DTDs (i.e., those belonging to databases), for web use.

In any case, a W3C XML-schema proposal should have clear mappings for converting to and from XML declarations. This will make more explicit all the wonderful benefits being gained.

Rick Jelliffe

From ak117 at freenet.carleton.ca Mon Apr 27 21:13:09 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:52 2004
Subject: SAX: 1.0beta _with_ JavaDoc
Message-ID: <199804271908.PAA00579@unready.microstar.com>

Thanks to Trong Nguyen for pointing out that my 1.0beta distribution accidentally omitted the generated JavaDoc documentation. It's in there now, if anyone needs it (same URL):

http://www.microstar.com/XML/SAX/New/

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ricko at allette.com.au Mon Apr 27 22:24:03 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:53 2004
Subject: Open Standards Processes
Message-ID: <00d901bd721a$a82bc980$840b4ccb@NT.JELLIFFE.COM.AU>

If an author writes a book based on drafts, they are taking a gamble. Those of us who bought "The Draft Standard C++ Library" by Plauger will understand this. I think a responsible author will delay publication until it is clear what the eventual standard will be--we delayed my book "The XML and SGML Cookbook" (now on the presses) for this reason, and I don't think that we should cry too much for authors who put out books based on drafts. (I talked to Sharon Adler at WWW7, and she gave me the impression that XSL would look almost nothing like the draft. Certainly she was insistent that it moved a lot beyond DSSSL.)

I think Jon Bosak's idea of allowing writers to monitor (if not participate in) groups is great. A technology like XML needs to be promoted if it will be popular.

People have been saying that the W3C process is not open. I was invited to join the SIG as an outside expert, and not as a spokesman for any group. This was exactly the same reason I was invited to participate in the ISO WG4, which looks after SGML. W3C's process is formally closed, but often, in practice, slightly open. ISO's practice, in my experience, is that it is formally open, but in practice slightly closed.

This is because you (i.e. the technical public) choose not to participate. You do not go to ANSI meetings, or whatever your national body is, at least not in sufficient numbers.
This means that ISO groups are largely populated with national representatives from large companies or specialist companies, with a smattering of independent consultants who use ISO to make sure they really understand their subject, and then a handful of special interest groups (American Mathematical Association, UK SGML Users' Group).

But ultimately the ISO WG4 vote comes down to national lines. So if US has 100 people they share 1 vote; Australia has 1 person (me), so I control 1 vote. Now this disparity acts to stop a total control of ISO standards by large corporations (in the recent Java standards dispute, Microsoft and Sun had it pointed out to them that it was not enough to win the US vote: if they did not participate in national standards bodies they could not expect to be held in high regard when speaking for or against standards: the attitude that seemed to come from some representations that non-US bodies simply rubberstamp ANSI was naive, if not repellent). But it means that, to some extent, even ISO standards can be skewed by individuals: this skew can only really operate when the individuals from the peripheral countries vote NO (we cannot force a bad technology to be standardized, but we can block what we think is a bad one.)

Of course, I consult with other SGML and XML stakeholders here. And anyone else is free to participate, if they have some minimum provable expertise, according to each national standards body's particular rules. Australia has a policy of trying to represent the legitimate needs of New Zealand (this is a formal arrangement) and our neighbours in the Asia/Pacific (an informal, good neighbour policy.)

(I note that the IETF procedure, to pick another "standards"-making process, has failed over many years to address the "selective ACK" problem in TCP/IP, which makes some kinds of Internet traffic between peripheral countries like Australia very difficult: this problem would have been quickly put high on the agenda if ISO procedures were followed. People outside the US should be very suspicious of "standards" bodies which do not have a guaranteed nation-based review policy: we will end up with "center-periphery" technology rather than "world wide" technology. Now that Fuji-Xerox and Keio are participating in W3C, there is a little bit of this in W3C procedures now, but perhaps the center has just expanded a little.)

Now that ISO WG4 has started to conduct business by email more, it will increasingly look like the XML SIG. The W3C might be a benevolent dictatorship, but really I think it is a mistake to believe that ISO WG4 is much more open than W3C was: especially since almost all the active participants in WG4 are also on the XML SIG or even one of the Working Groups. In both cases, they can only use whatever experts are available. For markup languages, there are not a real lot. And I (and I am sure many other newcomers to the XML SIG) found that many issues being discussed are so complicated that it would hinder progress if issues had to be re-discussed every time a new person came along: to be actively involved requires that you study up on what has already been done, as much as possible.

The big issue is not "openness" in my mind, but "anti-hijacking-ness". Is there some way which a large player could, by legitimate means, create a form of XML/SGML which, in practice, they controlled? This would bring back proprietary data formats. This is of course the alarm bell that rings dully with XML-data, but it also tolls for RDF. Any standard which does not specify enough will get proprietary extensions to fill the gaps.
So "openness" in the standards-making process does not guarantee openness in the eventual standard.

XML is not a standard. It is a specification--of a subset of SGML, with some bells and whistles. Any standards-*proposing* process cannot be democratic or fully open: communication, language and factionality will prevent that. So a W3C technology, or a Sun one, is IMHO just as good a candidate to be proposed as a standard as a technology dreamed up by ISO committee representatives.

But where ISO has an advantage is that the standards-*voting* procedure is more open: most national standards bodies are democratic or consensual, and the variety of their membership, and the ease with which representatives of small companies and individuals can participate (if they bothered) does give ISO a credibility which W3C does not have (and does not pretend to have, as far as I have seen.)

Rick Jelliffe

From peter at ursus.demon.co.uk Mon Apr 27 22:58:13 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:53 2004
Subject: Advice on XML editor development
In-Reply-To: <354c7f30.592430000@tara.unl.ac.uk>
References: <3.0.1.16.19980427123854.0d5f0ccc@pop3.demon.co.uk> <3.0.1.16.19980427123854.0d5f0ccc@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980427203725.218f36a4@pop3.demon.co.uk>

At 14:16 27/04/98 +0000, Matt Mower wrote:
[...]
>I'm not familiar enough with the terminology here. I'll tell you what I
>think the difference is. A tree based editor is something like XMLPro
>which displays a document as a hierarchy of objects, a stream based
>editor is something like composer where you type free-text (albeit in a
>structured manner).

Yes - unless someone corrects me :-)

>If the above is accurate then my intention is for this to be a
>tree-based editor. With DTD elements acting as "objects" that can be
>manipulated sensibly.

Fine. The exciting thing about XML is that this is impossible in HTML and so the tree-based philosophy opens up completely new worlds.

>
>> It should support ID/IDREF and xml:link in a graphical manner (e.g.
>>dragon-drop)
>>
>
>I'm not sure what this.

An IDREF attribute points to a unique ID attribute in the same document. Essentially a link. Look at the source of rec.xml - some examples there. xml:link allows hyperlinks. How are you going to introduce them into the document?

>
[...]
>>Note that there is a lot of experience in the SGML (sic) community about
>>creating editors and it's worth talking to experienced people and getting
>>demos to find out exactly what the range of things you really want is.
>>>
>
>Can you suggest a forum in which I could raise this issue?

Here. I have :-). Creating an XML editor is absolutely mainstream for XML-DEV. The point is that many people reading this list will have experience.

[...]
>
>I have reliability concerns about swing. In principle though I do agree
>with you.

I have bugs. If that means the same thing...

P.
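The ID/IDREF relationship explained in the message above (an IDREF must point at a unique ID in the same document) can be checked with a few lines of code. A rough sketch, assuming Python's standard xml.etree.ElementTree: because that parser does not read attribute types from a DTD, the attribute names "id" and "ref" and the function name check_links are hypothetical stand-ins for whatever a real DTD declares as ID and IDREF.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def check_links(text: str, id_attr: str = "id",
                ref_attr: str = "ref") -> Tuple[List[str], List[str]]:
    """Return (duplicate IDs, references that point at no ID)."""
    root = ET.fromstring(text)
    ids = [e.get(id_attr) for e in root.iter() if e.get(id_attr) is not None]
    # An ID value must be unique within the document.
    duplicates = sorted({value for value in ids if ids.count(value) > 1})
    known = set(ids)
    # An IDREF value must match some ID in the same document.
    dangling = [e.get(ref_attr) for e in root.iter()
                if e.get(ref_attr) is not None and e.get(ref_attr) not in known]
    return duplicates, dangling


doc = '<doc><sec id="s1"/><sec id="s2"/><xref ref="s1"/><xref ref="s9"/></doc>'
duplicates, dangling = check_links(doc)
print(duplicates, dangling)
```

An editor that offers links graphically would presumably run a check like this whenever an element carrying an ID is deleted or renamed.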

From peter at ursus.demon.co.uk Mon Apr 27 23:06:54 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:53 2004
Subject: Advice on XML editor development
In-Reply-To: <002001bd71e1$23a2e420$d46118cb@caleb>
References: <3.0.1.16.19980427123854.0d5f0ccc@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980427203651.218f38d2@pop3.demon.co.uk>

At 21:30 27/04/98 +0800, James K. Tauber wrote:
>
>What I am envisaging is a generic editor that given an editing sheet for
>MathML would become an equation editor or given an editing sheet CML would
>enable editing of molecular structure diagrams. Of course, such editing
>sheets would include code but that code would presumably have a lot of
>overlap with the code attached to XSL stylesheets for display. I would
>imagine Java Classes, for example with both display and editing interfaces
>for this purpose.

I have thought about this a lot and have been working on about the third version for CML support. Say we are working at a node-centered level - e.g. the DOM/JUMBO or whatever knows about ... but there is no point in either a tree- or eventstream display or editing of the children. We click on and might expect the following types of functionality:

- display() // to anywhere , probably a new JFrame
- getDisplayComponent(boolean editable) // returns a JComponent which
  // can be embedded in a TabbedPane, Table, etc.
- highlight(String foreignAddress); highlights a subcomponent of the object (e.g. an atom in a molecule)
- drawToGraphics(Graphics g, Scaler s); // draw onto an existing graphics so that many objects can be rendered. Scaler is a GKS-like scaling (I expect Java2D will supersede that)

This could be very general - it would allow a molecule and a maths eqn to be drawn to the same surface, either to be edited, parts of them to be highlighted, etc.

I am reasonably confident this will work - if others are interested in following this discussion I'd be delighted.

P.

>
>Is anyone working on this sort of thing?
>
>James
>
>--
>James Tauber / jtauber@jtauber.com
>Perth, Western Australia
>XML Pages: http://www.jtauber.com/xml/
>
>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Mon Apr 27 23:09:59 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:00:53 2004
Subject: And now what? (was Re: Inheritance in XML)
In-Reply-To: <003001bd71f2$bb39e5c0$d46118cb@caleb>
References: <01bd71ef$5916c9b0$020b0ac0@xerius>
Message-ID: <3.0.1.16.19980427203712.218f46c2@pop3.demon.co.uk>

At 23:39 27/04/98 +0800, James K. Tauber wrote:
[...]
>
>There is an appallingly simple taxonomy at my site and no direct links to
>the DTDs themselves (although I plan to change that this week). I'd like to
>spin off that part of my XML site and have registered a domain name for that
>purpose (although there isn't a site there yet).
>
>Let's do this!

I have a very warm glow this evening - lots of ideas and offers for implementation are starting to bubble up. Several people have mailed about the editor discussion - there's no doubt that communally we have a great deal of complementary experience. Similarly here. I get the impression that the technology and the critical mass of enthusiasts are starting to converge. We also have the experience of working in distributed projects.

Perhaps we should try to pull this together in a week or two - allow more postings first.

P.

From andrewl at microsoft.com Tue Apr 28 00:31:08 1998
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:00:53 2004
Subject: XML-Data, "&" and inheritance
Message-ID: <5BF896CAFE8DD111812400805F1991F701C910E9@red-msg-08.dns.microsoft.com>

Paul Prescod wrote "In reviewing XML Data for another project, I note that the XML Data "subclass" mechanism depends on the XML-Data equivalent of the ampersand operator that was removed from XML. I'm not convinced that putting that operator back in was a good idea. It was left out of XML because it complicates implementation."

This is a valid criticism, and we need to think about it more. The motivating factor for including the ampersand operator in XML-Data is the significant number of customers who have asked for it.
In discussing DTDs with me and others, they showed examples like the following:

When I've asked what this construction means, they said, in effect "What I mean is that the elements can occur in any order, but there isn't any good way to say that in XML DTDs."

I suspect a number of other people have seen similar examples. I know we could argue that people should not allow variation in element order, but customers have adamantly stated that they sometimes want forgiving sequence. For example, RDF specifies that order does not matter for RDF data.

So we have a funny situation in XML in which we've tried to make processing easier by forbidding certain things in the DTD, but the result is that people either avoid DTDs altogether or write bogus DTDs that don't fully describe the real syntax. That is, we've simplified the implementation by being unable to express the intended syntax.

Anyway, I wanted to add to the discussion some of the factors that led the authors of XML-Data to propose including the ampersand operator.

From cfranks at microsoft.com Tue Apr 28 00:39:58 1998
From: cfranks at microsoft.com (Charles Frankston)
Date: Mon Jun 7 17:00:53 2004
Subject: XML-Data, "&" and inheritance
Message-ID:

Just to take issue with some of what Rick Jelliffe said about XML-Data's schema syntax:

> -----Original Message-----
> From: Rick Jelliffe [mailto:ricko@allette.com.au]
> Sent: Monday, April 27, 1998 11:53 AM
> To: xml-dev
> Subject: Re: XML-Data, "&" and inheritance
> Also, I cannot see why XML-data does not use regular expression syntax
> for specifying content models.
> I can see many good reasons for using
> elements for defining types, but not for using elements for each part of
> a content model. I note that the OmniMark program SGML2DTD, which
> converts DTDs into a form quite similar to XML-data converts a
> certain company's (XXX) version of DOCBOOK DTD from 178K to
> about 600K. This seems an enormous overhead. If this is so, then
> XML-data schemas (as they are now) are really only suitable for smaller
> DTDs (i.e., those belonging to databases), for web use.
>

I think there are good reasons to not use regular expression syntax:

1. It is not easily read by those who have not been working with it for many years. I think millions of HTML authors can figure out XML-Data's verbose syntax much more quickly than they can learn regular expressions. I think this makes the verbosity worthwhile.

2. I can use the same XML tools to deal with the XML-Data schema documents as I use to deal with my other XML documents. If regular expressions were used for even part of the syntax, this wouldn't be the case, and I'd need a regular expression parser.

3. A large schema built from scratch in XML-Data should be able to save a lot of space by using inheritance to avoid copying large sections of similar schema information.

From jtauber at jtauber.com Tue Apr 28 02:51:28 1998
From: jtauber at jtauber.com (James K. Tauber)
Date: Mon Jun 7 17:00:53 2004
Subject: And now what?
 (was Re: Inheritance in XML)
In-Reply-To: <3.0.1.16.19980427203712.218f46c2@pop3.demon.co.uk>
Message-ID: <000c01bd723f$01eda6e0$d46118cb@caleb>

RE: A DTD Repository

[Peter Murray-Rust]
> Perhaps we should try to pull this together in a week or two
> - allow more postings first.

Sounds good. I'll keep the domain name warm :-)

Here's what I'll do in the next week:

- add DTDs I haven't already got
- add links to DTDs from my site
- mirror DTDs locally where permitted
- put together a quick site to point the domain to

Here's what I'd like to hear from people about:

- DTDs that aren't in Robin Cover's or my list
- Metadata requirements for DTDs including:
  * versioning
  * dependency
  * classification taxonomies
  * additional documentation on meaning of element types (semantics :-)

James
--
James Tauber / jtauber@jtauber.com
Perth, Western Australia
XML Pages: http://www.jtauber.com/xml/

From cbullard at hiwaay.net Tue Apr 28 04:03:22 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:00:53 2004
Subject: Open Standards Processes (WAS Re: Nesting XML based languages and scripting languages)
References: <01bd6f60$4e4db160$d8addccf@uspppBckman> <199804241122.HAA00254@unready.microstar.com> <3541236C.FDF@hiwaay.net> <199804270103.UAA02407@bruno.techno.com>
Message-ID: <35453884.5DEC@hiwaay.net>

Steven R. Newcomb wrote:
>
> As for XML, I am inclined to withhold all comment except praise for
> the way the XML thing has been handled. It's really quite an
> extraordinary example of "recruited public cooperation", and it
> reflects very well on both the recruiters and the participants.

Yes.
It meets the need and will serve.

> If all we can see is the political structure that allowed this XML
> crystallization to occur, we can't see what really happened. It's so
> much more than that, the political structure pales to insignificance.

Yes. But the structure is important and so are the rules. OTW, it is an Orwellian future of consortia. As the twig is bent...

> In the XML matter, I find particularly praiseworthy the care that has
> been taken, by many on all sides, to keep XML and its ISO bases in
> harmony with one another. You Know Who You Are, and regardless of how
> things turn out, may future generations bless you for your hard work
> and thankless commitment to the longterm success of humanity's
> Civilization Experiment. I'm pleasantly astonished to discover that,
> as near as I can tell, we're pretty much all fighting on the same side
> here. We might as well recognize this fact for the miracle that it
> is, and take maximum advantage of it.

Yes. Still, to tend the future we have to understand the past and never forget that many worked long to carve the paths where there now are roads. By example were we taught, and by example led.

> Thanks, Len, for making me pinch myself hard enough to realize that,
> after our years in the wilderness together, things are really pretty
> damn good and getting rapidly better.

And to you, my good friend, whom I have seen achieve much and sacrifice much. Some may bathe in a brighter light for that is the product of a city, but I will always know it was my distinct and undeserved privilege to watch you, your family, and your friend Dr. Goldfarb lead with a torch where once there were no paths, but only that wilderness to cross.

Now, there remain the songs.

len
From terje at in-progress.com Tue Apr 28 04:29:47 1998
From: terje at in-progress.com (terje@in-progress.com)
Date: Mon Jun 7 17:00:53 2004
Subject: RELEASE: Interaction 2.0 Server Side XML
Message-ID:

MEDIA DESIGN IN*PROGRESS RELEASES INTERACTION 2.0: DYNAMIC ADAPTIVE WEBSITES WITH SERVER-SIDE XML

San Diego, CA, April 27, 1998: Media Design in*Progress releases the 2.0 version of Interaction, the web server companion for dynamic social websites that adapt to the visitor. Interaction was the first application to take advantage of server-side XML for dynamic websites. The software works with Mac OS Web servers, generating HTML on the fly from XML documents. An evaluation copy can be downloaded from: http://interaction.in-progress.com

Interaction allows professional webmasters to benefit from the efficiency and flexibility of extensible markup for developing and maintaining today's advanced websites. The application provides wizards to create, inspect and modify extensible markup, liberating the author from detailed knowledge of the XML specification.

Using Interaction, webmasters and authors can create and use their own custom XML elements, XML entities and specialized XML document types when constructing the websites. Interaction generates HTML pages at serving time from XML documents, based on the processing rules of a style sheet and the context of the request. The result is websites that adapt to each visitor without requiring scripting, programming or proprietary "command tag" constructs.

The integrated Cascading Style Sheets (CSS) editor simplifies designing and maintaining a consistent presentation of the website.
Interaction can optionally emulate CSS to keep the sites accessible and appealing for those using older browsers.

Interaction 2.0 is priced at $795, and ships on CD-ROM with printed documentation, telephone- and priority email support, and many components including web-based forums, chat rooms, and shopping carts. Personal licenses are priced at $245, and include the on-line distribution with up-and-running support. Upgrades to version 2.0 are free for registered users of Interaction.

Customer contact: +1(619)437-0664; sales@in-progress.com.

--
Terje | Media Design in*Progress
C a s c a d e... a comprehensive Cascading Style Sheets editor for Mac
XPublish - for efficient website publishing with XML
Make your Web Site a Social Place with Interaction!
Check out our web tools at

From greyno at mcs.com Tue Apr 28 05:11:53 1998
From: greyno at mcs.com (Gregg Reynolds)
Date: Mon Jun 7 17:00:53 2004
Subject: Open Standards Processes
References: <00d901bd721a$a82bc980$840b4ccb@NT.JELLIFFE.COM.AU>
Message-ID: <35453B55.6E7@mcs.com>

Rick Jelliffe wrote:
>
> People have been saying that the W3C process is not open.
> I was invited to join the SIG as an outside expert, and not as a
> spokesman for any group.
> . . . they can only use whatever experts are available.
> For markup languages, there are not a real lot.
> And I (and I am sure
> many other newcomers to the XML SIG) found that many issues
> being discussed are so complicated that it would hinder progress
> if issues had to be re-discussed every time a new person came along:
> to be actively involved requires that you study up on what has already
> been done, as much as possible.
>

I can't help wondering if the standards under discussion don't differ qualitatively from your garden-variety standards like how big and yellow a banana must be to cross a border. Standards like XML and her sisters may well be a flash in the pan, to be replaced by the Next Great Thing in a decade or so; or, as I think more likely, they may well have an impact and longevity more akin to Gutenberg's printing press. Absolutely impossible to predict what cultural and political consequences will follow, once the huge investments have been made embedding these standards into our material (electronic? cyber? information?) culture. The experts involved do indeed possess very impressive technical expertise, but perhaps it's too important to leave to the experts.

OK, so it's not exactly human cloning; but I have this nagging suspicion that 103 years from now the political scientists and historians will be writing about how the Web (in spite of honorable intentions of its Western designers) managed to serve, without anybody noticing, as yet another subtle instrument of Western/Northern domination rather than as the liberating force so many hope it will be. I hope this doesn't sound too alarmist; but consider the cultural impact of a monopolistic operating system company in the first era of widespread personal computing. People on this list don't need reminding, but the vast majority (at least in the US) have never even learned that there are other ways to compute.

If I've misunderstood something I hope somebody will correct me, but if I'm not mistaken pretty much everybody involved is from the "developed" world, mostly the West.
This observation is not to be construed as a slam against the W3C or the people involved, whom I respect a great deal. The W3C has no doubt made excellent good-faith efforts to internationalize the standard; but is there any input from, say, an Indian librarian? An Egyptian computer scientist? An Ugandan Web-site operator? Has the W3C made an effort to seek out qualified professionals from "the South"? I don't see how it's possible for a truly "world"-wide-web to happen without such input.

Case in point: in spite of the excellent work that has been done to extend support to non-European languages, none of the current standards, as I understand them, properly support right-to-left writing systems. They may support *content* in Arabic (or Farsi or Hebrew), but the many 10s of millions of people who use the Arabic writing system need to be able to access all aspects of computation in their native languages. It's fine to be able to operate *on* another language; what's needed is the ability to operate *in* that language. Then "they" will not be dependent on "us" for their software. I understand it's no easy matter to rewrite gcc to support C programs written entirely in Urdu, but XML (and XSL and etc) is another matter. It's entirely reasonable (IMO) to write the spec in a way that supports multiple writing systems.

Well, if you've read this far, thanks for indulging me. To tell you the truth, the whole reason I got involved in SGML etc is because I wanted to have hypertext versions of the great classics of Arabic literature. Now the possibilities of these technologies are so intoxicating that I get a little exercised at the thought of my 2nd favorite language being passed by. And isn't the thought of contributing to a profound and widespread expansion of freedom more exciting than the prospect of making a few bucks?
From lisarein at finetuning.com Tue Apr 28 05:38:20 1998
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun 7 17:00:53 2004
Subject: And now what? (was Re: Inheritance in XML)
References: <000c01bd723f$01eda6e0$d46118cb@caleb>
Message-ID: <354554CD.BD147086@finetuning.com>

Actually, last week, I was talking to a lot of people about the idea, and Dave Winer liked the idea so much he put one up on his site (like last week):

http://betty.userland.com/dtd/default.wsf

But so far nobody's bit. But he did it first -- I'm not sure he knows why -- but he took my word for it. And his site gets pretty good traffic....

So I'm trying to write up an intro to DTDs or something to reinforce the concept. If one or two of you would send one in -- even though I'm pretty sure it's set up in such a way that it is perhaps of no use to you. I hope you guys understand why.... DTD enthusiasm is nothing to be taken lightly!

thanks!

lisa
From Jon.Bosak at eng.Sun.COM Tue Apr 28 05:56:58 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:00:53 2004
Subject: Inheritance in XML
In-Reply-To: <000e01bd70dc$411122c0$dd6118cb@caleb> (jtauber@jtauber.com)
Message-ID: <199804280354.UAA21580@boethius.eng.sun.com>

[James Tauber:]

| Perhaps you misunderstand what I mean by "behaviour". I am referring
| to how the content will be displayed and how it will respond to events
| like clicking.

I am, too. I am saying that at the machine level, behavior (or appearance) *is* meaning. The meaning of something is the way it makes the machine behave. That's all there is, there ain't no more.

| All I am trying to say is that the specification that PatientNames are
| to be displayed in blue and are to bring up that patient's record when
| clicked on is not the same thing as semantics and we shouldn't call it
| semantics.

But that further level of meaning you're reaching for is meaning for humans, not meaning for the machine. And right now and for the foreseeable future, that level of meaning can be conveyed only by natural language. The meaning of "PatientName" to a machine is the set of behaviors it is supposed to exhibit when presented with something identified as such. The meaning of "PatientName" to you and me is something that involves a knowledge of what a patient is and what a name is, and (more to the point) why and under what circumstances we should care what happens to the person with that name, and (even more to the point) what the relationship is between all these symbols and the actually existent physical object they refer to.
We can say all this in prose, but not in a way that will be interpretable by machines for a long time.

Jon

From lisarein at finetuning.com Tue Apr 28 10:38:05 1998
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun 7 17:00:53 2004
Subject: Open Standards Processes
References: <00d901bd721a$a82bc980$840b4ccb@NT.JELLIFFE.COM.AU> <35453B55.6E7@mcs.com>
Message-ID: <35459B09.E3A7464B@finetuning.com>

> And isn't the thought of contributing to a profound and
> widespread expansion of freedom more exciting than the prospect of
> making a few bucks?
>

well that's a no-brainer

You bet it is. And I can only speak for myself, but I'm starvin on freedom -- but it just tastes better and better all the time!

lisa

From ricko at allette.com.au Tue Apr 28 10:39:26 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:53 2004
Subject: i18n (was Re: Open Standards Processes)
Message-ID: <006e01bd7281$466bf5f0$7f0b4ccb@NT.JELLIFFE.COM.AU>

From: Gregg Reynolds

>If I've misunderstood something I hope somebody will correct me, but if
>I'm not mistaken pretty much everybody involved is from the
>"developed" world, mostly the West.
Because to participate means there must be the leisure or finance to do so, and there must be the technological background to do so, and there must be the techno-cultural self-awareness to do so. All these are attributes of the center (or North, or West, whatever you call it.) I asked several Thais when line breaks could occur, for example. The best answer I got was "when it is beautiful".

(Actually, in the particular case of Thai and the Indic script languages, I would imagine there will be a great increase in knowledge because of James Clark's interest in the region. Exploration is always done by outsiders.)

> The W3C has no doubt made excellent good-faith
>efforts to internationalize the standard; but is there any input from,
>say, an Indian librarian? An Egyptian computer scientist? An Ugandan
>Web-site operator? Has the W3C made an effort to seek out qualified
>professionals from "the South"? I don't see how it's possible for a
>truly "world"-wide-web to happen without such input.

I introduced W3C's Bert Bos to my current boss, a Jordanian with Arabic i18n (internationalization) experience, at the WWW7 conference. Bos said that there was currently no input from Arabic people: no-one (or perhaps none with sufficient credentials) had come forward.

The driving force behind W3C i18n, as was clear at the developer's day session, is the need to support the needs of advertisers better. The Web is not a library, it is a TV network posing as a library. So i18n efforts through W3C will be prioritized by market value: Europe, then CJK, then anything else that is easy.

If you are concerned about this, the best approach is to ask them exactly what they need: I have found an enormous goodwill to the idea of thorough-going i18n at W3C. Their problem is that they cannot devote resources to finding out what is needed. So make up a nice couple of pages of solutions to real problems that you see, and send it off to Martin Duerst, Jon Bosak and Bert Bos.
I am sure they would be delighted for all input: they are gathering information for CSS3 and XSL.

When I started looking at "native language markup" it is interesting that the only opposition I got, outside Americans, was from Indians. I think that was because all educated Indians speak English, so if someone uses a computer they are not held back by English markup. Also, markup in a foreign language is very visually distinct. But I cannot agree with them: enumerations in attributes are really a kind of data: so even if an Indian DTD can get away with English element type names, other kinds of names will need an extended range of characters available.

> I understand it's no easy matter
>to rewrite gcc to support c programs written entirely in Urdu, but XML
>(and XSL and etc) is another matter. It's entirely reasonable (IMO) to
>write the spec in a way that supports multiple writing systems.

SGML made it an explicit goal "there should be no national language dependencies". XML has improved on this, adopting ISO 10646 Universal Character Set (Unicode) and predefining xml:lang for every element type. (I wish they had also predefined xml:script too, but users can do that if they need it.)

SGML seems to have spearheaded an awareness of this at ISO. The new guidelines for programming languages mandate language neutrality, and some way of encoding ISO 10646 characters into 8 bit strings is being retrofitted onto most standards. Making UTF-8 the encoding used in 8-bit strings seems the least cost method, if your software is 8-bit clean.

Rick Jelliffe

PS Since you are particularly interested in Arabic, you may be interested that in my book "The XML and SGML Cookbook", which comes out next month, there is an index in which you can look up the XML numeric character codes for all the Arabic characters available in XML, and also a CD-ROM which has some Arabic entity sets.
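
The two encoding routes mentioned in the message above (XML numeric character references and UTF-8 in 8-bit strings) can be illustrated with a short sketch. The helper name `to_char_refs` and the sample word are assumptions chosen for this illustration; they do not come from the thread, and the sketch does not escape markup characters such as `&` or `<`.

```python
def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with an XML hexadecimal
    numeric character reference (&#xNNNN;); ASCII passes through."""
    return "".join(
        ch if ord(ch) < 128 else "&#x{:04X};".format(ord(ch))
        for ch in text
    )


# Sample word (an assumption for this sketch): "kitab", Arabic for "book",
# written with escapes so this source file itself stays pure ASCII.
word = "\u0643\u062A\u0627\u0628"

refs = to_char_refs(word)      # markup that survives a 7-bit ASCII channel
utf8 = word.encode("utf-8")    # the same four characters as an 8-bit string

print(refs)
print(len(utf8), "bytes in UTF-8")
```

Either form carries the same ISO 10646 characters: the first needs only ASCII in the document text, while the second needs software that is 8-bit clean.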
From ricko at allette.com.au Tue Apr 28 10:43:03 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:00:54 2004
Subject: XML-Data, "&" and inheritance
Message-ID: <008301bd7281$d36919b0$7f0b4ccb@NT.JELLIFFE.COM.AU>

From: Charles Frankston

>I think there are good reasons to not use regular expression syntax:
>
>1. It is not easily read by those who have not been working with it for many
>years. I think millions of HTML authors can figure out XML-Data's verbose
>syntax much more quickly than they can learn regular expressions. I think
>this makes the verbosity worthwhile.

Is XML-data targeted at HTML authors? I thought it was targeted at database people. I would be surprised if database people do not understand regular expressions: nowadays most professionals have been through some formal training.

>2. I can use the same XML tools to deal with the XML-Data schema documents
>as I use to deal with my other XML documents. If regular expressions were
>used for even part of the syntax, this wouldn't be the case, and I'd need a
>regular expression parser.

An XML parser has a regular expression parser already: or at least a content-model parser. So the difference is not in the parsing but what becomes of the parsed information: for example, the API could emit XML-data events, but the document could be transmitted using XML content models.

>3. A large schema built from scratch in XML-Data should be able to save a
>lot of space by using inheritance to avoid copying large sections of similar
>schema information.
The DTD figures I gave already used a lot of parameter entities, which the SGML2DTD program retains: the bloat is not an artifact of macro expansion. So my figures (178 K to 600K expansion for a version of DOCBOOK) already include a lot of compression. The simple fact is that going from "" to "pq" will bloat out the schema. Which is why I think "(p, q*)" would be much better.

But rather than just assertions, I would be interested to see XML-data used for a realistic, real-life DTD like DOCBOOK. I have evidence that a similar system bloats out in a way that I think would be unacceptable for the web: the best way to prove me wrong would be for the XML-data people to actually mark up the DOCBOOK DTD in XML-data, trying to use inheritance. I challenge them to do this, in fact.

I predict what would happen is that the result would be large and bloated. I don't think there is much inheritance to be found in DOCBOOK's structures. I would predict that the XML-data proponents would then say (fairly) that the problem is that DOCBOOK DTD was not based on an analysis which exposes inheritance.

So to cut directly to that, I think the problem is to think that using inheritance can be a way of simplifying existing DTDs. Parameter entities allow a certain level of compression, and are widely used. But they are certainly far from an inheritance mechanism. Consequently DTD designers do not tend to make specialized versions of general structures, even when it would be desirable: people will have a single table model, rather than one kind of table which must include a figure, one kind which must have 3 columns, one kind which can include footnotes and one kind which cannot contain footnotes.

Adding an inheritance mechanism will not tend to simplify any existing DTDs, rather it will make specifying richer, more exact DTDs more tractable and doable. Where DOCBOOK has one (or two) types of tables, there would be more types if inheritance could be used.
So even if XML-data became as concise as XML in specifying the base structures, a schema with specialized structures using inheritance must be bigger. At the moment, DTDs tend only to have these base structures. Experience shows that for text it is easily possible for a DTD to require hundreds of element types, and that is when general structures are used.

I think the XML "terseness is not of major importance" goal should be at the bottom of the list (and possibly off the list) as far as an XML-schema proposal goes. If I have a 10K document, I do not want to have to ship out a 600K schema. The reason that XML allows no markup declarations in the first place is that even a 200K schema is too much for lots of uses.

The better approach to this problem is to have as terse a schema syntax as we can: regular expressions provide a great model here: every computer science student has studied them, everyone in document processing knows them, everyone who has used wildcards in Web searches knows the idea. Then use some hypertext convention by which the schema can be held remotely and only the particular relevant definitions can be requested as they are needed: a linking system from element type names (etc) which uses some simple defaulting convention. The document is kept as small as possible (preferably the same size as the DTD-less document) and the schema can be made as elaborate and grand as desired.

The Web is based on going from the idea of just plonking in great blobs of text wherever they are needed to having smart links to navigate to the exact resource needed, as it is needed. XML-data as currently formulated is a step back into this pre-hypertext mentality. This compounds the problem of its verbosity. It would be best to deal with this as a hypertext problem, but otherwise at least use regular expression syntax for content models to reduce the verbosity.

Rick Jelliffe
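
The claim in the message above that a content model is in effect a regular expression over child element names can be sketched briefly. The function names and the space-terminated encoding of child names below are assumptions made for this illustration; they are not taken from XML, XML-Data, or any parser discussed in this thread, and the sketch ignores #PCDATA, mixed content and the SGML "&" connector.

```python
import re


def content_model_to_regex(model: str) -> re.Pattern:
    """Translate a DTD-style content model such as "(p, q*)" into a regex.

    Each element name becomes a group that matches the name followed by
    one space, so a list of children can be tested as a single string.
    Only ',', '|', '?', '*', '+' and parentheses are handled.
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[(),|?*+]", model):
        if token == ",":
            continue                 # a sequence is plain concatenation
        if token in "()|?*+":
            parts.append(token)      # same meaning in both notations
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def is_valid(model: str, children: list) -> bool:
    """Return True if the child element names satisfy the content model."""
    sequence = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(sequence) is not None


print(is_valid("(p, q*)", ["p", "q", "q"]))   # one p followed by any number of q
print(is_valid("(p, q*)", ["q", "p"]))        # wrong order
```

The translation is little more than dropping the commas, which is one way to read the remark that an XML parser already contains a content-model parser.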
From francis at redrice.com Tue Apr 28 11:15:26 1998
From: francis at redrice.com (francis)
Date: Mon Jun 7 17:00:54 2004
Subject: Advice on XML editor development
References: <3.0.1.16.19980427123854.0d5f0ccc@pop3.demon.co.uk> <3.0.1.16.19980427203651.218f38d2@pop3.demon.co.uk>
Message-ID: <35459DDB.1FB8C742@redrice.com>

Peter Murray-Rust wrote:
>
> At 21:30 27/04/98 +0800, James K. Tauber wrote:
> >
> >What I am envisaging is a generic editor that given an editing sheet for
> >MathML would become an equation editor or given an editing sheet CML would
> >enable editing of molecular structure diagrams. Of course, such editing
> >sheets would include code but that code would presumably have a lot of
> >overlap with the code attached to XSL stylesheets for display. I would
> >imagine Java Classes, for example with both display and editing interfaces
> >for this purpose.
>
> I have thought about this a lot and have been working on about the third
> version for CML support. Say we are working at a node-centered level - e.g.
> the DOM/JUMBO or whatever knows about ... but there is no point
> in either a tree- or eventstream display or editing of the children. We
> click on and might expect the following types of functionality:
> - display() // to anywhere, probably a new JFrame
> - getDisplayComponent(boolean editable) // returns a JComponent which
> // can be embedded in a TabbedPane, Table, etc.
> - highlight(String foreignAddress); highlights a subcomponent of the
> object (e.g. an atom in a molecule)
> - drawToGraphics(Graphics g, Scaler s);
> // draw onto an existing graphics so that many objects can be rendered.
> Scaler is a GKS-like scaling (I expect Java2D will supersede that) > > This could be very general - it would allow a molecule and a maths eqn to > be draw to the same surface, either to be edited, parts of them to be > highlighted, etc. > I am reasonably confident this will work - if others are interested in > following this discussion I'd be delighted. > Yes, I'm interested in following this discussion. I'm wrapping web and client-server transactions from an existing toolset in XML, and want to build a tool for editing interfaces which map XML requests to XML sources, which may well have different DTDs. This isn't precisely what you're talking about, I know, but one point of contact, at least, is the component-like, three-phase structure: if I want to re-use these interfaces then I need to be able to (1) build the interface, which involves using the DTD of the source (input) XML; (2) re-use the interface, which involves publishing the target (output) XML DTD so that it can be used as another component's input; and (3) actually execute the interface, reading stuff in the input XML format and generating stuff in the output format. I don't have an SGML background so my head's been hurting a bit as I try to evaluate the options: xml-link/xml-pointer vs namepsaces vs architectures vs XSL vs XML-Data vs ... even the range of different ways of referring to XML elements is slightly numbing! I need to start coding the editor soon - having read David's architectures for XML proposal I'd like to look into that, because I suspect it may allow me to do the whole thing declaratively (of course this may simply show how little I understand it), otherwise I suspect I'll be building a micro-meta DTD (for publishing what input and output formats an interface supports) so that I can start building the editor while encapsulating, and delaying till I understand it better, the actual mapping design and technology. Oh, I'm also London based. 
If anyone's around who doesn't want to go as far as Paris to be able to talk XML without people's eyes rolling, I'd be on for a drink some time... Cheers - Francis. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Tue Apr 28 11:54:41 1998 From: jtauber at jtauber.com (James K. Tauber) Date: Mon Jun 7 17:00:54 2004 Subject: Inheritance in XML In-Reply-To: <199804280354.UAA21580@boethius.eng.sun.com> Message-ID: <000101bd728a$e24a10c0$bf6118cb@caleb> > [James Tauber:] > > | Perhaps you misunderstand what I mean by "behaviour". I am referring > | to how the content will be display and how it will respond to events > | like clicking. > > I am, too. I am saying that at the machine level, behavior (or > appearance) *is* meaning. The meaning of something is the way it > makes the machine behave. That's all there is, there ain't no more. So why not call it behaviour? > But that further level of meaning you're reaching for is meaning for > humans, not meaning for the machine. And right now and for the > forseeable future, that level of meaning can be conveyed only by > natural language. Regardless of how it is expressed, that is what I mean by semantics. > The meaning of "PatientName" to a machine is the > set of behaviors it is supposed to exhibit when presented with > something identified as such. 
The meaning of "PatientName" to you and > me is something that involves a knowledge of what a patient is and > what a name is, and (more to the point) why and under what > circumstances we should care what happens to the person with that > name, and (even more to the point) what the relationship is between > all these symbols and the actually existent physical object they refer > to. We can say all this in prose, but not in a way that will be > interpretable by machines for a long time. I dont' think whether or not it can be machine representable has anything to do with it. It sounds like you are saying: my word 'semantics' = meaning to a human (place in ontology, etc) my word 'behaviour' = meaning to a machine (appearance, etc) ALL I AM ASKING FOR is a distinction between the two. The word semantics seems the best to apply to the former. Behaviour is my word for the latter simply because style/appearance doesn't adequately cover event handling. I think we agree that a distinction exists. I just happen to think semantics is a perfectly good word for describing the first but not necessarily the second. Reader (to author): what does it mean when something is in blue? Author: it means it's a PatientName. Reader: what is a PatientName? Author: it means something in blue. For those of you who haven't already guessed, I'm a linguist by training. Perhaps I'm trying to project too much of a formal linguistic view on this matter. James -- James Tauber / jtauber@jtauber.com Perth, Western Australia XML Pages: http://www.jtauber.com/xml/ xml-dev: A list for W3C XML Developers. 
From papresco at technologist.com Tue Apr 28 12:52:42 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:00:54 2004
Subject: We do not need ampersand (was Re: XML-Data, "&" and inheritance )
References: <5BF896CAFE8DD111812400805F1991F701C910E9@red-msg-08.dns.microsoft.com>
Message-ID: <3545B4DE.C516B4DF@technologist.com>

Andrew Layman wrote:
> So we have a funny situation in XML in which we've tried to make processing
> easier by forbidding certain things in the DTD, but the result is that
> people either avoid DTDs altogether or write bogus DTDs that don't fully
> describe the real syntax. That is, we've simplified the implementation by
> being unable to express the intended syntax.

Actually, we've made this decision dozens of times in XML. In fact, almost every SGML feature we left out of XML fell under the heading of "simplified implementation by being unable to express the intended syntax." And SGML *already* left out hundreds of things that people have asked for to help them model things more accurately (such as 5 to 10 occurrences of FOO, mixed in with #PCDATA). These must be checked at the application level. So I don't think that we *need* the ampersand operator, though we might want it. Reasons why we don't are below.

> The
> motivating factor for including the ampersand operator in XML-Data is the
> significant number of customers who have asked for it. In discussing DTDs
> with me and others, they showed examples like the following:
>
> ((firstname|middlename|lastname|age|shoesize|hair|eyes|height|weight)*)
>
>
> When I've asked what this construction means, they said, in effect "What I
> mean is that the elements can occur in any order, but there isn't any good
> way to say that in XML DTDs."

This relates to the fact that SGML straddles the line between an object representation system and a document language definition system. Here is something from a paper I have been working on for a few months:

"SGML defines both a language definition system and a (simple) type system."

"One of the results of SGML's dual nature as a type system and a language is the existence of attributes. Attributes are like properties on objects. Context-free and regular grammars have no equivalent concept. There is something called an attribute grammar, but those create attributes on the parse tree, not in the language itself. An SGML document is very much like a parse tree, which is why attributes exist and work. But at the same time, they cause problems. Language-based query languages must be artificially enhanced to handle attributes (and Murata's does not yet). Automata must be enhanced to validate them as well. Their inherent lack of ordering (like properties on an object) makes them difficult to translate into a regular-language based framework.

The question is: are they useful or convenient? If they are merely convenient because they can be typed quickly, then we can invent a short-form syntax for elements that makes those similarly short. The semantic of property-of can be emulated at the application level, as it is today when properties are too complex to be able to fit in attribute values.

The opposite view (which I have sometimes held in the past) is that SGML should move wholeheartedly to embrace the object model view of documents, and make attributes even more useful. Attributes could have content models, sub-elements, sub-attributes and so forth. In this view, sequence is only occasionally needed. Paragraphs should be ordered, but the title for a section could be encoded before or after the content of the section, as long as the application can find it (based on its property name) when it needs it. This is similar to object oriented programming or knowledge representation languages where properties can usually be listed in any order. This stands in contrast to the document processing world, where ordering is almost always more important than property-of relationships.

Which view you hold probably depends on what your background is, and what problems you are trying to solve right now."

This dichotomy explains why some think that XML-Data "disappearing property" subtyping is good enough and others think: "that would only solve a tiny subset of the problem." XML-Data style inheritance is just fine for knowledge representation systems and almost useful for documents.

Let me further say that if SGML and XML would move wholeheartedly into the language definition realm and out of the object/property definition world, then we could add language definition features that would allow the modelling of properties *at the application level*. For example, we could define content models that would allow:

Blue 800k 2m

But not:

Blue Red

This is a contextual constraint on siblings and can be expressed in Forest-Automaton based DTDs like those described by Murata-san at SGML/XML 97.

But in my opinion, the context-sensitive view of the world is not very compatible with the idea of SGML *as* a type system. As soon as you start introducing contextual constraints on the content of elements, you severely weaken the concept of an "element type." What is the content model of the element type above? It depends on what is happening around it! Should each prop element have the same attributes? Maybe not: maybe it should depend on its content.

The Forest automata theory is very compatible, on the other hand, with SGML being used *underneath* type systems, as I have demonstrated above. The language enforces uniqueness and the application implies a type system on top of it.

We could also move the other way, wholeheartedly into the object definition realm. But then we would still not add the ampersand operator. We should just allow attributes to have content models and structured content. The problem with going in the other direction is that SGML and XML are first and foremost for defining languages, so you can't really deprecate the features that make them powerful in that way. Without that, they are no more interesting than S-expressions.

I've been trying to figure out how to move forward both as a language system and a type system for a while. I'm coming to think it is impossible. I am leaning, lately, to the view that we should move to a language-centric view of SGML/XML and move issues of "type" to a higher layer. But I might think differently a few months from now...

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From Patrice.Bonhomme at loria.fr Tue Apr 28 15:07:50 1998
From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme)
Date: Mon Jun 7 17:00:54 2004
Subject: MSXML : Create a document with a doctype declaration ?
Message-ID: <199804281306.PAA25593@chimay.loria.fr>

How to create a new XML document with a doctype declaration with msxml ?

Something like :

...

Thanks.

--
==============================================================
bonhomme@loria.fr             | Office : B.228
http://www.loria.fr/~bonhomme | Phone : 03 83 59 30 52
--------------------------------------------------------------
* Serveur Silfide : http://www.loria.fr/projets/Silfide
* Projet Aquarelle : http://aqua.inria.fr
==============================================================

From rbourret at dvs1.informatik.tu-darmstadt.de Tue Apr 28 16:16:02 1998
From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret)
Date: Mon Jun 7 17:00:54 2004
Subject: MSXML : Create a document with a doctype declaration ?
Message-ID: <199804281344.PAA16568@berlin.dvs1.tu-darmstadt.de>

> How to create a new XML document with a doctype declaration with msxml ?
>
> Something like :
>
>
>
> ...
>

Try the following, which creates a new Document object, adds elements, and saves it. You might need to change the document encoding for it to work on your machine. The output file is:

foobar

    import com.ms.xml.om.Document;
    import com.ms.xml.om.Element;
    import com.ms.xml.util.XMLOutputStream;
    import com.ms.xml.util.Name;
    import java.io.FileOutputStream;

    public class CreateFooBar
    {
        public static void main(String argv[])
        {
            try
            {
                Document d = createMSXMLDoc();
                saveDoc(d);
            }
            catch (Exception e)
            {
                // Report the failure instead of silently swallowing it.
                e.printStackTrace();
            }
        }

        public static Document createMSXMLDoc()
        {
            Document d = new Document();
            Element xmlPINode, dtdNode, root, child, pcdata;

            // Create the XML PI node.
            xmlPINode = d.createElement(Element.PI, "xml");
            d.addChild(xmlPINode, null);
            d.setVersion("1.0");

            // Create the DOCTYPE node and add it after the XML PI node.
            dtdNode = d.createElement(Element.DTD);
            d.addChild(dtdNode, xmlPINode);
            dtdNode.setAttribute(Name.create("NAME"), Name.create("foo"));
            dtdNode.setAttribute(Name.create("URL"),
                                 Name.create("http://foo.domain.xx"));

            // Create the root element and add it after the DOCTYPE node.
            root = d.createElement(Element.ELEMENT, "foo");
            d.addChild(root, dtdNode);

            // Create a child element
            child = d.createElement(root, Element.ELEMENT, Name.create("bar"), null);

            // Create PCDATA for the child element
            pcdata = d.createElement(child, Element.PCDATA, null, "foobar");

            return (d);
        }

        public static void saveDoc(Document d) throws Exception
        {
            d.setEncoding("ASCII");
            d.setOutputStyle(XMLOutputStream.PRETTY);
            FileOutputStream file = new FileOutputStream("foobar.xml");
            XMLOutputStream xmlFile = d.createOutputStream(file);
            d.save(xmlFile);
        }
    }

From cmatei at agora.ro Tue Apr 28 17:19:23 1998
From: cmatei at agora.ro (Crystian)
Date: Mon Jun 7 17:00:54 2004
Subject: ANY from Element Type Declarations
Message-ID: <01BD72D2.5371AFA0@jackson.agora.ro>

Hi,

I am writing an XML parser in Java and I am not sure what an element declared as ANY can contain. Only child elements? Can it contain PCDATA or not?

Thank you,
Crystian, student at Petru Maior University from Tg.Mures, Romania

PS I use the W3C Recommendation from 10-Feb-1998
From ray at guiworks.com Tue Apr 28 17:51:56 1998
From: ray at guiworks.com (Ray)
Date: Mon Jun 7 17:00:54 2004
Subject: A very meaty editor discussion
Message-ID: <199804281550.JAA00499@coldsnap.guiworks.com>

Hi All,

Recently I've been working on an XML JavaBean for both tree and stream based editing (two-way), and I'd like to share my experiences, and solicit any advice from the experts. I have zippo experience designing XML/SGML stream editors (and you'd be surprised how little info on algorithms exists in the traditional CS literature. Just try finding a mention of "gap buffers" in recent CS textbooks).

The editor is a true two-way editor that supports simultaneous realtime views of a document via both tree editing and stream based editing. Stream based editing can take many forms, such as editing the syntax colored XML source, or editing a CSS/XSL rendered version (style applied in realtime). Right now, I'm concentrating on just the syntax colored source. Nothing fancy.

At the moment, I am not concerned with layout/rendering, that's a side issue (I'm using the swing.text classes for that right now), merely the algorithms and datastructures used to maintain synchronization between a tree and a stream buffer.

First, I'll mention my problems using SAX, as perhaps a catalyst for SAX Level 2 / Authoring Interface.

1) no ability to get at DTD info. If a user has internal entities, a dtd, etc., loading up a document and saving it via SAX results in corruption.

2) no ability to get at unparsed entities (DOM supplies this I believe). This is important if you want to preserve the look of the source. I believe in the power of text, and I think everything should be editable via a text editor and that information should be preserved. If I can't author a document in emacs, load it up into an editor, and go back and forth, something is wrong.

3) comments and CDATA have been mentioned before. I'll note that even Netscape's Composer doesn't preserve comments! I believe comments are fundamentally important, even in the simplest XML applications, because they allow an author to annotate a file inline and transfer it anywhere. You could of course envision annotation being done differently, say with a separate annotation DTD and XML-Link/Xpointers, but would it be as human readable as simple source comments?

Problems with DOM

1) No location information

2) I think NodeIterator is too complicated

3) DOM would be great if they added "output" functions, e.g. something like node.getCanonicalStringRepresentation(processChildrenBoolean). Sure, you can cook up your own output functions, but that makes the assumption that the application knows what it is dealing with, and it may be dealing with XML or HTML (depending on what DOM object is being used)

My stop gap measure

My application uses SAX to parse documents and build a parse tree. However, I regretfully make calls directly to the parser of my choice to get at the DTD info (e.g. AElfred). So at the moment, I am locked into AElfred. (Note, AElfred doesn't give enough info to reconstruct a DTD either :(, but right now, I am just interested in preserving internal entities.)

The parse tree nodes implement several interfaces, one of which is DOM. My application then uses 100% pure :) DOM interfaces for most of its work, so when parsers start implementing DOM directly, I can switch easily. (Ideally, I would like SAX Level 2 to give enough info that the full DOM spec can be implemented on top of it.)

Each Node implements the DOM interface, plus MutableTreeNode and swing.text.Element. The Attributes of a Node implement TableModel. All of this is done via inner class delegates. Thus, given a Node, you can write

    DefaultTreeModel dtm = new DefaultTreeModel(node);
    JTree jt = new JTree(dtm); // display the tree

or

    JTable jt = new JTable(node.getAttributeTableModel());

This is the elegance of Swing: since a node implements a TableModel interface, you can make the Attributes appear to be a "table" with name and value columns.

or

    EditorKit ek = JEditorPane.createEditorKitForContentType("text/xml");
    Document doc = ek.createDefaultDocument();
    insertNodes(node.getElementModel()); // complicated function :)

In this case, the node returns an interface for swing.text.Element, which is an SGMLish representation that Swing's textfields/textareas (and HTML displayer) use.

Basically, by using a single datastructure which exhibits multiple models and interfaces, I can keep several different views of the data automatically in sync. For instance, if you update an attribute via JTable, the changes are automagically reflected in the JTree view.

Now the difficulty: maintaining symmetry between the text model and the tree model.

A small Swing Digression:

Fundamentally, Swing's Text views use an internally synchronized SGML <-> Stream Model already. Briefly, Swing maintains an internal object called StringContent which implements an internal interface called 'Content'. It is responsible for several things, but primarily it maintains a linear array of characters representing the text of the buffer. As you would expect, the 'model' of text is one of a linear array with a single coordinate, the offset, used to locate data in the model.

Another thing the Content object does is maintain an internal array of "bookmarks" into the text. These bookmarks are used by applications to keep a fixed persistent position within a document. For instance, if I bookmark position '32' in the text, which happens to have the word 'foo', and the user inserts 5 characters at the start of the document, Content will automatically update this bookmark to point to location 37. When the application tries to dereference this bookmark, it will still find the word 'foo'. The bookmarks are represented by the "Position" interface.

Simultaneously, Swing's Document model keeps an internal tree of swing.text.Element objects which are projected over the Content object by means of Position (bookmark) references. Each node in the tree has a getStartOffset() and getEndOffset() method which uses a Position to track the beginning and ending of that node in the linear Content model.

Thus, if the result of an editing operation causes a piece of text in the linear Content model to be moved by 10 text positions, the Content object will increment the starting and ending bookmarks by 10 positions, such that the Element tree remains accurate.

All of this works wonderfully well when you're dealing with plain old text, or styled text (the Swing Notepad and Stylepad examples), but becomes complicated when you need to do parsing.

First of all, Swing's default document models have *hardcoded* parsers that you can't replace. This is remarkable because everything else in Swing is more than pluggable, including swing.text, where you can plug in just about everything. The hard coded parser only really understands line breaks. Swing also has no easy interface for directly inserting nodes. It expects text to be inserted via insertString(). It desperately needs an insertNode(parent, child) on the default Document model.

Secondly, the swing.text.html package shows how you can build an HTML viewer with Swing by parsing a whole document at once and then bulk inserting the nodes into a Swing DocumentModel. The problem is, it provides no example of how to do a real-time editor.
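The bookmark behaviour described in the digression can be tried out with a few lines of Swing. This is a minimal sketch using javax.swing.text.PlainDocument and Position; the class and method names (BookmarkSketch, offsetAfterInsertAtStart) and the filler text are made up for the illustration, and PlainDocument uses its default Content implementation rather than the StringContent class named above.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.PlainDocument;
import javax.swing.text.Position;

final class BookmarkSketch {

    /**
     * Bookmarks `offset` in a document holding `text`, inserts `prefix` at the
     * start of the document, and returns the offset the bookmark now reports.
     */
    static int offsetAfterInsertAtStart(String text, int offset, String prefix) {
        try {
            PlainDocument doc = new PlainDocument();
            doc.insertString(0, text, null);
            Position bookmark = doc.createPosition(offset);
            doc.insertString(0, prefix, null); // shifts everything after offset 0
            return bookmark.getOffset();
        } catch (BadLocationException e) {
            throw new IllegalArgumentException("offset outside the document", e);
        }
    }

    public static void main(String[] args) {
        // 32 filler characters, then the word 'foo' starting at offset 32.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 32; i++) {
            text.append('.');
        }
        text.append("foo");
        // Insert 5 characters at the start: the bookmark moves from 32 to 37.
        System.out.println(offsetAfterInsertAtStart(text.toString(), 32, "12345"));
    }
}
```

The Element tree relies on the same mechanism: getStartOffset() and getEndOffset() are answered from Position objects, so the tree never has to be renumbered by hand after an edit.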
So finally, we come to the raw meat of this post, which is what my ideas are, whether or not they are viable, and whether there are some well known algorithms for this.

Requirements for a real-time two-way editor:

1) Tree and Source views remain in sync at significant events (commit/save, emacs style electric-characters, or realtime)

2) Output is stable. Performing an insertion on the tree side of the editor cannot cause a "drastic" reordering of the text in the text window. (Some tree-dumper algorithms have nasty failure cases.) For example, if you store attributes in a Hashtable, you can get the attributes appearing in random order!

3) Editing MUST be free. The problem is significantly reduced if you use Henry Thompson's XED approach, but I am a big fan of emacs, and I don't want the editor beeping at me constantly or restricting my movements. I'd like it to help me (tab-completion on element and attribute names), but not restrict me. Think EMACS XML-Mode, Font-Locked, perhaps with right-click context menu, and with a tree browser.

Two ideas, One Dumb, and One Untested:

Dumb Idea:

After designing a stable tree->stream dumper, I use an "ultrafast" XML parser to basically reparse the entire document every time the user presses an electric character. E.g. if they type a '>' I check to see if they just entered a close tag, or an empty element, and then reparse the entire document. If the parser is very quick, say 1mb/sec, then even a 100k document will only take .1 seconds; hopefully it would be unnoticeable. Since Web documents rarely exceed 20k, this amounts to about .02 - .2 second delay (less than garbage collection delays).

Untested Idea:

When the user starts entering text, they can be in one of several states (Henry Thompson is probably more familiar than I am with all the possible ones).

The user could be in
1) a CDATA section
2) a text section
3) on an element's name/type
4) on an attribute name
5) on an attribute value
6) within an entity

The user could type
1) whitespace
2) ordinary text
3) a special character &, >, <, ", =, ...

Let's ignore the cases where the user is just editing identifier names or raw text and get to the nitty gritty, which is: what happens if the user types < or >?

Consider the following situation. Text in the buffer is

Hello world (* cursor position)

This is already parsed into a tree. The user presses '<', we now have

Hello world <

which is a non-wellformed document. Let's say the user continues by typing 'BAR>'. Now we have

Hello world

It's still invalid. In fact, it will stay invalid until the user types

Hello World

In which case, now we can rightfully insert a BAR node into the Element tree.

My approach is to store a list of invalid areas, and to treat invalid areas as CDATA nodes in the TreeView until such time as they become valid (possibly syntax coloring them in the stream editor as being bad).

Now, the part I haven't thought through totally. When the user enters a '>':

First, determine if the tag immediately before the '>' is an empty tag (/>), and if it is, parse it, add it to the tree, and split the invalid region into two new invalid regions (before and after the parsed tag).

Otherwise, is this an end tag ()? If it is, scan backwards until we find a matching start tag () in the invalid regions list. Once you find one, parse both the start and end tags, and split both of their invalid regions. Recurse, and parse all invalid regions that are located after the start tag and before the end tag.
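The "scan backwards for a matching start tag" step can be sketched as follows. This is an illustration under loudly stated assumptions, not the editor's actual code: the class and method names (EndTagMatchSketch, findMatchingStart, tagName) and the element name DOC are hypothetical, the whole buffer is scanned rather than a list of invalid regions, and comments, CDATA sections, processing instructions and '>' characters inside attribute values are ignored.

```java
final class EndTagMatchSketch {

    /** Returns the element name at the start of a tag body such as "BAR id='1'". */
    static String tagName(String tagBody) {
        int i = 0;
        while (i < tagBody.length() && !Character.isWhitespace(tagBody.charAt(i))) {
            i++;
        }
        return tagBody.substring(0, i);
    }

    /**
     * Called when the user has just typed the '>' at offset gt. If that '>'
     * closes an end tag, scans backwards and returns the offset of the '<' of
     * the matching start tag; returns -1 if there is no end tag or no match.
     */
    static int findMatchingStart(String buf, int gt) {
        if (gt < 0 || gt >= buf.length() || buf.charAt(gt) != '>') {
            return -1;
        }
        int lt = buf.lastIndexOf('<', gt);
        if (lt < 0 || !buf.startsWith("</", lt)) {
            return -1; // the '>' does not close an end tag
        }
        String name = buf.substring(lt + 2, gt).trim();
        if (name.isEmpty()) {
            return -1;
        }
        int depth = 0; // end tags of the same name seen while scanning back
        int close = buf.lastIndexOf('>', lt - 1);
        while (close >= 0) {
            int open = buf.lastIndexOf('<', close);
            if (open < 0) {
                return -1;
            }
            String tag = buf.substring(open + 1, close);
            if (tag.startsWith("/")) {
                if (tag.substring(1).trim().equals(name)) {
                    depth++; // a nested element of the same name was closed
                }
            } else if (!tag.endsWith("/") && tagName(tag).equals(name)) {
                if (depth == 0) {
                    return open; // this start tag pairs with the typed end tag
                }
                depth--;
            }
            close = buf.lastIndexOf('>', open - 1);
        }
        return -1;
    }

    public static void main(String[] args) {
        String buf = "<DOC>Hello world <BAR>text</BAR></DOC>";
        int typed = buf.indexOf("</BAR>") + 5; // offset of the '>' just typed
        System.out.println(findMatchingStart(buf, typed));
    }
}
```

In the scheme described above, the returned offset and the typed '>' would delimit the span to hand to the parser, and the invalid regions on either side of it would be split off.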
For parsing, once I locate the start and end offsets of an element, I'd simply wrap it in a bogus element, send it to SAX via a StringBufferInputStream, and then only watch for a single start/end event.

Another approach, rather than the manual recursion technique I outline above, is to locate the start/end offsets of the start/end tags in the invalid regions, and tell SAX to parse the region spanning from the start to the end offset, and replace that whole section in the tree. For instance, find the offset of '<' in '', and the offset of '>' in '', tell SAX to parse everything in between, and split off the invalid regions before and after.

Phew, ok, those are my ideas. What's the expert opinion?

-Ray
ray@contentware.com

From tbray at textuality.com Tue Apr 28 18:09:58 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:00:54 2004
Subject: ANY from Element Type Declarations
Message-ID: <3.0.32.19980428090742.00b085a0@pop.intergate.bc.ca>

At 06:20 PM 4/28/98 +-300, Crystian wrote:
>I make a XML parser in Java and I am not sure about what can contain an
>element that match ANY. Only child elements? Can it contain PCDATA or not?

Yes, it can contain text. -T.
From ray at guiworks.com Tue Apr 28 18:16:35 1998
From: ray at guiworks.com (Ray)
Date: Mon Jun 7 17:00:55 2004
Subject: Is MSXML pure Java?
Message-ID: <199804281615.KAA00612@coldsnap.guiworks.com>

Hi all,

I just installed it to check it out and I noticed that it installs a DLL. It also doesn't come in a standard jar or zip file, but uses an installer which appends to the Microsoft Java VM classes.zip file in \winnt\java\classes.

In other words, it seems too close and comfy with the Microsoft VM, which scares me into thinking that perhaps it uses one of MS's horrible extensions to tie it to IE/Windows.

Is it non-pure? Or is the DLL stuff just some kind of ActiveX/DSO bridge junk?

The docs aren't JavaDoced either :(, but it does appear to have its own Object Model which seems suspiciously like DOM (which is understandable since MS is one of the submitters), but different enough to be confusing. Any word on whether MSXML will eventually support DOM natively?
From sharris at primus.com Tue Apr 28 18:21:02 1998
From: sharris at primus.com (Steve Harris)
Date: Mon Jun 7 17:00:55 2004
Subject: ASCII control characters in XML
Message-ID: <4108F4A8F2E7D011A20300A024513B6052BF5B@mailsrv.primus.com>

Is it possible to transport UTF-8-encoded text that includes some characters in the byte range x0000-x001F (ASCII control characters)? These codes are valid within UTF-8 (via RFC2044), but the XML specification clearly says that these codes do not constitute 'valid characters'.

My application that wraps Clark's "expat" dies upon encountering codes in this range, citing well-formedness violations. I'm looking for the proper method for transporting text that occasionally includes these codes.

I've been RTFM'ing this for a while now, and I've found plenty of archived discussion regarding raw binary data as PCDATA content, but this seems closer to a common text-processing problem.

Any advice or further interpretation would be greatly appreciated.

Steven E. Harris
Software Engineer
PRIMUS
1601 Fifth Avenue, Suite 1900
Seattle, Washington 98101
(206) 292-1001 x436
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Apr 28 19:05:38 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:55 2004 Subject: ASCII control characters in XML Message-ID: <3.0.32.19980428100418.00b47b60@pop.intergate.bc.ca> At 09:21 AM 4/28/98 -0700, Steve Harris wrote: >Is it possible to transport UTF-8-encoded text that includes some >characters in the byte range x0000-x001F (ASCII control characters)? >These codes are valid within UTF-8 (via RFC2044), but the XML >specification clearly says that these codes do not constitute 'valid >characters'. Yes, the XML spec clearly rules these characters out. We didn't discuss it that much during the process - it seemed like a good idea, and nobody on any of the committees seemed troubled at the prospect of losing them; so I'm afraid this is a hardwired characteristic of XML 1.0, and you're stuck with it. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Tue Apr 28 19:09:30 1998 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:00:55 2004 Subject: We do not need ampersand (was Re: XML-Data, "&" and inheritan ce ) Message-ID: <5BF896CAFE8DD111812400805F1991F701C910EF@red-msg-08.dns.microsoft.com> Paul Prescod wrote "SGML defines both a language definition system and a (simple) type system." 
You raise an issue that I'm not terribly familiar with: "Language system" vs. "Type system." Could you explain each of these terms and the distinction between them? Thanks. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Tue Apr 28 19:20:41 1998 From: jtauber at jtauber.com (James K. Tauber) Date: Mon Jun 7 17:00:55 2004 Subject: ASCII control characters in XML In-Reply-To: <3.0.32.19980428100418.00b47b60@pop.intergate.bc.ca> Message-ID: <000401bd72c9$decfdf80$ea6118cb@caleb> At 09:21 AM 4/28/98 -0700, Steve Harris wrote: >Is it possible to transport UTF-8-encoded text that includes some >characters in the byte range x0000-x001F (ASCII control characters)? >These codes are valid within UTF-8 (via RFC2044), but the XML >specification clearly says that these codes do not constitute 'valid >characters'. You can't transport them in a way that an XML processor would recognize them as being in that range and part of a parsed entity. You could use processing instructions; something like and rely on the application to recover the characters. Also, you could define appropriate unparsed entities and then refer to them by attribute <char val="x0007"/> (having previously defined x0007 to be an unparsed entity and val to be an entity attribute). James -- James Tauber / jtauber@jtauber.com Perth, Western Australia XML Pages: http://www.jtauber.com/xml/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at dvs1.informatik.tu-darmstadt.de Tue Apr 28 19:50:58 1998 From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret) Date: Mon Jun 7 17:00:55 2004 Subject: Is MSXML pure Java? Message-ID: <199804281719.TAA17607@berlin.dvs1.tu-darmstadt.de> > Is it [msxml] non-pure? Or is the DLL stuff just some kind of ActiveX/DSO > bridge junk? I can't say for sure that it's pure, but I'm running it on a Sun Unix box just fine. -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From David.Brownell at Eng.Sun.COM Tue Apr 28 19:53:30 1998 From: David.Brownell at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:00:55 2004 Subject: ASCII control characters in XML Message-ID: <199804281752.KAA18113@argon.eng.sun.com> > Also, you could define appropriate unparsed entities and then refer to them > by attribute (having previously defined x0007 to be an > unparsed entity and val to be an entity attribute). Or, just define the semantics of the DTD (as usual, using a natural language) to be that the "val" attribute of the "char" element is a numeric string giving a UCS-4 character value. In any case you'll need to add a semantic interpretation ... either that the entity is single character entity, or that it's a number denoting a character. I used the latter, since it's a simpler rule to implement. 
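
The numeric-attribute convention just described can be sketched as follows. This is a minimal illustration, not Dave Brownell's implementation: the wrapper element name "text", the function names wrap and unwrap, and the choice of a decimal number in "val" are all assumptions made for the example; only the "char" element and its "val" attribute come from the thread.

```python
import re
import xml.etree.ElementTree as ET

# Code points below U+0020 that XML 1.0 forbids: all except tab, LF and CR.
FORBIDDEN = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def wrap(text: str) -> str:
    """Serialise `text`, replacing each forbidden character by <char val="N"/>."""
    root = ET.Element("text")  # hypothetical wrapper element
    last = None                # most recent <char> element, if any
    pos = 0
    for match in FORBIDDEN.finditer(text):
        chunk = text[pos:match.start()]
        if last is None:
            root.text = chunk
        else:
            last.tail = chunk
        # Assumption: val holds the code point as a decimal string.
        last = ET.SubElement(root, "char", val=str(ord(match.group())))
        pos = match.end()
    if last is None:
        root.text = text[pos:]
    else:
        last.tail = text[pos:]
    return ET.tostring(root, encoding="unicode")


def unwrap(xml_text: str) -> str:
    """Apply the semantic rule: a <char> element denotes chr(int(val))."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag == "char":
            parts.append(chr(int(child.get("val"))))
        parts.append(child.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    original = "ring\x07 the bell\x00!"
    encoded = wrap(original)
    print(encoded)
    print(unwrap(encoded) == original)
```

The parser never sees the forbidden characters; recovering them is entirely the application's job, which is the "semantic interpretation" the message says must be added to the DTD's prose description.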
You get a similar need if you must transmit a UNICODE surrogate, which can't appear in XML (although it can appear in a UTF-16 or UTF-8 encoding of XML, as part of a pair) or any other character outside the range allowed by XML (which maxes out at hex 00.10.00.00). - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Tue Apr 28 19:56:38 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:55 2004 Subject: A very meaty editor discussion In-Reply-To: <199804281550.JAA00499@coldsnap.guiworks.com> References: <199804281550.JAA00499@coldsnap.guiworks.com> Message-ID: <199804281755.NAA06798@unready.microstar.com> Ray writes: > First, I'll mention my problems using SAX as perhaps a catalyst > for SAX Level 2 / Authoring Interface. > > 1) no ability to get at DTD info. If a user has internal entities, > a dtd, etc.... loading up a document and saving it via SAX > results in corruption. First, I'd like to say that I'm flattered that people want to use SAX for this kind of thing, when its original target audience was very different (I guess that a Hyundai in the lot is better than a BMW that's still on order). Secondly, though, I'll pick a nit here and assert that an XML document that passes through SAX is normalised, but not corrupted; that is, if I pass a document through SAX once (with an XML-writing DocumentHandler), then pass it through again, the results will be the same both times. > 2) no ability to get at unparsed entities (DOM supplies this I > believe) This is important if you want to preserve the look of > the source. 
This is not exactly true, though there was a bug in the January draft of SAX. In the January draft (which you're probably using), unparsed entity and notation information was delivered in a clumsy way through the AttributeMap interface, so you did get information about all notations or unparsed entities that were actually referenced (except in the case of an ENTITIES attribute with multiple entities -- that was the bug). SAX 1.0, which is just about to leave beta (doesn't _anyone_ have bug reports, aside from one JavaDoc typo?), has a new, much simpler DTDHandler interface which reports all notation and unparsed entity declarations to the application (but no other DTD information). You can take a peek at 1.0beta at http://www.microstar.com/XML/SAX/New/ > I believe in the power of text, and I think everything should be > editable via a text editor and that information should be > preserved. If I can't author a document in emacs, load it up into > an editor, and go back and forth, something is wrong. I agree strongly, which is why I probably wouldn't use SAX level 1 for an editor (it's meant for downstream processing, where lexical things don't matter). The DOM would seem to be the best match, since an editor will want to store a document tree anyway, but I am happy to go ahead with a level 2 SAX if the XML-Dev members convince themselves (and me) that it could fill a niche that the DOM cannot. > 3) comments and CDATA have been mentioned before. I'll note that > even Netscape's Composer doesn't preserve comments! I believe > comments are fundamentally important, even in the simplest > XML applications, because it allows an author to annotate > a file inline and transfer it anywhere. You could of course > envision annotation being done differently, say with > a separate annotation DTD and XML-Link/Xpointers, but would > it be as human readable as simple source comments?
Since you mentioned Emacs LISP, think of the two kinds of comments in the following ELISP function:

  ;; This function says hello
  (defun hello ()
    "Display a friendly message at the bottom of the window."
    (message "Hello!"))

The first comment,

  ;; This function says hello

is purely lexical: it has no special significance to the Emacs byte-code compiler, which will simply discard it. This is the equivalent of a comment in XML source. The second comment (called a doc string),

  "Display a friendly message at the bottom of the window."

is more important -- the compiler will preserve this information with the function definition, and will use it to provide interactive help to the end-user. This is the equivalent of an annotation included in XML source as an element or an attribute value. Comments are important to authors, which is why they would certainly appear in a level-2 SAX if we made one. They _must not_ be important to downstream processing, though; if they are, then they belong in XML markup (possibly even as a pointer to a different document). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers.
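
David Megginson's point earlier in this message, that a document passed through SAX is normalised and not corrupted, can be checked with a small sketch. It assumes Python's xml.sax with XMLGenerator standing in for the "XML-writing DocumentHandler" (his own tools were Java); the function name through_sax and the sample document are made up for the illustration.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


def through_sax(document: bytes) -> bytes:
    """Parse `document` and re-serialise it using only SAX content events."""
    out = io.StringIO()
    xml.sax.parseString(document, XMLGenerator(out, encoding="utf-8"))
    return out.getvalue().encode("utf-8")


# Hypothetical input with an internal entity and a purely lexical comment.
SOURCE = b"""<?xml version="1.0"?>
<!DOCTYPE doc [ <!ENTITY co "Acme Widgets"> ]>
<doc><!-- a lexical comment -->&co;</doc>"""

if __name__ == "__main__":
    once = through_sax(SOURCE)
    twice = through_sax(once)
    print(once.decode("utf-8"))
    print("stable after first pass:", once == twice)
```

The first pass drops the DOCTYPE and the comment and expands the entity, which is the loss Ray complains about for editing; the second pass changes nothing further, which is the sense in which the output is normalised.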
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Tue Apr 28 22:24:31 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:55 2004 Subject: We do not need ampersand (was Re: XML-Data, "&" and inheritan ce ) References: <5BF896CAFE8DD111812400805F1991F701C910EF@red-msg-08.dns.microsoft.com> Message-ID: <35463B08.DB633C12@technologist.com> Andrew Layman wrote: > > Paul Prescod wrote "SGML defines both a language definition system and a > (simple) type system." > > You raise an issue that I'm not terribly familiar with: "Language system" > vs. "Type system." Could explain each of these terms and the distinction > between them? Thanks. It comes down to the semantics of the language. SGML allows you to define element types. Thus there is some form of simple type system there. Context free grammars do not allow you to define types. BNFs do not allow you to define types. Regular expressions do not allow you to define types...and so forth. That's one of SGML's big differences -- it has types and a simple type system. Those other things allow you to define languages, but there are no implied or explicit semantics relating to types. As I mentioned before, one reason (IMO) that SGML does not allow context-sensitive content models (much) and attributes (at all) is because a type is supposed to be one thing. All elements of a type are supposed to share semantics. A linguistic view would treat elements as just tokens that may or may not share semantics. More precisely, a linguistic view would expect elements to share semantics when they are used in the same context, but not necessarily when they are used in another context.
Consider the grammar for C++. Do round brackets share semantics in that language? Well, they are uniformly used to group things (duh!) but if you ever try to write a C++ compiler (don't!!!) you will find that you do not write code to handle "round brackets", because they are just syntactic wrappers and their meaning is completely dependent on context. This is more of a linguistic view. In the type system view, you write one Java class per element type. In the linguistic view you walk the parse tree, not expecting it to inherently have the semantics you want, and translate into something more abstract which you then unleash your Java classes on (which is how parsers often work). Of course there is a continuum between the two views of a document, which is why SGML has successfully straddled the worlds for so long. Maybe it can continue to and still advance. It seems, though, that we have restrained its abilities as a language describer in order to not mess up the type system (e.g. context sensitivity), and restrained its abilities as a type system in order to not mess up the language (e.g. no subtyping). We could continue to move forward with half solutions such as SGML's "exceptions" that provide limited context sensitivity and XML-Data's inheritance that provides limited subtyping, or we could try to separate the layers completely. That would imply, for instance, that instead of having element type declarations, we would have productions (a semantic change) and productions could use as much context sensitivity as they needed. Attributes would not be tied to element types, but rather to contexts. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From EWILLIAMS at cerner.com Wed Apr 29 03:18:37 1998 From: EWILLIAMS at cerner.com (Williams,Ed) Date: Mon Jun 7 17:00:55 2004 Subject: Unsubscribe Message-ID: Unsubscribe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Wed Apr 29 06:40:18 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:00:55 2004 Subject: Open Standards Processes In-Reply-To: <00d901bd721a$a82bc980$840b4ccb@NT.JELLIFFE.COM.AU> (ricko@allette.com.au) Message-ID: <199804290437.VAA22552@boethius.eng.sun.com> [Rick Jelliffe:] | I think Jon Bosak's idea of allowing writers to monitor (if not | participate in) groups is great. A technology like XML needs to be | promoted if it will be popular. Not my idea; Tim Bray's. Jon xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Wed Apr 29 11:22:54 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:55 2004 Subject: Final alpha release of XED: lots of new features Message-ID: <000b01bd7350$7d8b48e0$1e09e391@mhklaptop.bra01.icl.co.uk> >There are no installation instructions because all you have to >do is unpack the zip file and run xed.exe. Thanks, it works now and I can't reproduce what I did wrong before. The effect was that a DOS window came up and then vanished. Some usability observations: - The help text is in a font that is too small for me to read comfortably - The notation for keystrokes is non-intuitive (e.g. M-<) although it is now explained - Right mouse click on selected text (as distinct from a selected element) doesn't have the expected behaviour (it causes some nearby element to be selected) - Pasting (Ctrl/V) when text is selected is an error, I would expect it to paste the clipboard contents over the selected text - I thought at first that clicking "preferences" had no effect, I discovered later it had brought up a window that was not visible because it was behind others - XED is not very helpful when you try to open a file that is not well-formed XML. (In my case a file I'm still investigating because some parsers accept it and others don't!) - The box below the menu bar looks very strange when it is not actually in use for editing attributes, element names, etc The following suggestions might be asking a bit much: - It would be nice to make (more?) use of the DTD - It would be nice if XED recognised that "Save As" to a different directory can destroy your relative URLs, e.g. 
the reference to the DTD - It would be nice to have a prompt list of standard entity names, e.g. &eacute; - One operation I found cumbersome is splitting an element such as a paragraph into two paragraphs. (The same would apply to combining two into one.) I don't know if this is common enough to warrant special treatment. I shall try using it for a bit (in preference to PFE) and see how I get on. Having said that, I don't spend much time editing XML as most of it is software-generated. I do prefer this approach, though, to the "tree" style editor. Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ray at guiworks.com Wed Apr 29 11:32:36 1998 From: ray at guiworks.com (Ray) Date: Mon Jun 7 17:00:55 2004 Subject: Final alpha release of XED: lots of new features In-Reply-To: <000b01bd7350$7d8b48e0$1e09e391@mhklaptop.bra01.icl.co.uk> from Michael Kay at "Apr 29, 98 10:23:36 am" Message-ID: <199804290932.DAA04912@coldsnap.guiworks.com> > - It would be nice if XED recognised that "Save As" to a > different directory can destroy your relative URLs, e.g. the > reference to the DTD Umm, Wang has a patent on the "Save As" function, so XED shouldn't have it at all. :) xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Wed Apr 29 11:33:58 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:55 2004 Subject: Advice on XML editor development Message-ID: <001601bd7351$da5e4bc0$1e09e391@mhklaptop.bra01.icl.co.uk> >Someone wrote: >> No disrespect to all those in the XML community, but the bulk >> of the world uses tools other than Java. And it would be >> nice to think that we, too, can make use of XML. > >I agree. At this point, Java is very high profile but makes up only a >tiny percentage of commercial projects. > There seems to be confusion here. Java is a wonderful tool for developing objects in general and XML tools in particular; ActiveX is a good environment for assembling and integrating objects written in different languages. So the obvious thing to do is to write your ActiveX objects in Java. It is quite wrong to think that writing in Java cuts you off from the non-Java community. Specifically, I have found that Microsoft's javareg tool makes it very easy to integrate Java classes into applications written in languages such as VB and VBScript. Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Wed Apr 29 13:48:01 1998 From: jtauber at jtauber.com (James K. 
Tauber) Date: Mon Jun 7 17:00:56 2004 Subject: Final alpha release of XED: lots of new features In-Reply-To: <199804290932.DAA04912@coldsnap.guiworks.com> Message-ID: <001a01bd7364$91b00e00$bd6118cb@caleb> > > - It would be nice if XED recognised that "Save As" to a > > different directory can destroy your relative URLs, e.g. the > > reference to the DTD > > Umm, Wang has a patent on the "Save As" function, so XED shouldn't > have it at all. That's right. XML is just videotext :-) James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Wed Apr 29 13:49:08 1998 From: jtauber at jtauber.com (James K. Tauber) Date: Mon Jun 7 17:00:56 2004 Subject: ActiveXpat? Message-ID: <001b01bd7364$92dca720$bd6118cb@caleb> I'm writing some doco on expat and am building some tiny sample applications that use it. It occurs to me, has anybody made an ActiveX component out of expat? James -- James Tauber / jtauber@jtauber.com Perth, Western Australia XML Pages: http://www.jtauber.com/xml/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pducharme at lycos.com Wed Apr 29 15:49:36 1998 From: pducharme at lycos.com (pducharme@lycos.com) Date: Mon Jun 7 17:00:56 2004 Subject: unsubscribe Message-ID: <852565F5.004BC71A.00@pghmta2.mis.pgh.lycos.com> unsubscribe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msuzio at anecdote.com Wed Apr 29 17:29:14 1998 From: msuzio at anecdote.com (Michael J. Suzio) Date: Mon Jun 7 17:00:56 2004 Subject: MARC records? Message-ID: <3547EFE8.C09D7B56@anecdote.com> Does anyone know of an XML DTD for bibliographic MARC records? Library of Congress notes an SGML DTD effort: http://www.loc.gov/marc/marcdtd/marcdtdback.html I'm curious if this is applicable in an XML context, too (I haven't looked closely at the DTD itself, and I'm not as SGML savvy as I am XML savvy, I'm actually *used* to the more minimal, less-complex XML DTD structure). Any information folks have on (SG|X)ML intersections with MARC would be appreciated. I'm trying valiantly to understand these buggers, and hope to integrate an XML representation of the records into a searching / indexing system. Any way I can take these (nigh-unreadable) records and massage them into a nicer representation would be great. -- Michael J. Suzio Interconnect of Ann Arbor msuzio@anecdote.com / 1-734-665-5342 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Wed Apr 29 18:01:43 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:56 2004 Subject: A SAX helper class: multiple document handlers Message-ID: <000301bd7388$326a0f80$1e09e391@mhklaptop.bra01.icl.co.uk> There was discussion during the SAX specification debate of the requirement to notify multiple handlers of the same event. I have written a simple (trivial!) handler which does this (for document events only) and offer it as a candidate for a standard "helper" class. It is called MultiHandler, it implements the DocumentHandler interface, and in addition supports the method: addDocumentHandler(DocumentHandler d) which allows the "real" application to register any number of "real" DocumentHandlers to receive notification of the events. The registered handlers are called in the order they were registered, except for endElement and endDocument, which are called in the reverse order. This may seem strange, but I found this useful when the handlers were generating XML or HTML to the same output stream and I wanted it properly nested. You can find the java source on http://home.iclweb.com/icl2/mhkay/MultiHandler.java Any comments appreciated. Regards, Mike Kay xml-dev: A list for W3C XML Developers. 
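
The dispatch order Mike Kay describes for MultiHandler can be sketched as below. This is not his Java source (which is at the URL in his message); it is a minimal analogue assuming Python's xml.sax, where ContentHandler plays the role of SAX's DocumentHandler. The method name add_document_handler mirrors his addDocumentHandler, and the Tracer class and trace_order function are hypothetical helpers added only to make the ordering visible.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class MultiHandler(ContentHandler):
    """Forward document events to every registered handler."""

    def __init__(self):
        super().__init__()
        self._handlers = []

    def add_document_handler(self, handler):
        self._handlers.append(handler)

    # Opening events go out in registration order.
    def startDocument(self):
        for handler in self._handlers:
            handler.startDocument()

    def startElement(self, name, attrs):
        for handler in self._handlers:
            handler.startElement(name, attrs)

    def characters(self, content):
        for handler in self._handlers:
            handler.characters(content)

    # Closing events go out in reverse order so that output nests properly.
    def endElement(self, name):
        for handler in reversed(self._handlers):
            handler.endElement(name)

    def endDocument(self):
        for handler in reversed(self._handlers):
            handler.endDocument()


class Tracer(ContentHandler):
    """Hypothetical handler that records the element events it receives."""

    def __init__(self, label, log):
        super().__init__()
        self.label = label
        self.log = log

    def startElement(self, name, attrs):
        self.log.append(f"{self.label}:start:{name}")

    def endElement(self, name):
        self.log.append(f"{self.label}:end:{name}")


def trace_order(document: bytes):
    """Parse `document` with two tracers registered as A, then B."""
    log = []
    multi = MultiHandler()
    multi.add_document_handler(Tracer("A", log))
    multi.add_document_handler(Tracer("B", log))
    xml.sax.parseString(document, multi)
    return log


if __name__ == "__main__":
    print(trace_order(b"<doc/>"))
```

With A registered before B, A sees each start tag first and each end tag last, so if both handlers write wrappers to one output stream, A's wrapper encloses B's. Only the events listed are forwarded in this sketch; processing instructions and ignorable whitespace would need the same treatment.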
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Wed Apr 29 18:09:50 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:56 2004 Subject: MARC records? Message-ID: <000901bd7389$44e34220$1e09e391@mhklaptop.bra01.icl.co.uk> >Any information folks have on (SG|X)ML intersections with MARC >would be appreciated One point of intersection, though probably not what you had in mind, is that MARC uses the very peculiar ANSEL character set, which is also used by GEDCOM. So in creating GedML I solved the problem of ANSEL to UNICODE conversion and this should be reusable for any MARC convertor you have in mind. Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mskow at earthling.net Wed Apr 29 19:37:38 1998 From: mskow at earthling.net (Mike Skowronski) Date: Mon Jun 7 17:00:56 2004 Subject: DOM implementation In-Reply-To: <3540D55A.2539C161@webeasy.com> Message-ID: there's an (alpha) implementation in python at http://www.math.jussieu.fr/~fermigie/python/ On Fri, 24 Apr 1998, Pax Prakarsa wrote: > Could any body tell me what other DOM implementation are available out there, > other than: > Don Park's SAXDOM and Data Channel's DXP. > > Does Data Channel's DXP actually use Don Park's SAXDOM ? > > Thanks > Pax > > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ray at guiworks.com Wed Apr 29 19:50:42 1998 From: ray at guiworks.com (Ray) Date: Mon Jun 7 17:00:56 2004 Subject: DOM implementation In-Reply-To: from Mike Skowronski at "Apr 29, 98 01:37:00 pm" Message-ID: <199804291748.LAA09255@coldsnap.guiworks.com> > > Could any body tell me what other DOM implementation are available out there, > > other than: > > Don Park's SAXDOM and Data Channel's DXP. > > > > Does Data Channel's DXP actually use Don Park's SAXDOM ? > > IBM's XML for Java implements the latest DOM spec, but it appears to be bugged somewhat. As far as I can tell, the JAR file has some of the old org.w3c.dom interfaces in it, and the parser doesn't completely implement all of DOM, especially in the area of DTD info. All of the DTD info is available through IBM specific functions, but the DOM functions are just stubbed with no implementation. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricker at xmls.com Wed Apr 29 20:01:58 1998 From: ricker at xmls.com (Jeffrey Ricker) Date: Mon Jun 7 17:00:56 2004 Subject: XML Exchange Message-ID: <199804291801.OAA08670@mclean2.his.com> We are happy to announce the launch of XML Exchange, http://www.xmlx.com a forum for creating and sharing industry-specific document type definitions. We are aggressively adding content and capability to this site. The forums, which are the heart of the site, are up and running now. We hope that you will join us. We have created this site as a service to the XML community. You are that community. As such, we heartily welcome any recommendations you may have to improve the site. We look forward to hearing from you at XML Exchange. Jeffrey Ricker XMLSolutions, LLC ricker@xmls.com 703.585.5000 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cys at arbortext.com Wed Apr 29 20:28:01 1998 From: cys at arbortext.com (Cynthia Lenora Shern) Date: Mon Jun 7 17:00:56 2004 Subject: Is anybody working on ... Message-ID: <98Apr29.142509edt.26888@thicket.arbortext.com> Is anybody working on creating an abstraction for preserving formatting information inside or along with SGML documents, in cases where format really is part_of the information?
Two examples where such an approach might make sense could be: Examples: 1) K-12 textbooks 2) Printed survey forms for large market research firms. In example 1, there's a pedagogical relationship between text and illustrations. It occurs to me that a relationship this critical to the purpose of the information certainly isn't throw_away. In example 2, there's a value in preserving both page fidelity and page integrity if information is moved from one digital format to another, so that inconsistent or erroneous data isn't collected. (What I'm trying to say here is that if they move to SGML, questions on a page can't move indeterminately, as that would call into question the reliability of the data collected.) In examples 1 and 2 above, in fact format is part_of the knowledge one would want to capture about the content. I wonder if perhaps there's a place for a level of abstraction, expressed in SGML, that is about_format although not necessarily processor specific. It occurs to me (although I must plead a very slender thread of evidence in this regard) that HyTime syntax would lend itself to modelling such abstractions. Since some of HyTime is included in the XML set of specifications, I think this mailing list is a good place to post this question. Cynthia Shern Product Specialist Arbortext, Inc. 1000 Victors Way, Suite 100 Ann Arbor, MI 48108 (734) 997-0200 From ricko at allette.com.au Wed Apr 29 20:52:34 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:00:56 2004 Subject: Is anybody working on ...
Message-ID: <003401bd73a0$31203d80$b00b4ccb@NT.JELLIFFE.COM.AU> From: Cynthia Lenora Shern >Is anybody working on creating an abstraction for preserving formatting >information inside or along with SGML documents, in cases where format >really is part_of the information. Two examples where such an approach >might make sense could be: > > Examples: > 1) K-12 textbooks Pardon my ignorance: what is K-12? > 2) Printed survey forms for large market research firms. The Taiwanese have worked on this problem. They have the particular problem that the only official version of a document is the paper one physically stamped by the boss, so there is a great need to maintain various kinds of page fidelity. When I last heard, they were trialing a twin DTD system. One was a forms design DTD: it basically had lines, boxes, constant text, and named fields. That sets the boilerplate. Then there was a database DTD in which the actual field data is given. The two get linked by simple name referencing. Rick Jelliffe From jtauber at jtauber.com Wed Apr 29 20:54:36 1998 From: jtauber at jtauber.com (James K. Tauber) Date: Mon Jun 7 17:00:56 2004 Subject: Is anybody working on ... In-Reply-To: <98Apr29.142509edt.26888@thicket.arbortext.com> Message-ID: <002e01bd73a0$24004140$bd6118cb@caleb> > Is anybody working on creating an abstraction for preserving formatting > information inside or along with SGML documents, in cases where format > really is part_of the information.
I'm wondering if something like this could be done with the Precision Graphics Markup Language (see http://www.jtauber.com/xml/appl/pgml.html) with out-of-line links asserting relationships between the precise formatting expressed in PGML and the document in a more application-specific (e.g. survey) DTD. James -- James Tauber / jtauber@jtauber.com Perth, Western Australia XML Pages: http://www.jtauber.com/xml/ From eliot at isogen.com Wed Apr 29 21:06:30 1998 From: eliot at isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:00:57 2004 Subject: Is anybody working on ... Message-ID: <3.0.32.19980429140445.00755f0c@postoffice.swbell.net> At 04:53 AM 4/30/98 +1000, Rick Jelliffe wrote: >>Is anybody working on creating an abstraction for preserving formatting >>information inside or along with SGML documents, in cases where format >>really is part_of the information. Two examples where such an approach >>might make sense could be: I think Adobe has claimed that they will be able to preserve the SGML structure inside a PDF object, which would be a reasonable solution for documents where the mapping from the structure to the presentation is fairly straightforward. It makes no sense at all if there's a lot of data processing from input to output because the transform can't be run in reverse. The only possible general solution is to bind a source document to either a complete style spec (e.g., DSSSL spec) or to a formatting program object that encapsulates the formatting.
Any of the current and proposed packaging schemes would handle this, possibly in conjunction with conventional uses of XLinks, e.g.: Document source Style spec Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Apr 29 21:13:30 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:57 2004 Subject: Is anybody working on ... References: <98Apr29.142509edt.26888@thicket.arbortext.com> Message-ID: <35477BDE.13AC2A94@technologist.com> I would think that in the case where the formatting is an intrinsic part of the meaning of a document, it should be represented in the markup just like everything else. XML encourages you to separate formatting from the abstraction, but it does not require you to. As someone else pointed out, Precision Graphics Markup Language is an XML-based language designed explicitly for formatting -- but it is still XML. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Wed Apr 29 21:32:30 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:57 2004 Subject: Layout and Logical Structure in XML (was Re: Is anybody working on ...) 
In-Reply-To: <3.0.32.19980429140445.00755f0c@postoffice.swbell.net> References: <3.0.32.19980429140445.00755f0c@postoffice.swbell.net> Message-ID: <199804291931.PAA02594@unready.microstar.com> [re: preserving formatting with XML] The demands of codicographers and bibliographers are even more extreme than those of market researchers or K-12 textbook readers. When I was doing my analysis of scribal spelling patterns in the major Old English poetic manuscripts, I needed to markup page breaks, line breaks, corrections and erasures, changes in scribal hand, editorial variants, and sometimes even word spacing _in addition to_ the metrical (a verse/b-verse) and logical (text/fitt) structure of the manuscripts. The Text Encoding Initiative has already spent around a decade working on the problem of marking up concurrent structures such as these. Some of their solutions involve features not available in XML 1.0 (especially inclusion exceptions), but it would be well worth while to visit their site, if only to steal some ideas: http://www.uic.edu/orgs/tei/ All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murray at muzmo.com Wed Apr 29 21:35:33 1998 From: murray at muzmo.com (Murray Maloney) Date: Mon Jun 7 17:00:57 2004 Subject: Is anybody working on ... 
In-Reply-To: <35477BDE.13AC2A94@technologist.com> References: <98Apr29.142509edt.26888@thicket.arbortext.com> Message-ID: <3.0.1.32.19980429153226.00707d20@pop.uunet.ca> At 03:13 PM 4/29/98 -0400, Paul Prescod wrote: >I would think that in the case where the formatting is an intrinsic part >of the meaning of a document, it should be represented in the markup just >like everything else. XML encourages you to separate formatting from the >abstraction, but it does not require you to. I don't want to spend much time on this point, but I have to disagree with Paul's assertion. It is not the case that SGML encourages you to "separate formatting from the abstraction". It is absolutely true that some SGML practitioners promote this position. And it is also true that this is an appropriate position to hold in many circumstances. However, neither the SGML standard nor the XML specification "encourage" you to adopt this position. Regards, Murray +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Murray Maloney Email: murray@muzmo.com Technical Director Phone: (905) 509-9120 Veo Systems Fax: (905) 509-8637 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Make a Tax-Deductible Donation Yuri Rubinsky Insight Foundation http://www.yuri.org/donate.html From rakesh at watson.ibm.com Wed Apr 29 22:26:27 1998 From: rakesh at watson.ibm.com (Rakesh Mohan) Date: Mon Jun 7 17:00:57 2004 Subject: XML spec DTD and s/w Message-ID: <000001bd73ad$3a48ade0$10500209@turing2.watson.ibm.com> The XML spec at http://www.w3.org/TR/1998/REC-xml-19980210.xml has At 11:28 PM 4/29/98 -0400, Michael J.
Suzio wrote: >Does anyone know of an XML DTD for bibliographic MARC records? > >Library of Congress notes an SGML DTD effort: >http://www.loc.gov/marc/marcdtd/marcdtdback.html > >I'm curious if this is applicable in an XML context, too (I >haven't looked closely at the DTD itself, and I'm not >as SGML savvy as I am XML savvy, I'm actually *used* to the >more minimal, less-complex XML DTD structure). The MARC SGML DTD would be useful as an architectural DTD. I believe that Eliot Kimber has done some really interesting stuff using a MARC dtd for architectural forms. John john@spinosa.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Apr 30 02:04:10 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:00:57 2004 Subject: XML spec DTD and s/w Message-ID: <3.0.32.19980429170150.00ae1860@pop.intergate.bc.ca> At 04:27 PM 4/29/98 -0400, Rakesh Mohan wrote: >The XML spec at > >http://www.w3.org/TR/1998/REC-xml-19980210.xml > >has >does anyone know a location where spec.dtd can be accessed. > >Also, I would like to know the s/w, process used to convert this xml to the >html >version. Soon to be available at the W3C site (I think). It was converted to HTML using a custom Java formatter that was based on Lark because I wrote it but could be based on anything that constructs any sort of sane document tree. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu Apr 30 02:46:06 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:00:57 2004 Subject: SAX 1.0beta: Three bugs so far Message-ID: <199804300045.UAA00461@unready.microstar.com> *PLEASE* send in your SAX 1.0beta bug reports, documentation corrections, or requests for clarification as soon as possible. So far, aside from several useful documentation corrections and requests for clarification, I have received three reports of what I consider to be SAX bugs rather than feature-change requests. If anyone disagrees with my interpretation of these, please let me know: BUG #1: SAXException extends java.io.IOException This was not part of my final design: it is a relic from an earlier draft, and should have been eliminated. I have corrected SAXException.java so that the SAXException class extends java.lang.Exception. BUG #2: Parser.setLocale takes only one String argument As will quickly become apparent, I am not an expert in localisation. I have discovered that localisation requires both a language code _and_ a country code, so I have changed the interface prototype to public abstract void setLocale (String language, String country) throws SAXException; Does this look correct? Would people prefer that I use the java.util.Locale class? BUG #3: The ParserFactory helper class uses the system property "sax.parser" This is a Java-specific class, so it should follow Java's conventions for system-property names. The property for specifying the default parser class is now "org.xml.sax.parser". 
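[Illustrative sketch, not SAX source code.] The three fixes listed above can be pictured with a small self-contained Java example. Everything named here (SaxBetaFixesSketch, SketchSAXException, SketchParser, EnglishOnlyParser) is hypothetical and exists only for illustration; the only identifier taken from the message is the system property name "org.xml.sax.parser". Both setLocale variants under discussion are shown side by side, since the message leaves the choice open.

```java
import java.util.Locale;

public class SaxBetaFixesSketch {

    // BUG #1: the exception is rooted at java.lang.Exception, not java.io.IOException.
    static class SketchSAXException extends Exception {
        SketchSAXException(String message) {
            super(message);
        }
    }

    // BUG #2: the two candidate signatures (language + country strings, or a Locale).
    interface SketchParser {
        void setLocale(String language, String country) throws SketchSAXException;

        void setLocale(Locale locale) throws SketchSAXException;
    }

    // A hypothetical parser that can only report errors in English.
    static class EnglishOnlyParser implements SketchParser {
        Locale current = Locale.ENGLISH;

        @Override
        public void setLocale(String language, String country) throws SketchSAXException {
            // Build a Locale from the two codes and delegate to the Locale variant.
            setLocale(new Locale.Builder().setLanguage(language).setRegion(country).build());
        }

        @Override
        public void setLocale(Locale locale) throws SketchSAXException {
            if (!"en".equals(locale.getLanguage())) {
                throw new SketchSAXException("Unsupported locale: " + locale);
            }
            current = locale;
        }
    }

    // BUG #3: the default parser class is looked up under a Java-style property name.
    // Returns null when the property has not been set.
    static String defaultParserClassName() {
        return System.getProperty("org.xml.sax.parser");
    }

    public static void main(String[] args) throws SketchSAXException {
        EnglishOnlyParser parser = new EnglishOnlyParser();
        parser.setLocale("en", "CA");
        System.out.println("Locale in use: " + parser.current);
        System.out.println("Default parser class: " + defaultParserClassName());
    }
}
```

A caller would typically set the property on the command line (for example with -Dorg.xml.sax.parser=some.vendor.ParserClass, where the class name is a placeholder) and let a factory instantiate that class by name.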
The most important documentation clarification concerns the InputSource class: I have made it clear that an InputSource object provided by the application belongs to the application, and that the parser should not modify it (by filling in InputStreams or Readers, for example). The parser is, of course, free to create its own InputStream objects for internal use. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ From papresco at technologist.com Thu Apr 30 02:58:28 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:57 2004 Subject: Is anybody working on ... References: <98Apr29.142509edt.26888@thicket.arbortext.com> <3.0.1.32.19980429153226.00707d20@pop.uunet.ca> Message-ID: <3547C8AE.B8E1ED34@technologist.com> Murray Maloney wrote: > > However, neither the SGML standard nor the XML specification > "encourage" you to adopt this position. It is certainly the case that SGML was designed specifically to allow the separation of formatting and abstraction. That's easy to verify. It is also the case, in my opinion, that the features provided in SGML encourage this position in the same way that the features of Java encourage multiplatform, networked development. I also believe that there is non-normative introductory text in the SGML standard to that effect, but I don't have my copy handy right now so I could be wrong on that. Still, it isn't wrong to use Java in a way that is tied to a particular platform nor to use SGML in a way that is tied to a particular formatter.
I think that we agree on that central point. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html From papresco at technologist.com Thu Apr 30 02:58:48 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:57 2004 Subject: Is anybody working on ... References: <002e01bd73a0$24004140$bd6118cb@caleb> Message-ID: <3547C953.A2A4288A@technologist.com> James K. Tauber wrote: > > > Is anybody working on creating an abstraction for preserving formatting > > information inside or along with SGML documents, in cases where format > > really is part_of the information. > > I'm wondering if something like this could be done with the Precision > Graphics Markup Language (see http://www.jtauber.com/xml/appl/pgml.html) > with out-of-line links asserting relationships between the precise > formatting expressed in PGML and the document in a more application-specific > (e.g. survey) DTD. It isn't clear to me why information that the questioner feels is intrinsic to the document should be linked instead of embedded. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization."
- Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From kent at trl.ibm.co.jp Thu Apr 30 03:29:35 1998 From: kent at trl.ibm.co.jp (TAMURA Kent) Date: Mon Jun 7 17:00:57 2004 Subject: SAX 1.0beta: Three bugs so far In-Reply-To: David Megginson's message of "Wed, 29 Apr 1998 20:45:49 -0400" <199804300045.UAA00461@unready.microstar.com> References: <199804300045.UAA00461@unready.microstar.com> Message-ID: <199804300127.KAA26384@ns.trl.ibm.com> > BUG #2: Parser.setLocale takes only one String argument > Does this look correct? Would people prefer that I use the > java.util.Locale class? I think it should be java.util.Locale. -- TAMURA, Kent @ Tokyo Research Laboratory, IBM Japan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From robin at ACADCOMP.SIL.ORG Thu Apr 30 03:33:11 1998 From: robin at ACADCOMP.SIL.ORG (Robin Cover) Date: Mon Jun 7 17:00:57 2004 Subject: separation of formatting... Message-ID: <199804300140.UAA05901@ACADCOMP.SIL.ORG> Re: Date: Wed, 29 Apr 1998 20:41:18 -0400 From: Paul Prescod Murray Maloney wrote: > > However, neither the SGML standard nor the XML specification > "encourage" you to adopt this position. 
It is certainly the case that SGML was designed specifically to allow the separation of formatting and abstraction. That's easy to verify. -------- One of the ironies of the territory, no? We can (do) use SGML languages to define stylesheets, and if we want to be pernicious, we can use SGML to define markup languages that are completely antithetical to the "values" that are implicit in SGML, yet which are not subject to enforcement by SGML, which disclaims interest in semantics. And so, license for subversion is unbounded. So with primitive myths of our origins, and of language as a feature of humanness: the same tongue that is created to bless can curse. In a territory less sublime: we are now almost forced, given post-modern consensus, that the poem escapes the author (-ial intent) and becomes the property of the community, however bitter the lament of protest by the author. I think these are ideas that don't apply, at least so obviously, to programming languages. SGML does (did) have a religious aspect to it, but provided inadequate formal means to guard its dogma. -rcc ------------------------------------------------------------------------- Robin Cover Email: robin@acadcomp.sil.org 6634 Sarah Drive Dallas, TX 75236 USA >>> The SGML/XML Web Page <<< Tel: +1 (972) 296-1783 (h) http://www.sil.org/sgml/sgml.html Tel: +1 (972) 708-7346 (w) FAX: +1 (972) 708-7380 ========================================================================= xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From greyno at mcs.com Thu Apr 30 06:41:30 1998 From: greyno at mcs.com (Gregg Reynolds) Date: Mon Jun 7 17:00:57 2004 Subject: i18n (was Re: Open Standards Processes) References: <006e01bd7281$466bf5f0$7f0b4ccb@NT.JELLIFFE.COM.AU> Message-ID: <3547F361.3FB9@mcs.com> Rick Jelliffe wrote: > > From: Gregg Reynolds > > >If I've misunderstood something I hope somebody will correct me, but if > >I'm not mistaken pretty much everybody involved is from the > >"developed" world, mostly the West. > > Because to participate means there must be the leisure or finance to > do so, and there must be the technological background to do so, and > there must be the techno-cultural self-awareness to do so. All these > are attributes of the center (or North, or West, whatever you call it.) Alas. > > I asked several Thais when line breaks could occur, for example. The > best answer I got was "when it is beautiful". Best possible answer! > The Web is not a library, it is a TV network posing as a library. Ah, but we'll *make* it a library. > > If you are concerned about this, the best approach is to ask them exactly > what they need: I have found an enormous goodwill to the idea of > thorough-going i18n at W3C. Their problem is that they cannot devote > resources to finding out what is needed. So make up a nice couple of > pages of solutions to real problems that you see, and send it off to > Martin Duerst, Jon Bosak and Bert Bos. I am sure they would be > delighted for all input: they are gathering information for CSS3 and XSL. > I'll do my best.
Personally I'd like nothing better than to have a real influence in extending computational rights beyond the developed world, but it is precisely the notion of Western "experts" (or people like me - "I'm not really an expert -but I play one on the internet!") shaping such stuff without "native" input that makes me a little uneasy. But you rightly point out that that's the way of the world. I'm also interested in your boss' view on such issues. Can you put me in touch with him? Actually I think the best thing I could do would be to find the appropriate people and get them involved. > When I started looking at "native language markup" it is interesting that > the only opposition I got, outside Americans, was from Indians. I think that > was because all educated Indians speak English Sorry, I can't resist this one: I think you mean all Indians educated in English. I would hesitate to say that a highly trained Ayurvedic physician without English, or an Indian Imam who has memorized the Quran but never learned English was somehow uneducated. Of course then we get into complex social dynamics of status, power, etc within a society. Which means, now that I think of it, that my ardor for extending computational rights is itself a rather colonialist attitude. After all it is rather like telling a traditional society that they need to be more open, like us, isn't it? Hmmm. Maybe I'll just stick with "*I* want to read and write Arabic on the net!" and leave the Liberation Computer to somebody else. (Prediction: within 10 years we'll see a move to declare computation as a fundamental human right. You read it here first.) Thanks much for your very informative and sensitive response. And I'll definitely take a look at your book The XML and SGML Cookbook. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Thu Apr 30 07:26:47 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:00:57 2004 Subject: Off track References: <006e01bd7281$466bf5f0$7f0b4ccb@NT.JELLIFFE.COM.AU> <3547F361.3FB9@mcs.com> Message-ID: <35480B63.57E7294D@allette.com.au> Gregg Reynolds wrote: > Sorry, I can't resist this one: I think you mean all Indians educated > in English. I would hesitate to say that a highly trained Ayurvedic > physician without English, or an Indian Imam who has memorized the Quran > but never learned English was somehow uneducated. It would be perfectly accurate to say they were uneducated. Just as it would be perfectly accurate for them to say the same about us. > Which means, now that I think of it, that my ardor for extending > computational rights is itself a rather colonialist attitude. XML is a colonialist's manifesto. Although colonialism has never been ranked as any sort of a success in hindsight, a colonialist might just be a frontrunner of the inevitable tide. It might not be an ideal scenario, but it's not going away... -- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Thu Apr 30 11:11:18 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:00:58 2004 Subject: A DTD Generator Message-ID: <005f01bd7418$080dc840$1e09e391@mhklaptop.bra01.icl.co.uk> If you share my distaste for the syntax of DTDs and would rather wash the car than sit down to write one, you may be interested in a little utility I have written called DTDGen. It takes a well-formed XML document and generates a DTD for it. Of course, it has limitations. It is trying to induce general rules from one example. But I've found it useful in a number of contexts: * you can use it to produce a "first cut" DTD which you can then refine by hand * you can construct an XML document that exhibits as many structural features of your document design as possible and then use DTDGen to generate something close to the "true" DTD * you can use it to gain an understanding of the structure of XML documents that have been published without a DTD (an example being the XML specification itself) This is homespun software not a professional product, use it accordingly. You can download it from http://home.iclweb.com/icl2/mhkay/dtdgen.zip You will need an XML parser and SAX 1.0 driver (and of course a Java 1.1 environment). The download includes java source and class files (copy these to your classpath) and an HTML document explaining what it does and what it doesn't. Let me know of any problems. Mike Kay xml-dev: A list for W3C XML Developers. 
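[Illustrative sketch, not DTDGen itself.] The idea of inducing a DTD from a single well-formed example can be shown in a few lines of Java. This is a deliberately crude approximation under stated assumptions: it records which child elements and character data were seen inside each element and emits a loose "(a|b)*" content model, ignoring attributes, ordering, and occurrence counts. The class name DtdSketch and the method infer are hypothetical, and the code uses the later SAX2/JAXP handler API bundled with current JDKs rather than the SAX 1.0 driver the message refers to.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class DtdSketch extends DefaultHandler {
    // Element name -> names of child elements observed inside it (first-seen order).
    private final Map<String, Set<String>> children = new LinkedHashMap<>();
    // Elements that directly contained non-whitespace character data.
    private final Set<String> hasText = new LinkedHashSet<>();
    // Stack of currently open elements.
    private final Deque<String> open = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        children.computeIfAbsent(qName, k -> new LinkedHashSet<>());
        if (!open.isEmpty()) {
            children.get(open.peek()).add(qName);
        }
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty() && !new String(ch, start, length).trim().isEmpty()) {
            hasText.add(open.peek());
        }
    }

    // Parses one well-formed document and returns loose element declarations for it.
    static String infer(String xml) throws Exception {
        DtdSketch handler = new DtdSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Set<String>> entry : handler.children.entrySet()) {
            Set<String> kids = entry.getValue();
            boolean text = handler.hasText.contains(entry.getKey());
            String model;
            if (kids.isEmpty()) {
                model = text ? "(#PCDATA)" : "EMPTY";
            } else if (text) {
                model = "(#PCDATA|" + String.join("|", kids) + ")*";
            } else {
                model = "(" + String.join("|", kids) + ")*";
            }
            out.append("<!ELEMENT ").append(entry.getKey()).append(' ')
               .append(model).append(">\n");
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(infer("<order><item>book</item><paid/></order>"));
    }
}
```

As the message says of the real tool, a result like this is only a "first cut": one example cannot show which children are optional or repeatable, so the declarations need refining by hand.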
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Apr 30 13:32:58 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:58 2004 Subject: Off track References: <006e01bd7281$466bf5f0$7f0b4ccb@NT.JELLIFFE.COM.AU> <3547F361.3FB9@mcs.com> <35480B63.57E7294D@allette.com.au> Message-ID: <35486185.219C9105@technologist.com> Marcus Carr wrote: > > XML is a colonialist's manifesto. Although colonialism has never been ranked as > any sort of a success in hindsight, a colonialist might just be a frontrunner of > the inevitable tide. It might not be an ideal scenario, but it's not going away... Well, now we're getting really off track, but I think that XML does more than any comparable Web specification to allow linguistic minorities to maintain their language in the face of the other, inevitable changes like mobility and cheap communication. XML is not the frontrunner of cultural colonialism -- it is a moderating factor (albeit a small one). Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Apr 30 13:43:08 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:00:58 2004 Subject: separation of formatting... References: <199804300140.UAA05901@ACADCOMP.SIL.ORG> Message-ID: <354863C7.646D3B4A@technologist.com> Robin Cover wrote: > > I think these are ideas that don't apply, at least so obviously, > to programming languages. SGML does (did) have a religious aspect > to it, but provided inadequate formal means to guard its dogma. I don't believe that SGML has a dogma (though many practitioners do). It allows a better way of doing things, but it also explicitly allowed the old way (else, why processing instructions?). If the ISO WG believed completely in the "abstraction not formatting" mantra, then they would never have allowed the Standard Page Description Language to be developed. I think they believe: "abstraction, not processing instructions, when possible." Paul Prescod - http://itrc.uwaterloo.ca/~papresco "Perpetually obsolescing and thus losing all data and programs every 10 years (the current pattern) is no way to run an information economy or a civilization." - Stewart Brand, founder of the Whole Earth Catalog http://www.wired.com/news/news/culture/story/10124.html xml-dev: A list for W3C XML Developers. 
From ak117 at freenet.carleton.ca Thu Apr 30 14:10:59 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:58 2004
Subject: Is anybody working on ...
In-Reply-To: <3547C8AE.B8E1ED34@technologist.com>
References: <98Apr29.142509edt.26888@thicket.arbortext.com> <3.0.1.32.19980429153226.00707d20@pop.uunet.ca> <3547C8AE.B8E1ED34@technologist.com>
Message-ID: <199804300119.VAA01383@unready.microstar.com>

Paul Prescod writes:

> Still, it isn't wrong to use Java in a way that is tied to a
> particular platform nor to use SGML in a way that is tied to a
> particular formatter. I think that we agree on that central point.

I don't think that that's the suggestion (though I agree that it would be a possible application of SGML/XML). Rather, the suggestion is that people may need to encode physical information about a text when that information is required for useful processing, possibly with a wide range of XML processing tools. More controversially, you could say that the formatting information is semantically significant.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/
From SimonStL at classic.msn.com Thu Apr 30 14:25:11 1998
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 17:00:58 2004
Subject: Off track
Message-ID:

>XML is a colonialist's manifesto. Although colonialism has never been ranked as
>any sort of a success in hindsight, a colonialist might just be a frontrunner of
>the inevitable tide. It might not be an ideal scenario, but it's not going away...

Apart from the number of people working on XML who are from assorted empires and settler states, I'm really not sure what on earth you're talking about. I know there are several issues with Unicode (65000 characters isn't enough), but what in XML is so 'colonial'?

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From greyno at mcs.com Thu Apr 30 14:26:24 1998
From: greyno at mcs.com (Gregg Reynolds)
Date: Mon Jun 7 17:00:58 2004
Subject: Off track
References: <006e01bd7281$466bf5f0$7f0b4ccb@NT.JELLIFFE.COM.AU> <3547F361.3FB9@mcs.com> <35480B63.57E7294D@allette.com.au> <35486185.219C9105@technologist.com>
Message-ID: <35485FAD.2BBB@mcs.com>

Paul Prescod wrote:
>
> Marcus Carr wrote:
> >
> > XML is a colonialist's manifesto. Although colonialism has never been ranked as
> > any sort of a success in hindsight, a colonialist might just be a frontrunner of
> > the inevitable tide. It might not be an ideal scenario, but it's not going away...
>
> Well, now we're getting really off track, but I think that XML does more
> than any comparable Web specification to allow linguistic minorities to
> maintain their language in the face of the other, inevitable changes like
> mobility and cheap communication. XML is not the frontrunner of cultural
> colonialism -- it is a moderating factor (albeit a small one).

Indeed, and since I'm the one who got this started let me plead with everyone to please not get us started on the theory and practice of colonialism, fascinating as that may be. I was just wisecracking a little bit. Maybe I should end my emoticon boycott. Naaah.

From ak117 at freenet.carleton.ca Thu Apr 30 14:52:49 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:00:58 2004
Subject: A SAX helper class: multiple document handlers
In-Reply-To: <000301bd7388$326a0f80$1e09e391@mhklaptop.bra01.icl.co.uk>
References: <000301bd7388$326a0f80$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <199804301252.IAA00404@unready.microstar.com>

Michael Kay writes:

> There was discussion during the SAX specification debate of
> the requirement to notify multiple handlers of the same
> event.
>
> I have written a simple (trivial!) handler which does this
> (for document events only) and offer it as a candidate for a
> standard "helper" class.

Thank you, Michael.
I have had to swat my hand away from the keyboard over and over again to avoid writing more helper classes myself. I think that there is room to write some very exciting stuff on top of SAX, either in a single large distribution or in several smaller ones, but I don't think that we should bloat the core distribution any further.

It would be interesting, for example, to have a Beans event interface, like:

  void addStartElementListener (StartElementListener listener);
  void removeStartElementListener (StartElementListener listener);
  void addEndElementListener (EndElementListener listener);
  void removeEndElementListener (EndElementListener listener);

etc. Programmers could take an invisible XML JavaBean, drag it into a BeanBox, and connect listeners using the mouse. Another implementation might allow the user to register objects (or even methods) to be invoked for specific element types.

SAX is meant to provide the minimal bottom end, with only a couple of tiny helper classes to avoid excessive confusion among implementors (I anticipate that the persistence of attribute lists and locators will quickly become a FAQ); however, I strongly encourage other XML-DEV members to start creating new, higher-level interfaces that run on top of SAX, and (if they wish) to collect them into well-packaged distributions.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ht at cogsci.ed.ac.uk Thu Apr 30 14:54:09 1998
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 17:00:58 2004
Subject: Final alpha release of XED: lots of new features
In-Reply-To: "Michael Kay"'s message of Wed, 29 Apr 1998 10:23:36 +0100
References: <000b01bd7350$7d8b48e0$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID:

"Michael Kay" writes:

> - Pasting (Ctrl/V) when text is selected is an error, I
> would expect it to paste the clipboard contents over the
> selected text

This was a bug introduced in the course of supporting pasting from other windows, now fixed.

> - I thought at first that clicking "preferences" had no
> effect, I discovered later it had brought up a window that
> was not visible because it was behind others

Should be better now.

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/

From ray at guiworks.com Thu Apr 30 15:42:46 1998
From: ray at guiworks.com (Ray)
Date: Mon Jun 7 17:00:58 2004
Subject: DOM Confusion
Message-ID: <199804301341.HAA16676@coldsnap.guiworks.com>

In http://www.w3.org/TR/WD-DOM/level-one-xml.html, the XMLNode interface is defined and explained as "the XML implementation of the Node interface adds some methods that are needed to manipulate specific features of XML documents"

The interface is defined as

  interface XMLNode {
    Node getParentXMLNode(in boolean expandEntities);
    NodeIterator getChildXMLNodes(in boolean expandEntities);
    boolean hasChildXMLNodes(in boolean expandEntities);
    Node getFirstXMLChild(in boolean expandEntities);
    Node getPreviousXMLSibling(in boolean expandEntities);
    Node getNextXMLSibling(in boolean expandEntities);
    EntityReference getEntityReference();
    EntityDeclaration getEntityDeclaration();
  };

This basically defines functions identical to the ones in the Node interface in the DOM core, except that now you can choose whether entities are expanded or not (which is cool).

However, I don't completely understand how this fits with the DOM core. Am I to assume that, if I am operating on an XML document, all functions that return a Node interface actually hand me back something that is an XMLNode instead? Indeed, if I get back an Element, it is not Element : Node, but an Element : XMLNode?

My confusion comes from the fact that XMLNode doesn't inherit from Node in the DOM spec. Thus, XMLNode doesn't have functions on it like Node.getFirstChild(), or does it?

Since XMLNode itself defines functions like

  Node getParentXMLNode(in boolean expandEntities);

which return Node, not XMLNode, then either XMLNode does indeed inherit from Node (and the Node object returned is actually an XMLNode), OR a compliant DOM interface must return objects that look like this

  class MyNode implements org.w3c.dom.Node, org.w3c.dom.XMLNode

That way, they can be cast from Node <-> XMLNode, which seems ok, but nowhere in the DOM spec does it say that all Core objects will implement both interfaces if you are operating on XML.

Given that I didn't pay my $5000, I can't really register my complaint on the WG list, but I have two questions:

1) If XMLNode doesn't inherit from Node, and is inherently specialized, why doesn't

  Node getParentXMLNode(in boolean expandEntities);

for instance, return XMLNode? And why not define XMLElement, XMLPI, etc... that all inherit from XMLNode?

2) DOM seems to go from the "general to the specific", which is good; for instance, the core defines PI nodes, but HTML (more specific) has no PI elements.
So why not dispense with XMLNode altogether and make the Node interface implement these functions? Simply define that expandEntities is "always true" for HTML documents, or some such.

3) extra nit: the function names are overly verbose and option #2 would solve it.

From mcc at arbortext.com Thu Apr 30 16:13:36 1998
From: mcc at arbortext.com (Mike Champion)
Date: Mon Jun 7 17:00:58 2004
Subject: DOM Confusion
In-Reply-To: <199804301341.HAA16676@coldsnap.guiworks.com>
Message-ID: <98Apr30.100957edt.26885@thicket.arbortext.com>

At 09:41 AM 4/30/98 -0400, you wrote:
>
>
>My confusion comes from the fact that XMLNode doesn't inherit from
>Node in the DOM spec. Thus, XMLNode doesn't have functions on
>it like Node.getFirstChild(), or does it?

This was a typo (mine!). XMLNode should have been defined to inherit from Node. Sorry for all the confusion this has caused.

>
>So why not dispense with XMLNode altogether and make the Node
>interface implement these functions? Simply define that expandEntities
>is "always true" for HTML documents, or some such.

Something like that is under consideration; I can't be more specific because the methods for doing this have not been nailed down yet.

>
>
>3) extra nit: the function names are overly verbose and option #2
>would solve it.
>

Whimper .... the function names are verbose because of the astonishingly complex web of constraints that they must satisfy. "Brevity" is actually one consideration, but less important than clarity, internal consistency, avoiding clashes with existing APIs in widespread use, compatibility with the name scoping scheme in ECMAScript .... etc.

Mike Champion

From ray at guiworks.com Thu Apr 30 16:29:53 1998
From: ray at guiworks.com (Ray)
Date: Mon Jun 7 17:00:58 2004
Subject: DOM Confusion
In-Reply-To: <98Apr30.100957edt.26885@thicket.arbortext.com> from Mike Champion at "Apr 30, 98 10:09:58 am"
Message-ID: <199804301428.IAA16920@coldsnap.guiworks.com>

> >My confusion comes from the fact that XMLNode doesn't inherit from
> >Node in the DOM spec. Thus, XMLNode doesn't have functions on
> >it like Node.getFirstChild(), or does it?
>
> This was a typo (mine!). XMLNode should have been defined to inherit from
> Node. Sorry for all the confusion this has caused.

Thanks for the correction Mike. The typo also extends to the Java language bindings on the W3C site (and also, I think both IBM XML4j and SAXDOM have this mistake too)

> >So why not dispense with XMLNode altogether and make the Node
> >interface implement these functions? Simply define that expandEntities
> >is "always true" for HTML documents, or some such.
>
> Something like that is under consideration; I can't be more specific
> because the methods for doing this have not been nailed down yet.

No problem. Too bad you guys don't have the luxury of using function overloading. :)

> >3) extra nit: the function names are overly verbose and option #2
> >would solve it.

> Whimper .... the function names are verbose because of the astonishingly
> complex web of constraints that they must satisfy. "Brevity" is actually
> one consideration, but less important than clarity, internal consistency,
> avoiding clashes with existing APIs in widespread use, compatibility with
> the name scoping scheme in ECMAScript .... etc.

I realize that, and I'm glad you guys are taking that into consideration. One of the problems I had on my last project was that I was using Lark, and I wanted to override the Element that Lark used to build trees (I had my reasons for letting Lark build the tree, relating to some third party code I had to interoperate with) with a subclass that implemented swing.tree.TreeNode. However, TreeNode defines

  public abstract Enumeration children()

but Lark's Element defines

  public Vector children()

And as you know, you can't overload based on return types :(

In such cases, Java's inner classes yield a solution, which is to use internal delegate classes

  public class SuperNode implements TreeNode {
    public Node getDOMNode() { ... }
    class DOMAdapter implements Node { .... }
  }

-Ray

From mcc at arbortext.com Thu Apr 30 16:56:11 1998
From: mcc at arbortext.com (Mike Champion)
Date: Mon Jun 7 17:00:58 2004
Subject: DOM Confusion
In-Reply-To: <199804301428.IAA16920@coldsnap.guiworks.com>
References: <98Apr30.100957edt.26885@thicket.arbortext.com>
Message-ID: <98Apr30.105254edt.26886@thicket.arbortext.com>

At 10:28 AM 4/30/98 -0400, Ray wrote:
>> This was a typo (mine!). XMLNode should have been defined to inherit from
>> Node. Sorry for all the confusion this has caused.
>
>Thanks for the correction Mike. The typo also extends to the
>Java language bindings on the W3C site (and also, I think both IBM
>XML4j and SAXDOM have this mistake too)

Something you all might want to know is that the DOM Core and XML API descriptions are written in XML. There are tag sets for class, method, attribute, etc. descriptions that we use to define a single "master" description of each part of the API, and we have a set of scripts that generate the IDL and Java language bindings as well as the HTML "publication" of the spec from the master XML source. [It's actually an extremely cool application that demonstrates the power of XML, IMHO; credit for it goes largely to Gavin Nichol]. This means that the various definitions of the API are sure to be *consistent*, but if the master XML is wrong, they all are wrong.

From paul at arbortext.com Thu Apr 30 17:03:41 1998
From: paul at arbortext.com (Paul Grosso)
Date: Mon Jun 7 17:00:58 2004
Subject: DOM Confusion
Message-ID: <98Apr30.105659edt.26885@thicket.arbortext.com>

At 09:41 1998 04 30 -0400, Ray wrote:
>
>In http://www.w3.org/TR/WD-DOM/level-one-xml.html, ...
>
>However, I don't completely understand how this fits with the DOM core.
>
>. . .
>Given that I didn't pay my $5000, I can't really register my complaint
>on the WG list, but I have two questions: . . .

As very clearly stated on the public DOM page:

Questions, comments, and suggestions about the DOM

Although questions about the DOM may be posted in other forums, it would be best to post them to the public mailing list at www-dom@w3.org.
To subscribe, send mail to www-dom-request@w3.org with the subject "subscribe".

It costs nothing, and your comments are actively solicited, so let's all stop spending energy making snide cracks about elitism, and join the party and help.

From David.Brownell at Eng.Sun.COM Thu Apr 30 17:17:36 1998
From: David.Brownell at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:00:58 2004
Subject: DOM Confusion
Message-ID: <199804301514.IAA23641@argon.eng.sun.com>

> Something you all might want to know is that the DOM Core and XML API
> descriptions are written in XML.

I like it! Could the next draft update the HTML portion to work that way? That'd help make it work more like one specification, and address a few of the biggest problems I perceive in that part of the spec.

> This means that the various
> definitions of the API are sure to be *consistent*, but if the master XML
> is wrong, they all are wrong.

As shown by the exception non-declarations ... I thought I detected the actions of some automated tool! :-)

- Dave

From digitome at iol.ie Thu Apr 30 17:55:28 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 17:00:58 2004
Subject: Separation of formatting...
Message-ID: <199804301555.QAA00020@GPO.iol.ie>

The "separation of formatting..." mantra is a big part of SGML/XML obviously. However, it works on a number of levels.

Here is a piece of data marked up three ways:-

Version 1 : Purely Formatting Mindset (RTF)
"{\i Customer} Joe Bloggs \par"

Version 2 : SGML - Generic Markup

CustomerJoe Bloggs

Version 3 : SGML - Data Modelling
Joe Bloggs

I think for some people, SGML is all about Version 2 above. Entire books have been written that use SGML to abstract the concepts of "paragraph", "artwork" etc. from the typographic codes required to achieve the result. (The DTD in Annex E of the handbook is an example of this sort of mindset).

Somewhere along the line, people started thinking as in version 3 above. I have no idea when this started to happen. Anyone out there know?

Sean Mc Grath
http://www.digitome.com/sean.htm
County Sligo, Ireland, Tel: +353 96 47391

From cbullard at hiwaay.net Thu Apr 30 18:50:24 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:00:58 2004
Subject: Separation of formatting...
References: <199804301555.QAA00020@GPO.iol.ie>
Message-ID: <3548AB44.2439@hiwaay.net>

Sean Mc Grath wrote:
>
> The "separation of formatting..."
> mantra is a big part of SGML/XML obviously.

Well, no. It is a big mantra often recited by folks who apply SGML. SGML does not care. It is important to keep that distinction alive because otherwise one sinks into the "holiness of the page metaphor" debates. Those debates usually end in a discussion of target granularity and searching requirements. Pixel/raster memory does not care either.

> However, it works on a number of levels.
>
> Here is a piece of data marked up three ways:-
>
> Version 1 : Purely Formatting Mindset (RTF)
> "{\i Customer} Joe Bloggs \par"
>
> Version 2 : SGML - Generic Markup
>
> CustomerJoe Bloggs
>
> Version 3 : SGML - Data Modelling
> Joe Bloggs

> Somewhere along the line, people started thinking
> as in version 3 above. I have no idea when this
> started to happen. Anyone out there know?

The first large application of it that I remember was the US DOD Content Data Model for IETMs. We did similar things in the US Navy CASS application. I used it for the US Army IADS DTDs after the CASS program moved on. In many cases, it was to help the author use the editor more precisely by indicating in the menus what the precise content had to be.

As we used this approach, we found that it helped with searching and target granularity and had excellent lifecycle characteristics, i.e., less information is lost when one does not have to archive a down-translated chunk of information. It suffers when one has to aggregate data from multiple sources without namespaces under a root. For that, we have traditionally applied switch tags in the DTD, and while that works, it won't work automatically, which I believe is the requirement for namespaces.

len

From M.H.Kay at eng.icl.co.uk Thu Apr 30 18:52:50 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:00:59 2004
Subject: A SAX helper class: multiple document handlers
Message-ID: <008001bd744b$f2d29c60$1e09e391@mhklaptop.bra01.icl.co.uk>

>Another implementation might allow the user to register objects (or
>even methods) to be invoked for specific element types.
I have been making heavy use of a layer of code that does just that; I am busy right now rewriting it for SAX 1.0 (and to fix all the things I got wrong first time round) and hope to publish it asap. This is certainly a quite different animal from SAX and belongs as a separate layer on top.

Mike

From ray at guiworks.com Thu Apr 30 18:54:18 1998
From: ray at guiworks.com (Ray)
Date: Mon Jun 7 17:00:59 2004
Subject: DOM Confusion
In-Reply-To: <98Apr30.100957edt.26885@thicket.arbortext.com> from Mike Champion at "Apr 30, 98 10:09:58 am"
Message-ID: <199804301651.KAA18314@coldsnap.guiworks.com>

Another DOM question...

The Element interface is defined to inherit from Node.

XMLNode is defined to inherit from Node.

However, if one is operating on an XML document, is the Element interface assumed to inherit from XMLNode?

We seem to have the case

(DOM Core Mode)
  interface Element : Node
  interface Text : Node

(DOM XML Mode)
  interface Element : XMLNode
  interface Text : XMLNode
  ....
  interface XMLNode : Node

It seems like there is a conflict? The problem is, there are two object hierarchies, with identical names and packages (org.w3c.dom), but which have different base classes.

Multiple inheritance would work if XMLNode didn't extend Node, as in

  public interface Element extends Node, XMLNode

But if it does, then all we need is

  public interface Element extends XMLNode

which to me indicates that XMLNode is best done away with and subsumed into Node.

Regards,
-Ray
From mcc at arbortext.com Thu Apr 30 19:36:38 1998
From: mcc at arbortext.com (Mike Champion)
Date: Mon Jun 7 17:00:59 2004
Subject: DOM Confusion
In-Reply-To: <199804301651.KAA18314@coldsnap.guiworks.com>
References: <98Apr30.100957edt.26885@thicket.arbortext.com>
Message-ID: <98Apr30.133323edt.26883@thicket.arbortext.com>

At 12:51 PM 4/30/98 -0400, Ray wrote:
>
>Another DOM question...
>
>
>The Element interface is defined to inherit from Node
>
>XMLNode is defined to inherit from Node.
>
>However, if one is operating on an XML document, is the Element
>interface assumed to inherit from XMLNode?
> ...

Good point. I'll make sure that the DOM WG considers this. But please note that there is a public mailing list www-dom@w3.org that is a better place for detailed discussions of the DOM spec; this message has been forwarded to that list, and you might want to subscribe and continue the discussion there.

Thanks,
Mike Champion

From David.Brownell at Eng.Sun.COM Thu Apr 30 20:13:14 1998
From: David.Brownell at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:00:59 2004
Subject: DOM Confusion
Message-ID: <199804301811.LAA23782@argon.eng.sun.com>

> Multiple inheritance would work if ...
In the system being used to define these interfaces, OMG-IDL, multiple inheritance of interfaces (types) is just fine; there is no "if" required! It's the same as in Java, or with C++ virtual public inheritance:

  interface Element : Node { ... }
  interface XMLNode : Node { ... }
  interface XMLElement : XMLNode, Element { ... }

Not that there's an "XMLElement" interface defined now.

Don't confuse this with the "single implementation inheritance" rule that most systems (other than C++) follow.

Now, as for the ECMA Script bindings ... I'm not an expert in how the DOM WG is mapping IDL to ECMA Script.

- Dave

From crism at ora.com Thu Apr 30 23:42:47 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun 7 17:00:59 2004
Subject: SAX 1.0beta: Three bugs so far
In-Reply-To: <199804300045.UAA00461@unready.microstar.com> (message from David Megginson on Wed, 29 Apr 1998 20:45:49 -0400)
Message-ID: <199804302141.RAA09755@ruby.ora.com>

[David Megginson]
> BUG #2: Parser.setLocale takes only one String argument
>
> As will quickly become apparent, I am not an expert in
> localisation. I have discovered that localisation requires both
> a language code _and_ a country code, so I have changed the
> interface prototype to
>
> public abstract void setLocale (String language, String country)
> throws SAXException;
>
> Does this look correct? Would people prefer that I use the
> java.util.Locale class?

I think a single string, or unspecified parts, would be better. XML allows RFC 1766 language identifiers, which can include i-cherokee and x-klingon.
The language-country form is only one class of valid language identifier.

-Chris
--
http://www.oreilly.com/people/staff/crism/
+1.617.499.7487
90 Sherman Street, Cambridge, MA 02140 USA
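
To make the point about RFC 1766 identifiers concrete, here is a minimal Java sketch. It assumes only the RFC 1766 tag shape (a primary tag followed by zero or more subtags separated by "-", each part 1 to 8 ASCII letters). The class name LanguageTag and its members are hypothetical names chosen for illustration; they are not part of SAX or of any parser discussed in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not part of SAX): splits an RFC 1766 language tag. */
final class LanguageTag {
    final String primary;        // e.g. "en", "i", "x"
    final List<String> subtags;  // e.g. ["US"], ["cherokee"], or empty

    private LanguageTag(String primary, List<String> subtags) {
        this.primary = primary;
        this.subtags = Collections.unmodifiableList(subtags);
    }

    /** Parses Primary-tag *( "-" Subtag ); every part must be 1 to 8 ASCII letters. */
    static LanguageTag parse(String tag) {
        // A limit of -1 keeps trailing empty parts, so a tag like "en-" is rejected below.
        String[] parts = tag.split("-", -1);
        List<String> subtags = new ArrayList<>();
        for (int i = 0; i < parts.length; i++) {
            if (!parts[i].matches("[A-Za-z]{1,8}")) {
                throw new IllegalArgumentException("Not an RFC 1766 tag: " + tag);
            }
            if (i > 0) {
                subtags.add(parts[i]);
            }
        }
        return new LanguageTag(parts[0], subtags);
    }

    /** True only for the two-letter language plus two-letter country shape. */
    boolean isLanguageCountry() {
        return primary.length() == 2
            && subtags.size() == 1
            && subtags.get(0).length() == 2;
    }

    public static void main(String[] args) {
        // Report, for a few sample tags, whether they fit a (language, country) pair.
        for (String tag : new String[] {"en-US", "fr", "i-cherokee", "x-klingon"}) {
            LanguageTag t = LanguageTag.parse(tag);
            System.out.println(tag + " -> language-country form: " + t.isLanguageCountry());
        }
    }
}
```

Under these assumptions, tags such as i-cherokee and x-klingon parse as valid but do not fit a two-argument (language, country) signature, which is the case a single-string parameter would cover.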