.... the list goes on for ever

I would MUCH prefer to see when a specific tag ends rather than just seeing an end. I often look through files which were marked up in short tag (proprietary format) and I will see 20 's in a row. Which are still open? When debugging a file that was incorrectly tagged and doesn't parse there is NO way to tell which tags are open and which are closed! If I spend 30 mins going through and attempting to find which elements were closed and which ones were missed it is hardly an easy process.

Long tags make debugging/proofreading much less laborious and I would contend that most people can read and process a long end tag as quickly, if not more quickly than a short tag.

This text is more readable than the next example, and is easier for a human to process and follow.

This text is LESS readable than the previous example, and is easier for a human to process and follow.

In the latter case the human has to look back and see which tags are open in order to understand which are being closed. In the former, it is clear which tags are being opened and which are being closed.

Once again, sorry for the HTML format I was using, I didn't even realize until Peter pointed it out... a number of times. :)

Michael Alaly
alaly@inlink.com | 314.878.6474

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From peter at ursus.demon.co.uk Thu May 14 00:04:04 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:19 2004
Subject: WD of XSL requirements released
In-Reply-To: <199805131921.MAA26860@boethius.eng.sun.com>
Message-ID: <3.0.1.16.19980513230042.5fe7a896@pop3.demon.co.uk>

At 12:21 13/05/98 -0700, Jon Bosak wrote:
>The first working draft of the requirements for XSL has just been made
>public:
>
> http://www.w3.org/TR/1998/WD-XSLReq-19980511 [amended]

This is impressive! There is clearly a great deal of work to be done and we wish the editors and their support group a lot of energy and good humour.

Please do *NOT* discuss this document on XML-DEV; I assume that xsl-list will give pointers as to how feedback can be obtained. I can see that in a few cases (e.g. interactivity, objects) discussion on XML-DEV (and, better, prototypes :-) might be useful - but we should wait to be asked in the first instance.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From kent at trl.ibm.co.jp Thu May 14 02:00:53 1998
From: kent at trl.ibm.co.jp (TAMURA Kent)
Date: Mon Jun 7 17:01:19 2004
Subject: IBM XML for Java has been updated.
Message-ID: <199805132359.IAA15499@ns.trl.ibm.com>

XML for Java, an XML processor in Java, has been updated.

http://www.alphaworks.ibm.com/formula/xml

Changes: 16-Apr-1998 to 13-May-1998
  o Updated DOM spec. [16-Apr-1998]
  o Supports SAX 1.0
  o Supports UTF-16 encoding
  o New factories: com.ibm.xml.parser.util.TreeFactory
    and com.ibm.xml.parser.util.XHFactory
    Added new sample: TreeView for TreeFactory
  o Restructured some methods in TXElement
  o Subclasses of Child request some jobs to ElementFactory
  o And fixed some bugs

--
TAMURA, Kent @ Tokyo Research Laboratory, IBM Japan

From jjc at jclark.com Thu May 14 07:15:52 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:01:19 2004
Subject: XMLTest updated
Message-ID: <355A75BE.D306F160@jclark.com>

I've updated my XMLTest program (http://www.jclark.com/xml/XMLTest.java) to support SAX 1.0. XMLTest uses SAX to generate canonical XML, allowing any Java XML parser with a SAX driver to be tested with any XML test suite that includes canonical XML output for its test cases.

James

From jjc at jclark.com Thu May 14 07:15:58 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:01:19 2004
Subject: Expat updated
Message-ID: <355A7AE8.DAA82112@jclark.com>

A new version of expat, my XML parser in C, is now available. See

http://www.jclark.com/expat.html

for more information.

This version adds support for parsing external general entities. There's a new -x option on xmlwf that enables this.

James

From jjc at jclark.com Thu May 14 07:16:08 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:01:19 2004
Subject: XP 0.3 available
Message-ID: <355A76B7.4A73038@jclark.com>

XP 0.3 is now available. See

http://www.jclark.com/xml/xp/index.html

for more information.

Apart from bug fixes, the changes in this release are:

- Support for SAX 1.0.
- Efficient support for large CDATA sections (previously these were not handled very efficiently).
- New approach to exceptions in the interface.

If you have code using XP 0.2, you can get it to work with XP 0.3 just by changing it to import Application, ApplicationImpl, Parser and ParserImpl from com.jclark.xml.parse.io instead of com.jclark.xml.parse.

James

From gotou at flab.fujitsu.co.jp Thu May 14 07:24:03 1998
From: gotou at flab.fujitsu.co.jp (Masatomo Goto)
Date: Mon Jun 7 17:01:19 2004
Subject: Xlink semantics
In-Reply-To: <3.0.1.16.19980513204547.5bdf6c46@pop3.demon.co.uk>
Message-ID: <199805140517.OAA28951@kmailserv.akashi.flab.fujitsu.co.jp>

Hello,

I'm now working on design and implementation of a XLink facility.
My approach is to provide the XLink facility as an engine.

At 20:45 98/05/13, Peter Murray-Rust wrote:
> I have been hacking an application (the VHG DTD) using Xlink and I'd like
> to check on some semantics. Since the spec has contentSpecs of ANY for all
> three link-types the formal situation is that anything is permitted. Should
> my software complain at the following examples (notation should be
> obvious), or should it try to do something clever?

In my Xlink engine, these examples are recognized as follows:

> > > > >
 - One extended link which has no or only inline resource (depends on inline attribute value)
 - Two simple links which are in the extended link content

> ... > > > > > >
 - One extended link which has two or three(inline) resources.
 - One simple link which is in the extended link content

> ... > > > >
 - One extended link which has one or two (inline) resources.

> ... > > > > > > > > > > >
 - One extended link which has no or only inline resources
 - two extended links which have two or three (inline) resources in the parent extended link's content

> ... > > > > > > ...
 - no meaning. no processing.

> The problem is that all of these throw no error in the parser as they are
> probably impossible to constrain except in very spartan DTDs.
> I suspect most are not productive, but some might be valuable on occasions I have or
> haven't thought of.

If the XLink processing facilities are separated from the application,
It is possible to throw some errors from the "XLink processor".

> This is an important occasion that there is a clear requirement for
> applications to apply semantics to parts of one of the specs. We already
> have to write an attribute processor and I'm interested in knowing how much
> additional processing any conforming Xlink software is going to have to do.

FYI, I will give a speech and demonstration about my XLink engine in
the HyTime at work session of SGML/XML Europe '98.

Best regards,
- Masatomo Goto

---
Masatomo Goto
Information Service Architecture lab.
Fujitsu Laboratories Ltd.
Tel: +81-78-934-8249 Fax: +81-78-934-3312

From jjc at jclark.com Thu May 14 08:43:56 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:01:19 2004
Subject: Expat updated
References: <355A7AE8.DAA82112@jclark.com>
Message-ID: <355A9130.17D3D6F7@jclark.com>

James Clark wrote:
>
> A new version of expat, my XML parser in C, is now available. See
>
> http://www.jclark.com/expat.html
>
> for more information.

Sorry, that should have been:

http://www.jclark.com/xml/expat.html

James

From peter at ursus.demon.co.uk Thu May 14 09:16:52 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:19 2004
Subject: XP 0.3 available
In-Reply-To: <355A76B7.4A73038@jclark.com>
Message-ID: <3.0.1.16.19980514075908.5bb7db9e@pop3.demon.co.uk>

At 11:44 14/05/98 +0700, James Clark wrote:
>XP 0.3 is now available. See
>
> http://www.jclark.com/xml/xp/index.html
>
>for more information.
>
>Apart from bug fixes, the changes in this release are:
>
>- Support for SAX 1.0.
>

Many thanks. Astute readers will have seen that this is just one of three announcements today of updated software from James, and the XML-DEV community owes him an enormous debt for his commitment.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Thu May 14 09:25:36 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:19 2004
Subject: Xlink semantics
In-Reply-To: <199805140517.OAA28951@kmailserv.akashi.flab.fujitsu.co.jp>
References: <3.0.1.16.19980513204547.5bdf6c46@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980514080952.514748b8@pop3.demon.co.uk>

At 14:20 14/05/98 +0900, Masatomo Goto wrote:
>
>Hello,
>
>I'm now working on design and implementation of a XLink facility.
>My approach is to provide the XLink facility as an engine.

This is wonderful! From time to time I have suggested such an engine on XML-DEV and it is great to see someone as well advanced as this. Do you plan to make this available? If not, are you able to publish the semantics?

[... examples and replies snipped ...]

This was very helpful. From your replies, you appear to attach some meaning to all except the last (which I think we would all agree was a semantic error). I'd be very interested to know if other Xlink specialists (including the authors?) take the same view as you.

>If the XLink processing facilities are separated from the application,
>It is possible to throw some errors from the "XLink processor".

I agree with this strategy. I have been campaigning for an XLink processor :-)

>
>> This is an important occasion that there is a clear requirement for
>> applications to apply semantics to parts of one of the specs. We already
>> have to write an attribute processor and I'm interested in knowing how much
>> additional processing any conforming Xlink software is going to have to do.
>
>FYI, I will give a speech and demonstration about my XLink engine in
>the HyTime at work session of SGML/XML Europe '98.

I have recently met a good fairy which means that I shall be at XML98 for part of the time (probably Tuesday - Thursday). It would be nice to meet.

P.
>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Thu May 14 09:27:53 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:19 2004
Subject: IBM XML for Java has been updated.
In-Reply-To: <199805132359.IAA15499@ns.trl.ibm.com>
Message-ID: <3.0.1.16.19980514075624.0ac78214@pop3.demon.co.uk>

At 08:59 14/05/98 +0900, TAMURA Kent wrote:
>
>XML for Java, an XML processor in Java, has been updated.
>
>http://www.alphaworks.ibm.com/formula/xml
>
>
>Changes: 16-Apr-1998 to 13-May-1998
> o Updated DOM spec. [16-Apr-1998]
> o Supports SAX 1.0
> o Supports UTF-16 encoding
> o New factories: com.ibm.xml.parser.util.TreeFactory
> and com.ibm.xml.parser.util.XHFactory
> Added new sample: TreeView for TreeFactory
> o Restructured some methods in TXElement
> o Subclasses of Child request some jobs to ElementFactory
> o And fixed some bugs
>
>--
>TAMURA, Kent @ Tokyo Research Laboratory, IBM Japan
>
>

Many thanks - both for making your software SAX-compliant and for a very well presented posting with exactly the right amount of information.
I have not snipped it so that others can see the volume of information that is sufficient for most readers to decide whether they wish to download a README or the whole software.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From rbourret at dvs1.informatik.tu-darmstadt.de Thu May 14 09:44:21 1998
From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret)
Date: Mon Jun 7 17:01:19 2004
Subject: EDI to XML
Message-ID: <199805140737.JAA21449@berlin.dvs1.tu-darmstadt.de>

See http://www.geocities.com/WallStreet/Floor/5815/

> I am new to the XML world, but do lot of EDI-VAN based transactions
> with our trading community & in a process of defining long-term
> strategy to do these transactions over the Web/internet.
>
> I have started hearing that XML is a replacement of EDI over the Web.
> could anyone suggest me, how XML would be used to replace EDI in terms
> of what all componenets/tools, steps to be taken in developers &
> planners point of view. Is that some product available in the market
> which could be used now etc.
>
> any comment would be appreciated.
>
> -Arun V.

From M.H.Kay at eng.icl.co.uk Thu May 14 12:37:29 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:01:20 2004
Subject: White space after last element
Message-ID: <000a01bd7f24$02de1160$1e09e391@mhklaptop.bra01.icl.co.uk>

I just tested SAXON with the new version of IBM's xml4j parser: it crashes because SAXON doesn't expect white space to be reported after the end of the outermost element. I'll fix SAXON so it doesn't crash, but it raises a wider point, because we now have several SAX-conformant parsers reporting white space differently, and I'm not sure whether the spec says clearly which of them is right (perhaps they all are).

Any comments from the experts?

Mike

From tms at ansa.co.uk Thu May 14 13:16:02 1998
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun 7 17:01:20 2004
Subject: A little wish for short end tags
In-Reply-To: Paul Prescod's message of "Wed, 13 May 1998 07:51:00 -0400"
References: <35586A0E.4A785960@mecom.mixx.de> <35598924.AC686153@technologist.com>
Message-ID:

Paul> Paul Prescod

Paul> I do not believe that there is any way to implement a legal XML
Paul> parser without keeping around all of the information required to
Paul> implement short end tags.
Paul> Checking that an end-tag matches its
Paul> start-tag (the current situation) is no easier than not checking.

But there are plenty of (non-parsing) applications that benefit from XML standard end-tags. An obvious one is selection of an element from a document; a regexp search for the start-tag, and then just match start and end tags *for that element type*, keeping track of depth *for that element type* (we don't even need to do that if the element type is known not to be nestable in itself). That application need not even notice tags for other element types.

--

From M.H.Kay at eng.icl.co.uk Thu May 14 13:56:26 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:01:20 2004
Subject: Parser benchmark result
Message-ID: <004e01bd7f2f$758542a0$1e09e391@mhklaptop.bra01.icl.co.uk>

At the risk of making myself unpopular by looking gift horses in their mouths, I report the results of running a simple SAX 1.0 application with five different parsers.

They were all run using the default SAXON Renderer application (which simply reconstitutes the XML input file supplied) against an XML file containing 291089 bytes, or 11500 elements, each on the same machine (a reasonably powerful Windows NT server, with the SUN Java VM), which was otherwise idle (as far as one can tell). I measured elapsed time by calling java.util.Date#getTime() at the start and end of the run. Each was run twice, I report both results.
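The measurement method described above (read the clock before and after a run, report the difference) can be sketched roughly as follows. This is only an illustration: it uses the JAXP `SAXParserFactory` and `DefaultHandler` found in current JDKs rather than any of the SAX 1.0 drivers benchmarked in this thread, and the class name `ParseTimer`, its helper methods, and the synthetic test document are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.Date;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical timing harness; not the SAXON Renderer used in the original test.
class ParseTimer {

    /** Builds a synthetic document: one root element holding n child elements. */
    static String buildDocument(int n) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < n; i++) {
            sb.append("<item>").append(i).append("</item>\n");
        }
        return sb.append("</root>").toString();
    }

    /** Parses the document with the JDK's default SAX parser and counts start tags. */
    static int countElements(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    /** Returns elapsed wall-clock milliseconds for one parse, via java.util.Date#getTime(). */
    static long timeParse(String xml) throws Exception {
        long start = new Date().getTime();
        countElements(xml);
        return new Date().getTime() - start;
    }

    public static void main(String[] args) throws Exception {
        // 11499 items plus the root gives 11500 elements (an assumed stand-in document).
        String xml = buildDocument(11499);
        for (int run = 1; run <= 2; run++) {
            System.out.println("run " + run + ": " + timeParse(xml) + " ms");
        }
    }
}
```

Timings from a sketch like this depend heavily on JVM warm-up and on the machine. The figures that follow in the message are the poster's own measurements and were not produced by this code.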
Elapsed time in milliseconds:

  AElfred:  8203,  8215
  Lark:    10422, 10453
  MSXML:   13250, 12250
  xml4j:   18125, 18313
  xp:       8156,  7907

These were obtained after some performance tuning on SAXON: the issued version takes about twice as long.

Of course, there are many attributes to a piece of software other than its raw speed, and the results cannot necessarily be extrapolated to a different application or a different data file.

The really good news is that (with the one caveat noted in an earlier message) the application worked unchanged on all five parsers.

Mike Kay, ICL

From ak117 at freenet.carleton.ca Thu May 14 14:42:49 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:01:20 2004
Subject: SAX: White space after last element
In-Reply-To: <000a01bd7f24$02de1160$1e09e391@mhklaptop.bra01.icl.co.uk>
References: <000a01bd7f24$02de1160$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <199805141242.IAA00235@unready.microstar.com>

Michael Kay writes:
 > I just tested SAXON with the new version of IBM's xml4j
 > parser: it crashes because SAXON doesn't expect white space
 > to be reported after the end of the outermost element. I'll
 > fix SAXON so it doesn't crash, but it raises a wider point,
 > because we now have several SAX-conformant parsers reporting
 > white space differently, and I'm not sure whether the spec
 > says clearly which of them is right (perhaps they all are).

The JavaDoc comment for DocumentHandler.ignorableWhitespace begins with the line

  Receive notification of ignorable whitespace in element content.
                                               ^^^^^^^^^^^^^^^^^^

In other words, the method should be invoked only between the first startElement event and the last endElement event (and even then, only when there is a DTD and the parent element type is declared with element-only content).

SAX is very new, and I am very tired, after a difficult week (of which SAX 1.0 was only a tiny part). When I am better rested, I will take some time to test the different SAX implementations and to work with the authors to resolve any problems, or to take suggestions for clarifying the interface.

While Java interfaces and JavaDoc comments are useful, they are no substitute for a proper written specification, which I owe to all of you as soon as I can manage it.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From amitr at abinfosys.com Thu May 14 14:53:05 1998
From: amitr at abinfosys.com (Amit Rekhi)
Date: Mon Jun 7 17:01:20 2004
Subject: Accessing default attributes from DTD
Message-ID: <3.0.5.32.19980514145544.00972d20@192.168.1.1>

Hello All,

I want to access default attributes of elements of an XML file (say A.xml) defined in A.xml's DTD (A.dtd) in an HTML file (say A.html) which invokes the MSXML java parser and loads the A.xml file.

Let me explain my scenario

The A.dtd contains :-
.......

.........

The A.xml file contains :-
..... .... ......

In my A.html file I want to access 's default attribute ATTR-1 (defined in A.dtd) after I parse the A.xml file into a tree using MSXML java parser.

How would I do that???

Any suggestions would help,
Thanx in Advance,
AMIT

From larsga at ifi.uio.no Thu May 14 15:30:55 1998
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:01:20 2004
Subject: Accessing default attributes from DTD
In-Reply-To: <3.0.5.32.19980514145544.00972d20@192.168.1.1>
References: <3.0.5.32.19980514145544.00972d20@192.168.1.1>
Message-ID:

* Amit Rekhi
|
| The A.dtd contains :-
| .......
|
| .........
|
| In my A.html file I want to access 's default attribute ATTR-1
| (defined in A.dtd) after I parse the A.xml file into a tree using
| MSXML java parser.
|
| How would I do that???

A conforming XML parser that reads the DTD will behave as if the ATTR-1 attribute had been present on every TEMP element, so you don't need access to the DTD or any other kind of magic.

If MSXML doesn't do this I suggest you use a parser that does. You can find a list of parsers at

(Also note that XML doesn't have a "NUMBER" attribute type. You'll have to use CDATA.)

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/
http://birk105.studby.uio.no/

From ak117 at freenet.carleton.ca Thu May 14 15:49:45 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:01:20 2004
Subject: SAX 1.0: Parsers and Applications
Message-ID: <199805141349.JAA00505@unready.microstar.com>

I've updated the Parsers and Applications page for SAX 1.0, after the flurry of announcements over the past couple of days (most of which only just arrived because of problems with my POP server). You will find the page at the following URL:

http://www.megginson.com/SAX/applications.html

Note that AElfred does not appear there yet because it has not been updated from SAX 1.0gamma -- only minor changes will be necessary, though.

Thanks to everyone for the support.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From jriedese at jnana.com Thu May 14 16:05:46 1998
From: jriedese at jnana.com (Joel Riedesel)
Date: Mon Jun 7 17:01:20 2004
Subject: SAX, DTDs, parsing...
In-Reply-To: <199805141349.JAA00505@unready.microstar.com>
Message-ID: <005201bd7f41$244a1ac0$0132fac1@ipusushi>

Please excuse my ignorance; I'm just starting to dive into code for XML stuff...

Would someone direct me to a good overview of SAX?
Also, I'm looking for a decent java validating XML parser that lets me get at the DTD for parsed XML files. Any suggestions? As far as I can tell, it looks as the only thing close is Lark or DXP, is that correct?

Thanks.

---
Joel Riedesel
Jnana Technologies Corporation
http://www.jnana.com
mailto:jriedese@jnana.com
303 805 8275

From digitome at iol.ie Thu May 14 16:20:57 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 17:01:20 2004
Subject: NFF - Notes Flat File Format Initiative + Freely available software
Message-ID: <199805141419.PAA09759@GPO.iol.ie>

Announcement
============

A New XML Application Initiative : Notes Flat File Format (NFF)

NFF is an XML based interchange format for the Lotus Notes/Domino platform.

The DTD, basic documentation, sample + software are available for download at :

http://www.digitome.com/download.htm

NFF is an XML based application intended to act as an interchange format for the Lotus Notes/Domino Publishing platform. The NFF DTD supports the majority of the constructs that can occur in Lotus Notes data such as structured fields, rich text, doclinks, import objects and so on.

Once data is in XML conforming to the NFF DTD it can be imported into Lotus Notes using a simple "File-Import".

The download package includes the necessary software along with a sample application - Timon Of Athens by William Shakespeare in NFF format.

The software is freely available and comes with absolutely no warranty whatsoever.

All constructive feedback welcome!
Sean Mc Grath
http://www.digitome.com/sean.htm
County Sligo, Ireland, Tel: +353 96 47391

From rbourret at dvs1.informatik.tu-darmstadt.de Thu May 14 17:20:42 1998
From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret)
Date: Mon Jun 7 17:01:20 2004
Subject: SAX, DTDs, parsing...
Message-ID: <199805141418.QAA24119@berlin.dvs1.tu-darmstadt.de>

MSXML is a validating parser written in Java. You can get at the DTD, but only as XML-Data elements. Of course, they ship the code, so you could probably go straight to their internal structures.

-- Ron Bourret

> Also, I'm looking for a decent java validating XML parser that
> lets me get at the DTD for parsed XML files. Any suggestions?
> As far as I can tell, it looks as the only thing close is
> Lark or DXP, is that correct?

From jriedese at jnana.com Thu May 14 17:37:35 1998
From: jriedese at jnana.com (Joel Riedesel)
Date: Mon Jun 7 17:01:20 2004
Subject: SAX, DTDs, parsing...
In-Reply-To: <199805141418.QAA24119@berlin.dvs1.tu-darmstadt.de>
Message-ID: <005801bd7f4d$e1b4e520$0132fac1@ipusushi>

I didn't find it that simple to get at the DTD data. And, it appeared to me that the regular expression parts of DTD elements weren't really accessible in any clean API fashion.
Yes, I can muck with the source, but I'd rather not; especially since I thought I've seen reports of bugginess on the part of MSXML - I don't want to have to start maintaining it in order to do what I want to do. Ideally, I'd like to throw a DTD file at a validating XML parser, in addition to a normal XML file which references an external DTD; and in both cases, get details of the DTD out of the parser. (Lots and lots of details!) I also noticed, that with MSXML, if the DTD is external to the XML; you can't get to the DTD elements from the parsed document. That 'node' isn't available to them main tree once parsing is done. I found that to be a bit annoying too. > > MSXML is a validating parser written in Java. You can get at > the DTD, but only > as XML-Data elements. Of course, they ship the code, so you > could probably go > straight to their internal structures. > > -- Ron Bourret > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Thu May 14 18:03:33 1998 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 17:01:20 2004 Subject: SAX, DTDs, parsing... In-Reply-To: <005201bd7f41$244a1ac0$0132fac1@ipusushi> References: <199805141349.JAA00505@unready.microstar.com> <005201bd7f41$244a1ac0$0132fac1@ipusushi> Message-ID: <199805141602.MAA00202@unready.microstar.com> Joel Riedesel writes: > > Please excuse my ignorance; I'm just starting to dive into code for XML > stuff... > > Would someone direct me to a good overview of SAX? Here's an intro: http://www.megginson.com/SAX/quickstart.html > Also, I'm looking for a decent java validating XML parser that > lets me get at the DTD for parsed XML files. 
Any suggestions? > As far as I can tell, it looks as the only thing close is > Lark or DXP, is that correct? AElfred provides (non-SAX) methods for accessing the DTD, and I think that IBM's XML for Java might as well. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecom.mixx.de Thu May 14 18:24:17 1998 From: James.Anderson at mecom.mixx.de (james anderson) Date: Mon Jun 7 17:01:20 2004 Subject: Short Tags and (ui standards) References: <02ff01bd7eba$dd551740$06001cac@mike> Message-ID: <355B1B06.18DF4604@mecom.mixx.de> Michael Alaly wrote: > > This text is more readable than the > next example, and is easier > for a human to process and follow. > > This text is LESS readable than the > previousexample, and is easier for a human to > process and follow. > IF i have to read marked up source, then i try to make sure that it looks more like This text is more readable than either example, and is easier for a human to process and follow. which is not markedly easier to read than this This text is NO less readable than the previous example and easier to read than either of the running-textexamples, should it be necssary for a human to process and follow it. and, in either case, significantly better than running text with strictly width-constrained line-breaks. if you really have to read marked up source, you're working against the odds unless it's the shape of the text body, not the content of the end tag, which matters. 
the more general point is that, the argument for long tags for the purpose of human readability has little credibility, since one should place minimal standards on the tools which present the marked up source to do so in a way (ie with meaningful justification) which, as a side-effect, renders the end tag content redundant. i don't have to do with many editors, but the one which i do use (visual page) tends to enforce such ("pretty printing") conventions. which i think was the correct decision and well worth the extra implementation effort. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Thu May 14 21:11:53 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:01:20 2004 Subject: WD of XSL requirements released In-Reply-To: <3559FAE0.C491DA17@interlog.com> (message from Ben Trafford on Wed, 13 May 1998 15:56:16 -0400) Message-ID: <199805141908.MAA27070@boethius.eng.sun.com> [Ben Trafford:] | Jon Bosak wrote: | | > The first working draft of the requirements for XSL has just been made | > public: | > | > http://www.w3.org/TR/1998/WD-XSLReq | | Oops! This URL is incorrect. The actual URL is: | | http://www.w3.org/TR/WD-XSLReq Sorry, I was forwarding an announcement without checking. Thanks for catching this. Jon xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjaakkol at cs.Helsinki.FI Thu May 14 22:05:56 1998 From: jjaakkol at cs.Helsinki.FI (Jani Jaakkola) Date: Mon Jun 7 17:01:20 2004 Subject: A little wish for short end tags In-Reply-To: Message-ID: On 14 May 1998, Toby Speight wrote: > Paul> Paul Prescod > > Paul> I do not believe that there is any way ot implementat a legal XML > Paul> parser without keeping around all of the information required to > Paul> implement short end tags. Checking that an end-tag matches its > Paul> start-tag (the current situation) is no easier than not checking. > > But there are plenty of (non-parsing) applications that benefit from > XML standard end-tags. An obvious one is selection of an element from > a document; a regexp search for the start-tag, and then just match > start and end tags *for that element type*, keeping track of depth > *for that element type* (we don't even need to do that if the element > type is known not to be nestable in itself). That application need > not even notice tags for other element types. Yes. I've got now an indexing and SGML/XML aware version of sgrep under development (it works, but isn't ready for release yet). It does not do real XML-parsing; it only scans files for indexes of start and end tags other markup. Therefore it is also very fast. It wouldn't work (at least as well) with short end tags. However, writing a normalizer for short end tags in XML would be trivial, unlike writing a normalizer for all possible short tags in SGML (thank you James). So if someone wants to use short end tags while authoring or internally in some organization i'd say that go for it. 
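As a rough illustration of the kind of normalizer mentioned above, here is a minimal Java sketch that expands a hypothetical short end tag `</>` into the full end tag of the innermost open element. The class and method names are invented for this example, and the `</>` syntax is an assumption taken from the thread, not part of XML 1.0. The sketch is regex-based rather than a real parser: it assumes well-formed input and does not cope with a DOCTYPE internal subset that contains `>`.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any parser discussed on the list.
class ShortEndTagNormalizer {
    // Alternatives, in order: comment, processing instruction, CDATA section,
    // other <!...> declaration (all copied through unchanged), end tag
    // (group 1 = name, empty for "</>"), start or empty-element tag
    // (group 2 = name, group 3 = "/" when it is an empty-element tag).
    private static final Pattern MARKUP = Pattern.compile(
        "<!--.*?-->|<\\?.*?\\?>|<!\\[CDATA\\[.*?\\]\\]>|<![^>]*>"
        + "|</([^\\s>]*)\\s*>"
        + "|<([^\\s/>]+)(?:[^>\"']|\"[^\"]*\"|'[^']*')*?(/?)>",
        Pattern.DOTALL);

    static String normalize(String input) {
        Deque<String> open = new ArrayDeque<>();   // names of currently open elements
        StringBuilder out = new StringBuilder();
        Matcher m = MARKUP.matcher(input);
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());    // copy character data as-is
            last = m.end();
            if (m.group(1) != null) {              // an end tag, long or short
                if (open.isEmpty()) {
                    throw new IllegalArgumentException(
                        "end tag with no open element at offset " + m.start());
                }
                String expected = open.pop();
                String given = m.group(1);
                if (!given.isEmpty() && !given.equals(expected)) {
                    throw new IllegalArgumentException(
                        "expected </" + expected + "> but found </" + given + ">");
                }
                out.append("</").append(expected).append('>');
            } else {
                // Only a start tag that is not an empty-element tag opens an element.
                if (m.group(2) != null && m.group(3).isEmpty()) {
                    open.push(m.group(2));
                }
                out.append(m.group());
            }
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("<doc><p>Hello <em>world</></></>"));
        // prints <doc><p>Hello <em>world</em></p></doc>
    }
}
```

The stack is the whole trick: a normalizer only needs the names of the open elements, which is why the job is small for XML and much harder for the full set of SGML tag minimizations.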
But please, keep the standard small and simple.

- Jani

From kent at trl.ibm.co.jp Fri May 15 03:03:45 1998
From: kent at trl.ibm.co.jp (TAMURA Kent)
Date: Mon Jun 7 17:01:20 2004
Subject: SAX: White space after last element
In-Reply-To: David Megginson's message of "Thu, 14 May 1998 08:42:33 -0400" <199805141242.IAA00235@unready.microstar.com>
References: <000a01bd7f24$02de1160$1e09e391@mhklaptop.bra01.icl.co.uk> <199805141242.IAA00235@unready.microstar.com>
Message-ID: <199805150101.KAA37087@ns.trl.ibm.com>

> > I just tested SAXON with the new version of IBM's xml4j
> > parser: it crashes because SAXON doesn't expect white space
> > to be reported after the end of the outermost element. I'll
> > fix SAXON so it doesn't crash, but it raises a wider point,
> > because we now have several SAX-conformant parsers reporting
> > white space differently, and I'm not sure whether the spec
> > says clearly which of them is right (perhaps they all are).

> The JavaDoc comment for DocumentHandler.ignorableWhitespace begins
> with the line
> Receive notification of ignorable whitespace in element content.
>                                              ^^^^^^^^^^^^^^^^^^

I'm sorry for the bug. It is fixed by the following patch.

--- SAXDriver.19980512 Fri May 15 09:49:49 1998
+++ SAXDriver.java Fri May 15 09:51:46 1998
@@ -240,8 +240,11 @@
         return 0 > ind ? null : getValue(ind);
     }
+
+    int depth = 0;
 
     // parser.TagHandler
     public void handleStartTag(TXElement el, boolean empty) {
+        this.depth ++;
         m_attributes = el.getAttributeArray();
         try {
             m_documenthandler.startElement(el.getName(), this);
@@ -252,6 +255,7 @@
     }
     // parser.TagHandler
     public void handleEndTag(TXElement el, boolean empty) {
+        this.depth --;
         try {
             m_documenthandler.endElement(el.getName());
         } catch (SAXException e) {
@@ -261,11 +265,13 @@
     // parser.DefaultElementFactory
     public TXText createText(String data, boolean ignorable) {
         try {
+            if (0 < this.depth) {
             char[] ac = data.toCharArray();
             if (ignorable)
                 m_documenthandler.ignorableWhitespace(ac, 0, ac.length);
             else
                 m_documenthandler.characters(ac, 0, ac.length);
+            }
         } catch (SAXException e) {
             throw new ExceptionWrapper(e);
         }

--
TAMURA, Kent @ Tokyo Research Laboratory, IBM Japan

From kendal at interlog.com Fri May 15 03:41:50 1998
From: kendal at interlog.com (Rolande Kendal)
Date: Mon Jun 7 17:01:20 2004
Subject: do(duration){display video}
Message-ID: <3.0.32.19980514213837.015e6bb0@interlog.com>

I wish to include SMIL/PGML-like encoded characteristics in a file; however, in addition I require some means to include rudimentary programmatic control. For example, I need some method of saying things like "repeat showing this resource for ten minutes, or until this event". In addition, if the user makes a selection then the cycle may have to stop abruptly and proceed to something else. You know - browser stuff. I threw in a loop below:

Is there some established means to do this?

Thanks
Rolande

Still waiting for Jumbo2...

From cbullard at hiwaay.net Fri May 15 05:31:41 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:01:21 2004
Subject: WD of XSL requirements released
References: <199805141908.MAA27070@boethius.eng.sun.com>
Message-ID: <355BB6A2.353A@hiwaay.net>

My favorite part: "Easy things should be easy." Amen.

"Absolute/relative positioning, Layering, and Transparency ... Z-order for overlapping areas"

Does this include creating 3D objects such as indexed face sets, graphic primitives, or only layering?

"Animation: support for identifying and including encapsulated, animated objects."

Loose. Traditionally, that is a client plugin. I understand the idea, but avi, VRML, gif89 are all animation and very different environments. Animation can be a hub (eg, VRML anchors), so an example here would be appreciated.

"Callouts: Linking hotspots to items in the flow of text. Another aspect would be manipulating presentation of text within a CGM graphic."

Callouts? From the IETM community, I understand why you want this, but it is an application content type and should be defined in the DTD, not the style language, although the style language specifies the rendering. The CGM example seems odd since that is a presentation language with text primitives. Why are you applying XSL style rules to CGM text types? Just curious.

"Tables: Support for the table models of CSS and DSSSL. Ability to easily format popular source table models such as HTML and CALS."

Vague. Is this to be interpreted as meaning that

1. XSL processors shall always process CSS and DSSSL table models to a feature set TBD
2. XSL processors may optionally support HTML and CALS table models

"Data Types: Scalar types, units of measure, Flow Objects, XML Objects"

What are XML Objects?

"Scripting"

Ok. This appears to limit XSL to transforms. If so, architecturally, how do scripting languages in the other categories relate to XSL? The "out of scope" requirements are hard to interpret in light of current scripting practice, eg, inline JavaScript.

"Accessibility"

What are audio stylesheets? What are "accessibility mechanisms"?

"Persistent headers/footers w/scrolling body: ability to specify headers and/or footers that should be fixed at the borders of a scrolling text area. This would provide the functionality most commonly achieved with frames in HTML browsers today."

Loose. Frames also encapsulate plugins which are active hubs. This feels like WinHelp, not an Internet browser, so I am wondering what "functionality most commonly achieved" is indicated, ie, just scrolling?

****************************************

Overall, very good start. It appears to be a good sampling of all of the necessary techniques.

Len Bullard
Intergraph
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gotou at flab.fujitsu.co.jp Fri May 15 08:55:12 1998 From: gotou at flab.fujitsu.co.jp (Masatomo Goto) Date: Mon Jun 7 17:01:21 2004 Subject: Xlink semantics In-Reply-To: <3.0.1.16.19980514080952.514748b8@pop3.demon.co.uk> References: <199805140517.OAA28951@kmailserv.akashi.flab.fujitsu.co.jp> <3.0.1.16.19980513204547.5bdf6c46@pop3.demon.co.uk> Message-ID: <199805150649.PAA22430@kmailserv.akashi.flab.fujitsu.co.jp> At 08:09 98/05/14, Peter Murray-Rust wrote: > >I'm now working on design and implementation of a XLink facility. > >My approach is to provide the XLink facility as an engine. > > This is wonderful! From time to time I have suggested such an engine on > XML-DEV and it is great to see someone as well advanced as this. Do you > plan to make this available? If not, are you able to publish the semantics? I have no idea around these things NOW. But, I hope something can be available or published. > I have recently met a good fairy which means that I shall be at XML98 for > part of the time (probably Tuesday - Thursday). It would be nice to meet. I'm looking forward to meeting and having a discussion. Best regards. -Masatomo Goto --- Masatomo Goto Information Service Architecture lab. Fujitsu Laboratories Ltd. Tel: +81-78-934-8249 Fax: +81-78-934-3312 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From yosr.hmaied at inria.fr Fri May 15 08:59:44 1998 From: yosr.hmaied at inria.fr (Yosr HMAIED) Date: Mon Jun 7 17:01:21 2004 Subject: root entities Message-ID: <355BE7D2.1CCA@inria.fr> in XML 1.0 specification it's said that -- Yosr HMAIED mailto:yosr.hmaied@inria.fr tel :01 39 63 51 71 Projet Rodin INRIA Rocquencourt xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rajesh_n1 at verifone.com Fri May 15 09:20:29 1998 From: rajesh_n1 at verifone.com (Rajesh N) Date: Mon Jun 7 17:01:21 2004 Subject: test cases for xml parser & dom Message-ID: <3.0.5.32.19980515125132.007e1cf0@blr-nt-mail1.verifone.com> Hi all, can anyone give me a pointer to test cases for xml parser (including data validation) and the DOM API ? thanks in advance for any help, Rajesh N. -------------------------------------------------------------- There is no right way to do wrong. -------------------------------------------------------------- xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Fri May 15 09:44:58 1998 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:01:21 2004 Subject: test cases for xml parser & dom In-Reply-To: <3.0.5.32.19980515125132.007e1cf0@blr-nt-mail1.verifone.com> References: <3.0.5.32.19980515125132.007e1cf0@blr-nt-mail1.verifone.com> Message-ID: * Rajesh N. | | can anyone give me a pointer to | test cases for xml parser (including data validation) James Tauber has a good list at (BTW: List crossposting is not a good idea. I'm following your example so people know your question has been answered.) -- "These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From akitoshi.yoshida at sap-ag.de Fri May 15 13:42:44 1998 From: akitoshi.yoshida at sap-ag.de (Akitoshi Yoshida) Date: Mon Jun 7 17:01:21 2004 Subject: question: NodeIterator in DOM Message-ID: <199805151131.NAA14413@hs2114.wdf.sap-ag.de> Hi, I have a question on NodeIterator in DOM-19980416. 
Excerpt from the spec: If a node is inserted before the node just after the iterator position, it will be returned by getNextNode(); likewise if a node is inserted after the node just previous to the iterator position, it will be return by the next getPrevNode() call. The first part is clear. When a Node is inserted by insertBefore() at the node just before the iterator position, the iterator position is set in front of this new node. The second part starting "likewise.." is not clear to me. Are we assuming the existance of the insertAfter() method? It sounds like the insertion position (relative to the nodes) is the same as in the first case. If we only have the insertBefore() method in the Node inteface, how can we distinguish the above two cases? thanks in advance for any clarification aki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From David.Halls at cl.cam.ac.uk Fri May 15 13:51:47 1998 From: David.Halls at cl.cam.ac.uk (David Halls) Date: Mon Jun 7 17:01:21 2004 Subject: demo of client-side XML stylesheets in Standard ML available Message-ID: The first release of Persimmon's Standard ML to Java bytecode compiler, MLJ, was made on May 13th, 1998. Please see the posting in comp.lang.functional. We have since added a new on-line demo which demonstrates some of our work using Standard ML to write stylesheets that transform SGML and XML into HTML. MLJ allows us to compile our stylesheets into applets that run in browsers. We are able to send XML to browsers which is processed by our stylesheets. The generated HTML is displayed to users with no further network access back to the server. 
Please feel free to try the demo out - go to the MLJ home page (http://research.persimmon.co.uk/mlj/) and follow the link to on-line demos. Our stylesheet demo installs a sample stylesheet applet in your browser and downloads a sample XML document. The HTML generated by the stylesheet is then displayed. You will also be able to control the style of the generated HTML through the applet's user interface. Links are provided enabling you to view the Standard ML source of the stylesheet (for comparison with other stylesheet languages). You can also view the original SGML document, its DTD and the XML that is automatically generated from it. You'll also find some general links to documents describing the benefits of SGML, XML and stylesheets. Email: mlj@persimmon.co.uk mlj@persimmon.com Web: http://research.persimmon.co.uk/mlj/ http://www.persimmon.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From amitr at abinfosys.com Fri May 15 16:16:22 1998 From: amitr at abinfosys.com (Amit Rekhi) Date: Mon Jun 7 17:01:21 2004 Subject: XML and EDI Message-ID: <3.0.5.32.19980515190728.0097b600@192.168.1.1> Hello! > There is a special XML/EDI list itself for this purpose where you can get > a lot more information. > I dont think that XML would REPLACE EDI , in fact it very well COMPLIMENTS > EDI. > XML is enabling EDI on the Web in a very smooth way where in EDI data > could be exchanged in XML wrapped EDI File , the Transaction Sets / > Messages would be EDI compliant DTDs. The developement is thus helping EDI > to be done on the Web taking into account the work already put in to > desgign the EDI structure and the Messages. 
> I am not very sure of the products available for XML/EDI, as it is still very much in the
> discussion stages. But it will not be long before there are products out,
> as XML seems to be extremely helpful for the cause.
>
> All the Best!!
> Aditya
> A.B.Infosys Pvt. Ltd.
> New Delhi , India
>

From mcc at arbortext.com Fri May 15 16:58:03 1998
From: mcc at arbortext.com (Mike Champion)
Date: Mon Jun 7 17:01:21 2004
Subject: question: NodeIterator in DOM
In-Reply-To: <199805151131.NAA14413@hs2114.wdf.sap-ag.de>
Message-ID: <98May15.105314edt.26883@thicket.arbortext.com>

At 07:31 AM 5/15/98 -0400, Akitoshi Yoshida wrote:
>Hi,
>I have a question on NodeIterator in DOM-19980416.
>
>Excerpt from the spec:
>
> If a node is inserted before the node just after
> the iterator position, it will be returned by getNextNode();
> likewise if a node is inserted after the node just previous
> to the iterator position, it will be return by the next
> getPrevNode() call.
>
>The first part is clear. When a Node is inserted by insertBefore()
>at the node just before the iterator position, the iterator position
>is set in front of this new node. The second part
>starting "likewise.." is not clear to me. Are we assuming the
>existance of the insertAfter() method? It sounds like the
>insertion position (relative to the nodes) is the same as
>in the first case. If we only have the
>insertBefore() method in the Node inteface,
>how can we distinguish the above two cases?

It's better to post DOM questions on www-dom@w3.org. I'll answer here and cc: to www-dom, so please look there for any followup.

But anyway, Good Point! We can't distinguish the two cases and I have no idea what I was thinking when I typed it. I'll remove the clause starting with "likewise" from the spec.

It should work as follows. Consider an iterator through nodes A, B, C. The last node returned was B, so the "current position" signified by the "^" is after B.

A --------B--^-----C

If a new node X is inserted between B and C, the current position remains after B, so X will be returned by getNextNode(), and B returned by getPrevNode().

A --------B--^-----X--------C

Sorry for the confusion,

Mike Champion

From akitoshi.yoshida at sap-ag.de Fri May 15 17:35:19 1998
From: akitoshi.yoshida at sap-ag.de (Akitoshi Yoshida)
Date: Mon Jun 7 17:01:21 2004
Subject: question: NodeIterator in DOM
Message-ID: <199805151519.RAA02352@hs2114.wdf.sap-ag.de>

Hi,

Another question regarding NodeIterator:

You have a NodeIterator instance with the iterator position set somewhere in the middle. When you remove three nodes (first, the node just after the iterator position; second, the node after this removed node; and finally the node just before the original iterator position), what should toNextNode() and toPrevNode() return?

For example:
We have nodes A B C D E.
The iterator is positioned just before C.
Remove C, D, and B.
Then we have A E, and the iterator should be positioned just before E,
so toNextNode() and toPrevNode() should return E and A, respectively.

NodeIterator may mark its current iterator position by an integer offset or by an object reference. But it must be notified of certain insert and remove operations to preserve the above semantics.
Is it correct? Thanks aki ---- original message Hi, I have a question on NodeIterator in DOM-19980416. Excerpt from the spec: If a node is inserted before the node just after the iterator position, it will be returned by getNextNode(); likewise if a node is inserted after the node just previous to the iterator position, it will be return by the next getPrevNode() call. The first part is clear. When a Node is inserted by insertBefore() at the node just before the iterator position, the iterator position is set in front of this new node. The second part starting "likewise.." is not clear to me. Are we assuming the existance of the insertAfter() method? It sounds like the insertion position (relative to the nodes) is the same as in the first case. If we only have the insertBefore() method in the Node inteface, how can we distinguish the above two cases? thanks in advance for any clarification aki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mcc at arbortext.com Fri May 15 18:20:47 1998 From: mcc at arbortext.com (Mike Champion) Date: Mon Jun 7 17:01:22 2004 Subject: question: NodeIterator in DOM In-Reply-To: <199805151519.RAA02352@hs2114.wdf.sap-ag.de> Message-ID: <98May15.121627edt.26888@thicket.arbortext.com> At 11:19 AM 5/15/98 -0400, Akitoshi Yoshida wrote: >in example: >we have nodes A B C D E. >the iterator is positioned just before C. >remove C, D, and B. >then we have A E and the iterator position should be >positioned just before E. >so toNextNode() and toPrevNode() should return E and A, respectively. Yes. > >NodeIterator may mark its current iterator position by >an integer offset or by an object reference. 
but it must >be notified for certain insert and remove operations to >preserve the above semantics. Is it correct? I'm not sure. There's been a lot of discussion of how to implement NodeIterators on www-dom; see the archives at http://lists.w3.org/Archives/Public/www-dom/ I believe different designs exist, some of which require iterators to be notified on insert and delete operations, and others which maintain "markers" in the tree that move around as the tree is edited. I have not followed this closely enough to give you a definitive answer, but look at the www-dom archives, and especially Don Park's SAXDOM implementation, because Don participated in the discussions of this a couple of weeks ago and I believe his latest release reflects what he learned. Good luck, Mike xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elm at arbortext.com Fri May 15 18:22:37 1998 From: elm at arbortext.com (Eve L. Maler) Date: Mon Jun 7 17:01:22 2004 Subject: Xlink semantics In-Reply-To: <98May14.012336edt.26885@thicket.arbortext.com> References: <3.0.1.16.19980513204547.5bdf6c46@pop3.demon.co.uk> Message-ID: <3.0.5.32.19980515120249.00a257b0@village.promanage-inc.com> This is a really good issue. Obviously, the interactions among the XLink-aware elements are underspecified! I agree with Masatomo Goto's interpretation of the various element configurations. A few other comments below... At 01:20 AM 5/14/98 -0400, Masatomo Goto wrote: > >Hello, > >I'm now working on design and implementation of a XLink facility. >My approach is to provide the XLink facility as an engine. (This is wonderful!) 
>At 20:45 98/05/13, Peter Murray-Rust wrote:
>> I have been hacking an application (the VHG DTD) using Xlink and I'd like
>> to check on some semantics. Since the spec has contentSpecs of ANY for all
>> three link-types the formal situation is that anything is permitted. Should
>> my software complain at the following examples (notation should be
>> obvious), or should it try to do something clever?
>
>In my Xlink engine, these examples are recognized as follows:
>
>>
>>
>>
>>
>>
>
> - One extended link which has no or only inline resource
>   (depends on inline attribute value)
> - Two simple links which are in the extended link content
>
>>
>>
>>
>>
>>
>>
>
> - One extended link which has two or three(inline) resources.
> - One simple link which is in the extended link content
>
>> ...
>>
>>
>>
>>
>
> - One extended link which has one or two (inline) resources.
>
>> ...
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
> - One extended link which has no or only inline resources
> - two extended links which have two or three (inline) resources
>   in the parent extended link's content
>
>> ...
>>
>>
>>
>>
>>
>> ...
>
> - no meaning. no processing.
>
>
>> The problem is that all of these throw no error in the parser as they are
>> probably impossible to constrain except in very spartan DTDs. I suspect
>> most are not productive, but some might be valuable on occasions I have or
>> haven't thought of.
>
>If the XLink processing facilities are separated from the application,
>It is possible to throw some errors from the "XLink processor".

This is how I envision linking support. Each XML-related specification
suggests the creation of a "processor" for that level, with any other
"applications" overlaying it. Of course, if you do encapsulate the
relevant XLink awareness into your DTD, you will get some validation
"for free."

>> This is an important occasion that there is a clear requirement for
>> applications to apply semantics to parts of one of the specs. We already
>> have to write an attribute processor and I'm interested in knowing how much
>> additional processing any conforming Xlink software is going to have to do.
>
>FYI, I will give a speech and demonstration about my XLink engine in
>the HyTime at work session of SGML/XML Europe '98.

I look forward to seeing it!

Eve

From Jon.Bosak at eng.Sun.COM Fri May 15 23:37:44 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:01:22 2004
Subject: A little wish for short end tags
In-Reply-To: (message from Toby Speight on 14 May 1998 12:15:26 +0100)
Message-ID: <199805152135.OAA27215@boethius.eng.sun.com>

[Toby Speight:]

| But there are plenty of (non-parsing) applications that benefit from
| XML standard end-tags. An obvious one is selection of an element from
| a document; a regexp search for the start-tag, and then just match
| start and end tags *for that element type*, keeping track of depth
| *for that element type* (we don't even need to do that if the element
| type is known not to be nestable in itself). That application need
| not even notice tags for other element types.

This is precisely the scenario that I had in mind when I invented the
figure of the Desperate Perl Hacker -- someone who has no idea how to
build a parser but can do very powerful operations on large quantities
of XML using simple pattern matches if the presence of full end-tags
is guaranteed.

Jon

From greyno at mcs.com Sat May 16 06:23:09 1998
From: greyno at mcs.com (Gregg Reynolds)
Date: Mon Jun 7 17:01:22 2004
Subject: A little wish for short end tags
References: <199805152135.OAA27215@boethius.eng.sun.com>
Message-ID: <355D0208.3B00@mcs.com>

Jon Bosak wrote:
>
> [Toby Speight:]
>
> | But there are plenty of (non-parsing) applications that benefit from
> | XML standard end-tags.
> This is precisely the scenario that I had in mind when I invented the
> figure of the Desperate Perl Hacker -- someone who has no idea how to
> build a parser but can do very powerful operations on large quantities
> of XML using simple pattern matches if the presence of full end-tags
> is guaranteed.
>

Given:
    1. Short tags
    2. Some non-trivial number of docs marked up with short-tags
    3. Some non-trivial number of DPH's desperate to hack at these docs;

Isn't it likely that some non-trivial number of XML normalizers will
become at least as widespread as perl? Thereby relieving our lonely
hackers of some non-trivial measure of their desperation?

From jtauber at jtauber.com Sat May 16 07:21:03 1998
From: jtauber at jtauber.com (James K. Tauber)
Date: Mon Jun 7 17:01:22 2004
Subject: A little wish for short end tags
In-Reply-To: <199805152135.OAA27215@boethius.eng.sun.com>
Message-ID: <000a01bd808a$142122c0$be6118cb@caleb>

> This is precisely the scenario that I had in mind when I invented the
> figure of the Desperate Perl Hacker -- someone who has no idea how to
> build a parser but can do very powerful operations on large quantities
> of XML using simple pattern matches if the presence of full end-tags
> is guaranteed.

I still think life would have been easier for the DPH if > had had the
same prohibitions as <.

James
--
James Tauber / jtauber@jtauber.com
Perth, Western Australia
XML Pages: http://www.jtauber.com/xml/

From bckman at ix.netcom.com Sat May 16 07:30:46 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:01:22 2004
Subject: A little wish for short end tags
Message-ID: <01bd808c$7c07baa0$71431ecc@uspppBckman>

It is so easy to use script and code to isolate element text when the
full end tag is included, and it is really difficult (though possible)
to do this when 'short' endtags are employed.

To my way of thinking this fact alone is enough to justify the survival
and existence of the full end tag.

Frank

-----Original Message-----
From: Frank Boumphrey
To: Gregg Reynolds
Date: Saturday, May 16, 1998 1:32 AM
Subject: Re: A little wish for short end tags

>It is so easy to use script and code to isolate element text when the full
>end tag is included, and it is really difficult (though possible) to do this
>when 'short' endtags are employed.
>
>To my way of thinking this fact alone is enough to justify the survival and
>existence of the full end tag.
>
>Frank
>
>-----Original Message-----
>From: Gregg Reynolds
>To: xml-dev@ic.ac.uk
>Date: Saturday, May 16, 1998 12:35 AM
>Subject: Re: A little wish for short end tags
>
>
>>Jon Bosak wrote:
>>>
>>> [Toby Speight:]
>>>
>>> | But there are plenty of (non-parsing) applications that benefit from
>>> | XML standard end-tags.
>>>
>> This is precisely the scenario that I had in mind when I invented the
>>> figure of the Desperate Perl Hacker -- someone who has no idea how to
>>> build a parser but can do very powerful operations on large quantities
>>> of XML using simple pattern matches if the presence of full end-tags
>>> is guaranteed.
>>>
>>
>>Given:
>> 1. Short tags
>> 2. Some non-trivial number of docs marked up with short-tags
>> 3. Some non-trivial number of DPH's desperate to hack at these docs;
>>
>>Isn't it likely that some non-trivial number of XML normalizers will
>>become at least as widespread as perl? Thereby relieving our lonely
>>hackers of some non-trivial measure of their desperation?

From lisarein at finetuning.com Sat May 16 07:36:32 1998
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun 7 17:01:22 2004
Subject: A little wish for short end tags
References: <01bd808c$7c07baa0$71431ecc@uspppBckman>
Message-ID: <355D2BFB.C337803C@finetuning.com>

LONG LIVE THE FULL END TAG!

lisa

Frank Boumphrey wrote:
>
> It is so easy to use script and code to isolate element text when the full
> end tag is included, and it is really difficult (though possible) to do this
> when 'short' endtags are employed.
>
> To my way of thinking this fact alone is enough to justify the survival and
> existence of the full end tag.
>
> Frank
>
> -----Original Message-----
> From: Frank Boumphrey
> To: Gregg Reynolds
> Date: Saturday, May 16, 1998 1:32 AM
> Subject: Re: A little wish for short end tags
>
> >It is so easy to use script and code to isolate element text when the full
> >end tag is included, and it is really difficult (though possible) to do
> this
> >when 'short' endtags are employed.
> >
> >To my way of thinking this fact alone is enough to justify the survival and
> >existence of the full end tag.
> >
> >Frank
> >
> >-----Original Message-----
> >From: Gregg Reynolds
> >To: xml-dev@ic.ac.uk
> >Date: Saturday, May 16, 1998 12:35 AM
> >Subject: Re: A little wish for short end tags
> >
> >
> >>Jon Bosak wrote:
> >>>
> >>> [Toby Speight:]
> >>>
> >>> | But there are plenty of (non-parsing) applications that benefit from
> >>> | XML standard end-tags.
> >>>
> >> This is precisely the scenario that I had in mind when I invented the
> >>> figure of the Desperate Perl Hacker -- someone who has no idea how to
> >>> build a parser but can do very powerful operations on large quantities
> >>> of XML using simple pattern matches if the presence of full end-tags
> >>> is guaranteed.
> >>>
> >>
> >>Given:
> >> 1. Short tags
> >> 2. Some non-trivial number of docs marked up with short-tags
> >> 3. Some non-trivial number of DPH's desperate to hack at these docs;
> >>
> >>Isn't it likely that some non-trivial number of XML normalizers will
> >>become at least as widespread as perl? Thereby relieving our lonely
> >>hackers of some non-trivial measure of their desperation?

From liamquin at interlog.com Sat May 16 10:04:19 1998
From: liamquin at interlog.com (Liam Quin)
Date: Mon Jun 7 17:01:22 2004
Subject: A little wish for short end tags
In-Reply-To: <355D0208.3B00@mcs.com>
Message-ID:

On Fri, 15 May 1998, Gregg Reynolds wrote:
> Given:
> 1. Short tags
> 2. Some non-trivial number of docs marked up with short-tags
> 3. Some non-trivial number of DPH's desperate to hack at these docs;
>
> Isn't it likely that some non-trivial number of XML normalizers will
> become at least as widespread as perl?

I wish people would stop this.

There are no minimisation features in XML. This is a major reason why
it is succeeding.

Yes, perl will include an XML parser. None the less, there are a lot of
good reasons for keeping the full end tags, and these have been
discussed carefully and at great length.

Handling [short end tags] requires keeping a stack and processing the
entire document from the beginning. There are no constraints on the
possible depth of the stack. Yes, you can keep a stack in perl. Yes, you
can read the document into memory too, if you have enough memory.

<>
* Why stop there?
* this is a perfectly valid SGML bullet list, with the asterisks getting
  replaced by ITEM tags automatically, and OMITTAG filling in the end
  tags.
* note that you can map just about any character to a tag, except an
  upper case B, which cannot be used or escaped from its special meaning.
* with the RANK feature, you can save a few more bytes, and it's often
  fairly straightforward to add once you have all the other features.
* and with LINK, you can have attributes added automatically, so you
  don't need to put them in the DTD or the document directly.

In other words, every SGML feature has uses, and if XML had them all, it
would no longer be a subset suitable for widespread use.

There is already software to normalise null end tags. It didn't make
SGML catch on. "Full SGML" is too complex. It is widely used, but XML
looks like it will be used massively more widely than its parent.

Stop trying to add complexities. XML 1.0 is published. Use it.

Lee

--
Liam Quin -- the barefoot typographer -- Toronto
lq-text: freely available Unix text retrieval
IRC: discuss XML/SGML/XSL/XLL/DSSSL Mondays irc.technonet.net in #XML
email address: l i a m q u i n, at host: i n t e r l o g dot c o m

From jjc at jclark.com Sat May 16 10:19:26 1998
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:01:23 2004
Subject: XML test suite updated
Message-ID: <355D4A53.1D8A1252@jclark.com>

I've updated my collection of XML test cases
(ftp://ftp.jclark.com/pub/xml/xmltest.zip). I've reorganized it a
little and added some more test cases.

James

From Patrice.Bonhomme at loria.fr Sat May 16 14:32:57 1998
From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme)
Date: Mon Jun 7 17:01:23 2004
Subject: A Content Model Question ?
Message-ID: <199805161231.OAA10248@chimay.loria.fr>

Is the Content Model for the Element D valid in XML 1.0 ?

]>

Thanks.

Pat.
--
==============================================================
bonhomme@loria.fr              | Office : B.228
http://www.loria.fr/~bonhomme  | Phone : 03 83 59 30 52
--------------------------------------------------------------
* Serveur Silfide : http://www.loria.fr/projets/Silfide
* Projet Aquarelle : http://aqua.inria.fr
==============================================================

From larsga at ifi.uio.no Sat May 16 15:38:38 1998
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:01:23 2004
Subject: A Content Model Question ?
In-Reply-To: <199805161231.OAA10248@chimay.loria.fr>
References: <199805161231.OAA10248@chimay.loria.fr>
Message-ID:

* Patrice Bonhomme
|
| Is the Content Model for the Element D valid in XML 1.0 ?
|
|

No. You must write it like this:

See section 3.2.2 of the spec, production 51.

--
"These are, as I began, cumbersome ways / to kill a man.
Simpler, direct, and much more neat / is to see that he is living
somewhere in the middle / of the twentieth century, and leave him
there." -- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From papresco at technologist.com Sat May 16 16:34:26 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:23 2004
Subject: A little wish for short end tags
References:
Message-ID: <355DA41C.E99FF4AE@technologist.com>

I had hoped to allow this thread to die, but I can't allow this
incorrect statement to pass. These are the sorts of things that become
dogma:

Liam Quin wrote:
>
> There are no minimisation features in XML.

1. Empty-element tags (XML actually ADDED this to SGML)
2. Doctype declarations of the form

Plus there are many alternative representations geared totally toward
usability:

3. Predefined entities
4. Alternate literal quoting characters
5. Alternate unicode number syntax (XML actually ADDED this to SGML)

As long as I've already put myself into disrepute by perpetuating the
thread:

Another myth that I see floating around often is that XML is simple.
Anybody who believes that has not studied the various types of entities,
their allowed occurrences, order of replacement and interaction with the
standalone declaration. Essentially these features cannot be expressed
in prose text that could be read and understood by a typical reader.
That means that XML's syntax is more complicated than most programming
languages which *can* be (and sometimes are!) described completely in
prose text.
I have never looked at the grammars for Python or Java, for example, but
I have a pretty clear idea of what is legal and what is not. Of course,
XML's central concepts are simple, just as SGML's were. But neither
language is syntactically simple.

Compared to the complexity of these features short end tags would make
the specification essentially no more complex. You would add a single
question mark to the EBNF as opposed to hundreds of percent signs for
parameter entities. The "keep XML simple" argument is a non-starter.

"Minimization is a slippery slope" is also a non-starter. We've already
got minimization and any move towards more is strongly resisted (which
is good...we always need some people to argue against features). The XML
working group is full of people who have the ability to make decisions
on a case by case basis. I think it is an insult to them to propose
otherwise. The logical end-point of the slippery slope argument is "If
we try to make a subset of SGML, we'll start adding SGML features until
we end up with SGML" which obviously did not happen.

"Full end tags help hackers" is a completely valid point. It is the
central point. It is the point that forced the decision in the first
place. Personally, I think that it is quite easy to type:

    expandTags myFile.sgm | awk ....

But I recognize that others disagree. It is arguably the case that
downloading and compiling "expandTags" is an unacceptable burden on
Desperate Perl Hackers.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

Can we afford to feed that army, while so many children are naked and
hungry? Can we afford to remain passive, while that soldier-army is
growing so massive?
 - "Gabby" Barbadian Calypsonian in "Boots"

From papresco at technologist.com Sat May 16 16:51:47 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:23 2004
Subject: Correction: Re: A little wish for short end tags
References: <355DA41C.E99FF4AE@technologist.com>
Message-ID: <355DA818.513C141E@technologist.com>

Paul Prescod wrote:
>
> You would add a single question mark to the EBNF as opposed
> to hundreds of percent signs for parameter entities.

Correction: I forgot that the final draft uses a much more elegant rule
for parameter entities. They are still much more complicated to
understand and use than short end-tags, but there is not as much
spec-space dedicated to them as there once was.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or
not. But I do not accept that a writer has a greater obligation to
society than a musician or a mason or a teacher. Everyone has a
citizen's commitment."
 - Wole Soyinka, Africa's first Nobel Laureate

From Jon.Bosak at eng.Sun.COM Sat May 16 18:52:05 1998
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 17:01:23 2004
Subject: A little wish for short end tags
In-Reply-To: <355D0208.3B00@mcs.com> (message from Gregg Reynolds on Fri, 15 May 1998 23:03:36 -0400)
Message-ID: <199805161649.JAA27414@boethius.eng.sun.com>

[Gregg Reynolds:]

| Given:
| 1. Short tags
| 2. Some non-trivial number of docs marked up with short-tags
| 3. Some non-trivial number of DPH's desperate to hack at these docs;
|
| Isn't it likely that some non-trivial number of XML normalizers will
| become at least as widespread as perl? Thereby relieving our lonely
| hackers of some non-trivial measure of their desperation?

Nothing can match the brute simplicity of a one-line perl regexp
operating over unlimited amounts of data within a ksh or bash
command-line loop. People with a programming background tend to find
this hard to understand, but there are a lot of folks out there in
publishing and everyday business management who know exactly what I'm
talking about.

A perl regexp is the *upper bound* of sophistication for this
constituency. Please try, if you can, to imagine being faced with the
job of doing an element-specific mass search-and-replace over two
years' worth of company reports when all you know about XML is what
you can see by looking at the source, you've never heard of the
concept of a normalizer, and the only scripting tool you know how to
use is the Word or WordPerfect macro language. You may never find
yourself in this position, but there are hundreds of thousands of
ordinary users who aren't going to be so lucky.
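
[Illustration added in editing, not part of the original message. This is a minimal sketch of the kind of element-specific search-and-replace discussed in this thread, written in Python rather than Perl. It assumes normalized input: every element has a full end tag, the target element never nests inside itself, and no comment, CDATA section, or attribute value contains a literal ">" or text that looks like a tag. The function name replace_in_element and the sample report are hypothetical.]

```python
import re


def replace_in_element(xml_text, element, old, new):
    """Replace `old` with `new` only inside <element>...</element>.

    Assumption: normalized input with full end tags and no
    self-nesting of `element`; this is pattern matching, not parsing.
    """
    name = re.escape(element)
    # Start tag (optional attributes, but not an empty-element tag),
    # the shortest possible content, then the full end tag.
    pattern = re.compile(
        rf"(<{name}(?:\s[^>]*)?(?<!/)>)(.*?)(</{name}\s*>)",
        re.DOTALL,
    )

    def fix(match):
        start_tag, content, end_tag = match.groups()
        # Only the element content is edited; the tags pass through.
        return start_tag + content.replace(old, new) + end_tag

    return pattern.sub(fix, xml_text)


report = "<report><title>Q1 results</title><para>Q1 was fine.</para></report>"
print(replace_in_element(report, "title", "Q1", "First quarter"))
# prints: <report><title>First quarter results</title><para>Q1 was fine.</para></report>
```

[The pattern can anchor on the end tag only because the end tag names its element. With short end tags there is nothing to anchor on, and the match would need to track every open element instead of one regular expression.]
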

This is one of the reasons that many corporate SGML users made it a
policy years ago to normalize all SGML documents to expand the end tags
and why most SGML editors do this automatically every time a file is
saved. SGML gives you the option of using empty end tags, and the
historical fact is that most large users, given this option and a
sufficient amount of experience with it, choose not to use it. XML
simply enforces what many people faced with the management of large
amounts of tagged text adopted as good practice a long time ago and
provides the same guarantee of safe tagging across organizations that
has generally existed within them.

If you really like using shortcuts, then go all the way: get a genuine
SGML tool and define a DTD that allows not just end-tag minimization
but full omission of both start-tags *and* end-tags. Knock yourself
out. Just make sure to normalize the result before you call it XML and
ship it out to the rest of us to work with.

Jon

From papresco at technologist.com Sat May 16 21:58:32 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:23 2004
Subject: A little wish for short end tags
References: <199805161649.JAA27414@boethius.eng.sun.com>
Message-ID: <355DF024.FF2557F2@technologist.com>

Jon Bosak wrote:
>
> A perl regexp is the *upper bound* of sophistication for this
> constituency. Please try, if you can, to imagine being faced with the
> job of doing an element-specific mass search-and-replace over two
> years' worth of company reports when all you know about XML is what
> you can see by looking at the source, you've never heard of the
> concept of a normalizer, and the only scripting tool you know how to
> use is the Word or WordPerfect macro language.

I do not believe that a person with the knowledge level you have
described is going to succeed at the task you have set for him or her.
Entities are going to kill them. Whitespace in end-tags is going to
toast them. CDATA sections are going to confuse them. Elements (and
tags!) broken across lines are going to destroy them. This person can
only succeed if

a) the data is already normalized, probably due to a corporate standard
   such as the one you mention.
b) they download a normalizer.

If I am wrong, it would be easy to prove me so. All someone has to do
is provide a regular expression that can (for instance) change all
occurrences of the GI "FOO" into "BAR" in any XML document corresponding
to a DTD of their choice (but which I can extend in the internal
subset). On the other hand, I can do this *trivially* in a regular
expression on data that has been normalized.

> SGML gives you the option of using empty end tags, and the
> historical fact is that most large users, given this option and a
> sufficient amount of experience with it, choose not to use it.

These "large users" have expensive SGML editors that they have paid
someone thousands of dollars to customize to perfection. Under those
conditions, I would legislate redundancy also -- not just fully expanded
end-tags, but probably redundant IDs in comments of end-tags, public
identifiers on all entity declarations, perhaps even unique identifiers
on all elements. But XML is about a different world than that.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or not.
But I do not accept that a writer has a greater obligation to society
than a musician or a mason or a teacher. Everyone has a citizen's
commitment."
 - Wole Soyinka, Africa's first Nobel Laureate

From schampeo at hesketh.com Sat May 16 22:16:16 1998
From: schampeo at hesketh.com (Steven Champeon)
Date: Mon Jun 7 17:01:23 2004
Subject: A little wish for short end tags
In-Reply-To: <355DF024.FF2557F2@technologist.com>
References: <199805161649.JAA27414@boethius.eng.sun.com>
Message-ID: <3.0.5.32.19980516161020.00995ea0@wasabi.hesketh.com>

At 03:59 PM 5/16/98 -0400, Paul Prescod graced us with:
> I do not believe that a person with the knowledge level you have described
> is going to succeed at the task you have set for him or her.

The tasks and knowledge level Jon describes are exactly what I had when
I first started doing SGML back in 1993. I knew a bit about what the
specific tagsets I was using could contain, I'd picked up the regex
search/replace then supplied with my editor, SoftQuad's Author/Editor,
and I'd started to maintain a few Perl scripts which were designed to
convert text files (the output from our proprietary workflow conversion
system) into SGML.

It wasn't uncommon for me to use the following sort of ugliness, which
I inherited from a programmer before me:

    $/=""; $*=1; # slurp mode
    while(<>) {
      s/$absurdly_long_regular_expression/what_ought_to_go_there/g;
      # ... repeat one per line for fifty or more regexes
    }

I had regexes which were longer than pico could handle, so I used Sun
textedit, which wrapped them onscreen.
Eventually, I learned emacs and vi and discovered the pure joy of the
UNIX command line, and picked up a few more Perl tricks, like formatted
code, comments, and not using absurdly long regular expressions in a
multiple-pass global search and replace in order to mark up incomplete
text files. ;) I saved myself countless hours of manual tagging in A/E
this way.

It'd make your hair stand on end to see some of the scripts I used in
my daily work, but it made my life easier and provided a break from the
tedium of hand-tagging, and was also a challenge.

Try to remember that not everyone has time or the background to absorb
intricacies, and that perfection falls a far second after getting the
job done in almost every context outside the realm of pure thought.

S

--
"All the good geek things,       schampeo@hesketh.com
 only without all the            http://a.jaundicedeye.com
 bad geek things."               http://hesketh.com/schampeo/

From papresco at technologist.com Sun May 17 04:20:56 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:23 2004
Subject: A little wish for short end tags
References: <199805161649.JAA27414@boethius.eng.sun.com> <3.0.5.32.19980516161020.00995ea0@wasabi.hesketh.com>
Message-ID: <355E49BE.CB166F4A@technologist.com>

Steven Champeon wrote:
>
> The tasks and knowledge level Jon describes are exactly what I had when I
> first started doing SGML back in 1993.

So what you're saying is that this sort of thing can work, even with
all of the minimization features of SGML, because you knew the general
layout of your data and/or you had a tool that could normalize the
weird stuff for you.
You succeeded at your task because you approached it (perhaps
unknowingly) with either:

a) the right data set: more or less already normalized SGML or
b) the right tool -- a normalizer: AE.

That's exactly what I've been saying also. If you are going to do
regular expression hacking on XML it had better have been already
marked-up in some corporate standard (which would probably exclude
short end-tags, confusing entities, confusing whitespace, confusing
newlines, etc.) or you should have a tool that can normalize it for you.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or
not. But I do not accept that a writer has a greater obligation to
society than a musician or a mason or a teacher. Everyone has a
citizen's commitment."
 - Wole Soyinka, Africa's first Nobel Laureate

From papresco at technologist.com Sun May 17 10:11:15 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:24 2004
Subject: Clarification: (was: A little wish...)
References: <355DA41C.E99FF4AE@technologist.com>
Message-ID: <355E9BEB.2206DDFA@technologist.com>

Paul Prescod wrote:
>
> good...we always need some people to argue against features). The XML
> working group is full of people who have the ability to make decisions on
> a case by case basis. I think it is an insult to them to propose
> otherwise.

I did not mean that Liam intended to insult the working group. After
all, Eliot Kimber made the same argument, and he is on the group.
It just seems to me that it is an inadvertent insult, as if I said to a friend: "You better not start eating those shrimp, you'll never be able to stop." If they are not a chronic over-eater or shrimp-addict, then it would be strange to suggest that in *this one case* they are likely to not know when they are full. My friend would probably come to the conclusion that I thought that he did not know what was best for himself.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment."
- Wole Soyinka, Africa's first Nobel Laureate

From cbullard at hiwaay.net Sun May 17 10:39:46 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:01:24 2004
Subject: A little wish for short end tags
References: <199805161649.JAA27414@boethius.eng.sun.com>
Message-ID: <355EA1E9.55A9@hiwaay.net>

Jon Bosak wrote:
>
> SGML gives you the option of using empty end tags, and the
> historical fact is that most large users, given this option and a
> sufficient amount of experience with it, choose not to use it.

Jon is right. I found it easier to read when the SGML instances were large and the designer used a lot of content types in deeper hierarchies than one gets used to with HTML. For DTDs like the MIL Content Data Model, or the IADS DTDs where the indexes are named types mapped to stylesheet tables, one often wants to reach in and grab pieces of the tree easily.
A good editor helps, but a lot of people won't have what they really need unless some serious price reductions happen in that market. Moving elements with highlighting is easier with the end tags, IMHO, too, because when one hits the end of a deep branch, a lot of grouped short end tags are hard to count by eye.

SGML: people always say no one will edit it by hand, and almost everyone does.

len bullard

From Philippe.Le_Hegaret at sophia.inria.fr Sun May 17 18:32:26 1998
From: Philippe.Le_Hegaret at sophia.inria.fr (Philippe Le Hégaret)
Date: Mon Jun 7 17:01:24 2004
Subject: SAX 1.0 : handleData and processingInstruction
Message-ID: <355F110F.E39AD593@sophia.inria.fr>

characters takes an array of characters for the data :

public abstract void characters(char ch[],
                                int start,
                                int length) throws SAXException

processingInstruction takes a String for the data :

public abstract void processingInstruction(String target,
                                           String data) throws SAXException

Why is there this difference in SAX ?

Philippe.
---------
Philippe Le Hegaret Philippe.Le_Hegaret@sophia.inria.fr -- http://www.inria.fr/koala/plh/
KOALA/DYADE/BULL @ INRIA (Stagiaire) - Sophia Antipolis
From schampeo at hesketh.com Sun May 17 18:38:45 1998
From: schampeo at hesketh.com (Steven Champeon)
Date: Mon Jun 7 17:01:24 2004
Subject: A little wish for short end tags
In-Reply-To: <355E49BE.CB166F4A@technologist.com>
References: <199805161649.JAA27414@boethius.eng.sun.com> <3.0.5.32.19980516161020.00995ea0@wasabi.hesketh.com>
Message-ID: <3.0.5.32.19980517123239.00afcde0@wasabi.hesketh.com>

At 10:21 PM 5/16/98 -0400, Paul Prescod graced us with:
> So what you're saying is that this sort of thing can work, even with all
> of the minimization features of SGML, because you knew the general layout
> of your data and/or you had a tool that could normalize the weird stuff
> for you. You succeeded at your task because you approached it (perhaps
> unknowingly) with either:
>
> a) the right data set: more or less already normalized SGML or
> b) the right tool -- a normalizer: AE.

Um, no. I'm sorry - you didn't read what I said. I didn't say that I had been running my SGML through A/E; I had been running Perl scripts on text files that were output from a conversion system whose input was irregular at best - "this page intentionally left blank", typos, and so forth were the order of the day.

This previous sentence may be translated as "the input was not normalized". Or did I not make this abundantly clear? If so, my apologies. I should have used terms more appropriate to your dialog, like "non-normalized", "poorly ordered data sets" or some such nonsense - which, if I understand Jon Bosak correctly, the average numbnut with Perl and a job to do wouldn't know.
I never thought I'd ever show anyone this crud, but in the interests of keeping XML accessible, here's an extremely horrid script I wrote, based on other horrid scripts I inherited. I knew nothing about Perl at the time, having only a passing understanding of the following while() construct, and my only real useful knowledge was of regular expressions, which I'd taught myself using A/E's search/replace function.

The script below fixes a number of problems with the output from another filter, whose input was the text output from a proprietary conversion system. Illustrated,

Garbage on Paper -> Garbage in Electronic Format -> ASCII Trash -> SGML

This script was the last in the pipeline. The references to 'paragrapgh' and 'loping' below are jokes, based on a misspelling in the original script I rec'd from the programmer :)

#!/aie/newgateway/tools/perl

# enable paragrapgh mode, multiline mode
$/ = ""; $* =1;

# get rid of hard returns, spaces before and after tags
# add an "a" to board no. Take out endpara-beginpara in
# emphasis tagset. gets deg& right. gets rid of empty tagsets.
# puts a space after end sup/subscrpt and end emphasis tags.
# joins seqlists that ought to be joined. fixes numstyle attrib.
# rejoins paras broken by page or hyphen.
# substitutes para0 title tags for emphasis tags found in paras.
# puts figures on the outside of notes, warnings, and cautions.
# Attempts to fix the seqlist nesting problem.
# gets nested emphasis tags to titles of para0s.

# start loping
while (<>) {

# remove para tags from within emphasis tags.
s:(.*)\n:\1\n:g;

# join hyphenated lines, take out hardreturns, replace multispaces with one space.
# s:-\n::g;
s/\n+/ /g;
s/( )+/\1/g;

# removes spaces after and before tags.
s/> />/g;
s/ ::g;
s:::g;
s:::g;
s:::g;
s:::g;
s:::g;
s:::g;

# adds space after emphasis and s-script tags.
s:():\1 :g;
s:(): \1:g;
s:():\1 :g;

# gets the entities right.
s:&degree;:&deg;:g;
s:&ree;:&:g;
s:@reg;:&reg;:g;

# joins seqlists broken by paras.
# s:::g;

# dehyphenates mistakenly broken paras.
s:-?([a-z]):\1:g;

# puts seqlists inside the previous para.
s#(:|.)[\032]*()#\1\2#g;

# turns emphasis tags just inside paras into titles of the previous para0.
# works on nested emphasis tags as well, to two levels.
s#([0-z \-/]*)\. #\2.#g;
s#([0-z \-/]*)\. #\2.#g;

# puts in numstyle attrib. in seqlists.
if(/<\/ITEM>[^<>\/]*<\/PARA><\/ITEM>[^<>/]*)<\/NOTE>/) {
    s:([^<>/]*)([^<>/]*)+():\1\3\2:g;
}
if(/<\/FIGURE><\/CAUTION>/) {
    s:([^<>/]*)([^<>/]*)+():\1\3\2:g;
}
if(/<\/FIGURE><\/WARNING>/) {
    s:([^<>/]*)([^<>/]*)+():\1\3\2:g;
}

# puts untagged FIGURE inside tags.
if(/FIGURE [0-9 \-\.]* [^<>]*<\/PARA>/) {
    s:FIGURE ([0-z\-\.]*\.) ?([^<>]*):\2:g;
}

# check for FIGURES with part of the label inside the title.
if(/[0-9\.]+ /) {
    s:(<FIGURE LABEL=")([^"]*)("><TITLE>)([0-9\.]+) :\1\2\4\3:g;
}

# puts cautions, warnings, notes inside the previous item tag.
if(/<\/PARA><\/ITEM><\/SEQLIST><\/PARA><NOTE>/) {
    s:(</ITEM></SEQLIST></PARA>)(<NOTE><PARA>[^<>]*</PARA></NOTE>):\2\1:g;
}
if(/<\/PARA><\/ITEM><\/SEQLIST><\/PARA><WARNING>/) {
    s:(</ITEM></SEQLIST></PARA>)(<WARNING><PARA>[^<>]*</PARA></WARNING>):\2\1:g;
}
if(/<\/PARA><\/ITEM><\/SEQLIST><\/PARA><CAUTION>/) {
    s:(</ITEM></SEQLIST></PARA>)(<CAUTION><PARA>[^<>]*</PARA></CAUTION>):\2\1:g;
}
if(/"><PARA>[A-Z \-\/\.]*\. /) {
    s:(">)(<PARA>)([A-Z \-\/\.]*\.) :\1<TITLE>\3\2:g;
}

# joins seqlists that were created by bad tagging of cautions, warnings, and notes.
if(/<\/CAUTION><\/ITEM><\/SEQLIST><\/PARA>):\1:g; }
if(/<\/NOTE><\/ITEM><\/SEQLIST><\/PARA>):\1:g; }
if(/<\/WARNING><\/ITEM><\/SEQLIST><\/PARA>):\1:g; }

# fix erring para0s (figures, lb-in.)
s:(lb in.):\1 \2:g;
s:(lb in.):\1 \2:g;
s:(lb in.):\1 \2:g;
s:(lb in.):\1 \2:g;
s:figure:figureS \1:g;
s:figure:figureS \1:g;
s:figure:figureS \1:g;
s:figure:figureS \1:g;
s:figure:figureS \1:g;

# change the dtd header from docgasturb to dcgastep -- for meter books
s:\[:[]>:;
s:([^YH])*::;

print;
}

The bitch of it is, this script worked fairly well and saved us enormous amounts of time. I hope that this ugliness demonstrates that Perl and SGML-like text files can allow a complete neophyte to do wonders, and that any hifalutin changes to the relative simplicity of the current XML spec would be detrimental to the average 'frustrated perl programmer'.

Steve (I'm so ashamed)

--
"All the good geek things,     schampeo@hesketh.com
 only without all the          http://a.jaundicedeye.com
 bad geek things."             http://hesketh.com/schampeo/

From papresco at technologist.com Sun May 17 21:39:29 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:24 2004
Subject: A little wish for short end tags
References: <199805161649.JAA27414@boethius.eng.sun.com> <3.0.5.32.19980516161020.00995ea0@wasabi.hesketh.com> <3.0.5.32.19980517123239.00afcde0@wasabi.hesketh.com>
Message-ID: <355F3D36.AADDE804@technologist.com>

Sorry, Steven. I still don't understand. The output of the scripts was irregular, but it didn't use any of SGML's hundreds of "hard" features like entities, whitespace in tags etc. The data may not have been normalized in any formal sense, but it used a predictable set of SGML's easier features. This is "good enough" to allow processing with regexps and basically partial normalization. If the data had used entities, whitespace in tags, comments etc., then I would say it was completely unnormalized.
But anyhow, I don't understand how you can point to your *success* at taming a system built around a language with hundreds of minimizations as proof that one of those minimizations should not be allowed in another language. Had you failed, solely because of short end-tags, that would have been a persuasive argument. My belief is that that would not happen, because there are so many other factors. Most data falls either into the category of predictable and simple (which yours seems to have) or unpredictable and requiring normalization (which most data authored by people with text editors will look like -- short end-tags or not).

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment."
- Wole Soyinka, Africa's first Nobel Laureate

From digitome at iol.ie Mon May 18 12:58:20 1998
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 17:01:24 2004
Subject: A little wish for short end tags
Message-ID: <199805181058.LAA12969@GPO.iol.ie>

[Jon Bosak]
>If you really like using shortcuts, then go all the way: get a genuine
>SGML tool and define a DTD that allows not just end-tag minimization
>but full omission of both start-tags *and* end-tags. Knock yourself
>out. Just make sure to normalize the result before you call it XML
>and ship it out to the rest of us to work with.

I just had to reply and express my wholehearted agreement with Jon's posting.
SGML and SGML power tools can make excellent XML production systems. James Clark's SGML to XML tool - SX - for example is built on top of the core SP library and thus you get basically fully blown SGML power. All the minimization power you can shake a stick at.

Sean

Sean Mc Grath http://www.digitome.com/sean.htm
County Sligo, Ireland, Tel: +353 96 47391

From tyler at infinet.com Mon May 18 14:32:45 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:01:24 2004
Subject: question: NodeIterator in DOM
References: <199805151519.RAA02352@hs2114.wdf.sap-ag.de>
Message-ID: <35602AC2.60993E3A@infinet.com>

Akitoshi Yoshida wrote:

> Hi,
> Another question regarding NodeIterator:
> You have an NodeIterator instance with the iterator
> position set somewhere in the middle. When you
> remove three nodes: first, the node just after the
> iterator position, second the node after this removed
> node, and finally the node just before
> the original iterator position,
> then what should toNextNode() and toPrevNode() return?

Essentially a similar problem occurs in the JDK 1.2 Collection classes. What JavaSoft suggests is that for all Iterators (a new interface in the JDK 1.2 Collection classes) you synchronize on the object being iterated so that no internal changes to the object's state are made while iterating. Since DOM is supposed to be architecture- and language-neutral, languages which lack synchronization support may find some difficulty in making sure that an object which returns an Iterator object is not mutated while iteration is occurring in a multithreaded environment.
If your application is single threaded, you should not have to worry about the object (such as a List) that returns the Iterator being mutated, except through a programming error where you add or remove objects on that object while iterating.

Sorry to be so convoluted here but I got 3 hours of sleep last night so I hope the above made sense.

Tyler

From Amitr at abinfosys.com Mon May 18 16:42:46 1998
From: Amitr at abinfosys.com (Amit)
Date: Mon Jun 7 17:01:24 2004
Subject: Loading an External DTD for an XML using MSXML
Message-ID: <9805181515.AB25790@del2.vsnl.net.in>

Hello All,

I am invoking the MSXML java parser through the XMLDSO class which I am embedding in the tag in my HTML page.

'' + '' + '';

In my tag itself I am loading the XML file (magic.xml). But I also want to call the DTD associated with the magic.xml file so in the beginning of my XML(magic.xml) I gave

But, now when I run my HTML page which embeds the MSXML in the tag the MSXML throws the following exception :-

"Cannot find External DTD Magic.dtd"

(The Magic.dtd file is present and is in the same directory as magic.xml)

Why is MSXML not able to access the external DTD (Magic.dtd)?

Thanx in advance,
Regards,
AMIT
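Tyler Baker's advice on iterators, in his reply on NodeIterator above, can be sketched in Java. This is only an illustration under stated assumptions: it uses the java.util collections rather than the DOM NodeIterator the question is about, it is written in present-day Java (generics did not exist in JDK 1.2), and the class and method names (IterationSketch, sumLocked, removeOdd) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class IterationSketch {

    /** Sums the list while holding its lock, so no other thread can mutate it mid-iteration. */
    static int sumLocked(List<Integer> shared) {
        int sum = 0;
        synchronized (shared) {              // synchronize on the object being iterated
            for (Iterator<Integer> it = shared.iterator(); it.hasNext(); ) {
                sum += it.next();
            }
        }
        return sum;
    }

    /** Single-threaded case: mutate only through the iterator that is doing the walking. */
    static void removeOdd(List<Integer> list) {
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 != 0) {
                it.remove();                 // safe; calling list.remove(...) here would not be
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> shared =
            Collections.synchronizedList(new ArrayList<>(Arrays.asList(1, 2, 3, 4)));
        System.out.println(sumLocked(shared));
        removeOdd(shared);
        System.out.println(shared);
    }
}
```

The lock only helps if every writer takes the same lock, which is the part a language-neutral API such as the DOM cannot promise.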
From larsga at ifi.uio.no Mon May 18 17:42:00 1998
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:01:24 2004
Subject: SAX 1.0 : handleData and processingInstruction
In-Reply-To: <355F110F.E39AD593@sophia.inria.fr>
References: <355F110F.E39AD593@sophia.inria.fr>
Message-ID:

* Philippe Le Hégaret
|
| characters takes an array of characters for the data :
|
| public abstract void characters(char ch[],
|                                 int start,
|                                 int length) throws SAXException
|
| processingInstruction takes a String for the data :
|
| public abstract void processingInstruction(String target,
|                                            String data) throws SAXException
|
| Why is there this difference in SAX ?

I wasn't "present" when this decision was made, but I'd guess characters is the way it is because this way is faster in most implementations, since most parsers keep a character buffer during parsing and this avoids unnecessary buffer copying.

However, doing it this way is obviously less convenient for the user, who would prefer to just get a String and be done with it. I guess this is why the current form of processingInstruction was chosen, since processing instructions rarely occur in documents and thus performance is less of an issue.

See also:

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock
http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/
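The trade-off Lars describes can be sketched in Java. This is a minimal illustration, not code from the thread: it assumes the later SAX2 DefaultHandler that ships with the JDK (the thread concerns SAX 1.0, whose characters and processingInstruction signatures have the same shape), and the names SaxCallbackSketch and Collector are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCallbackSketch {

    /** Collects character data and processing instructions from one document. */
    static class Collector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        final StringBuilder pis = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            // Only the slice [start, start + length) belongs to this event;
            // the rest of the array is the parser's own buffer.
            text.append(ch, start, length);
        }

        @Override
        public void processingInstruction(String target, String data) {
            // Delivered as ready-made Strings: convenient, at the cost of a copy.
            pis.append(target).append('=').append(data).append(';');
        }
    }

    /** Parses an in-memory document and returns what the callbacks saw. */
    static Collector parse(String xml) throws Exception {
        Collector collector = new Collector();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), collector);
        return collector;
    }

    public static void main(String[] args) throws Exception {
        Collector c = parse("<doc><?render fast?>Hello, <b>SAX</b>!</doc>");
        System.out.println(c.text);
        System.out.println(c.pis);
    }
}
```

A parser is free to deliver one run of text as several characters calls, so appending each slice to a buffer is the usual pattern. Building a new String per event would reintroduce the copy that the array form is there to avoid.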
From syost at rational.com Mon May 18 21:50:18 1998
From: syost at rational.com (Steve Yost)
Date: Mon Jun 7 17:01:24 2004
Subject: Diff/Merge tools?
Message-ID: <00a901bd8296$38fb71f0$ad395ec7@ratskellar.atria.com>

Is anyone aware of existing or in-development diff/merge tools, GUI-based or command-line based, for XML? I'd like to make use of one if it exists, and possibly build one otherwise.

-Steve Yost

From peter at ursus.demon.co.uk Mon May 18 22:32:30 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:25 2004
Subject: do(duration){display video}
In-Reply-To: <3.0.32.19980514213837.015e6bb0@interlog.com>
Message-ID: <3.0.1.16.19980518211743.28973dec@pop3.demon.co.uk>

At 21:38 14/05/98 -0400, Rolande Kendal wrote:
>Still waiting for Jumbo2...

RSN - my modem crashed. (Lightning hit it). I hacked a simple distribution today. I had hoped to get it out on the Net before Paris but shall probably have to wait till next weekend. Have tested it on a 10,000 element file (600 Kbytes) and it runs OK on my 166/32Mbytes. I have some minor problems on largish files - DXP seems to run out of memory (does it build a tree?) and I have a problem with another parser.

P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Mon May 18 22:34:47 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:25 2004
Subject: Xlink semantics
In-Reply-To: <3.0.5.32.19980515120249.00a257b0@village.promanage-inc.com >
References: <98May14.012336edt.26885@thicket.arbortext.com> <3.0.1.16.19980513204547.5bdf6c46@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19980518210830.2a5f7e80@pop3.demon.co.uk>

At 12:02 15/05/98 -0400, Eve L. Maler wrote:
>This is a really good issue. Obviously, the interactions among the
>XLink-aware elements are underspecified! I agree with Masatomo Goto's
>interpretation of the various element configurations. A few other comments
>below...

This is most helpful. After thinking about Masatomo Goto's reply I realised that some of the things I had thought were semantically incorrect were useful. I have been thinking that we need something rather like an API - or at least a guide - for people writing link processor tools. I will try to work out what is - and what isn't - allowed in the spec and maybe if we meet in Paris have a discussion.

BTW I think XLink is incredibly powerful and if we can develop abstract processing machinery will revolutionise a lot of what I and others want to do. My personal prejudice will be to look at XLink first before other ways of solving problems of inheritance, relations, networks, etc.

P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Mon May 18 22:38:20 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:25 2004
Subject: SAX: White space after last element
In-Reply-To: <199805141242.IAA00235@unready.microstar.com>
References: <000a01bd7f24$02de1160$1e09e391@mhklaptop.bra01.icl.co.uk> <000a01bd7f24$02de1160$1e09e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <3.0.1.16.19980518210036.304749ea@pop3.demon.co.uk>

At 08:42 14/05/98 -0400, David Megginson wrote:
>SAX is very new, and I am very tired, after a difficult week (of which

Sympathies. I know what it's like. Came back expecting a few hours before moving on only to find that lightning had blown the modem. Bought a new one. Spent half the night trying to make it work. No luck. Had to swop it the next day. Exhausted.

>SAX 1.0 was only a tiny part). When I am better rested, I will take
>some time to test the different SAX implementations and to work with
>the authors to resolve any problems, or to take suggestions for
>clarifying the interface. While Java interfaces and JavaDoc comments
>are useful, they are no substitute for a proper written specification,
>which I owe to all of you as soon as I can manage it.

This is a novel use of the word 'owe'. We owe you a great deal. I was talking to some people in real life and they were saying how much they appreciated SAX because it provided just what they wanted.
Documentation is never fun while you are writing it, but there is a little bit of pleasure when you have finished it and can - metaphorically - put it in a binder.

P.

>
>
>
>All the best,
>
>
>David
>
>--
>David Megginson david@megginson.com
> http://www.megginson.com/
>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From cfranks at microsoft.com Tue May 19 01:54:43 1998
From: cfranks at microsoft.com (Charles Frankston)
Date: Mon Jun 7 17:01:25 2004
Subject: parser for xml-data?
Message-ID:

rbourret@dvs1.informatik.tu-darmstadt.de wrote:

> One possibility is that something in your XML document, such as an attribute at
> the root, would refer to the XML document containing the XML-Data definition of
> your grammar:
>
>
> ...
>
> Another (uglier) possibility is that you use namespaces: the XML-Data namespace
> and the namespace your XML-Data data defines. I haven't looked enough at either
> the namespaces or XML-Data specs to be sure how this would work, but it seems
> the object structure might be something like:
>
> ...
>

Namespaces are intended to do what you're asking for.
I.e.:

I don't see why using a standardized solution is uglier than inventing your own namespace tag. This does require you to namespace qualify your instance information. I.e. the tags that come from the urn:mycompany:MyRootSchema would be something like .

From cfranks at microsoft.com Tue May 19 03:50:41 1998
From: cfranks at microsoft.com (Charles Frankston)
Date: Mon Jun 7 17:01:25 2004
Subject: parser for xml-data?
Message-ID:

> -----Original Message-----
> From: Rick Jelliffe [mailto:ricko@allette.com.au]
> Sent: Friday, May 08, 1998 5:58 PM
> To: xml-dev@ic.ac.uk
> Subject: Re: parser for xml-data?
>
>
> From: Ron Bourret
>
> > The only major difference I have found so far for XML is that the elements in
> > XML documents are ordered, while data members in OO programming languages are
> > not.
>
> There is an error in the (January 5?) XML-data report, in the very first example.
> It gives the clear impression that XML-data does not constrain sequence for
> the element types it describes. In the example, the declarations for the element
> types which can appear as the content of an element type are given in one
> order, but the instance has them in another order. (A Microsoft representative
> pointed this out to me: I don't know why they haven't just reissued the note, since it
> is a fairly critical point for implementors.)

Well, it's not actually a note, Rick, it's a submission to the W3C. But yes, the errors should be corrected, presumably by a re-submission. There are more, mostly less serious, errors and typos that should be corrected as well.
>
> Also note that the usage of ISO 8601 date formats seems to be wrong.
> ISO 8601 date format is yyyy-mm-dd, e.g. 1998-05-09, and not 19980509,
> last time I looked.

Both 1998-05-09 and 19980509 are legal in ISO 8601 (there's a "full" and a "basic" format, or something like that). However, my current inclination is always to use the full form, i.e. 1998-05-09, as per Misha Wolf's and Charles Wickstead's note: http://www.w3c.org/TR/NOTE-datetime-970915.html.

>
> If anyone is thinking of implementing XML-data, I suggest they befriend the
> authors, because the report misses out on several key issues. (I have previously
> mentioned that it does not seem to make clear whether you can have an
> XML-data schema as part of a document, or whether it must be external. If it
> is internal, can it describe the document's root element? I suppose a close
> reading of the XML-data text might help, but it is not clear to me after
> dozens of readings, but I do not claim to be particularly brilliant in this
> area.)

Befriending the authors is always a good idea :-), as is allowing schema information in a document instance. I think the next revision should try to define this.

From murata at apsdc.ksp.fujixerox.co.jp Tue May 19 08:50:53 1998
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun 7 17:01:25 2004
Subject: Example XML documents in Japanese
Message-ID: <199805190652.AA01126@murata.apsdc.ksp.fujixerox.co.jp>

Example XML documents in Japanese are available. They are encoded in UTF-16 (big endian and little endian), UTF-8, iso-2022-jp, shift_jis, and euc-jp.
One document is the translation of the XML PR, and it will soon be replaced with that of the XML recommendation. The other document is my weekly report and its DTD is also available. This document uses a number of element type names in Japanese.

The URL is: http://www.fxis.co.jp/DMS/sgml/xml/charset/xml-japan.zip

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

From rbourret at dvs1.informatik.tu-darmstadt.de Tue May 19 10:44:05 1998
From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret)
Date: Mon Jun 7 17:01:25 2004
Subject: parser for xml-data?
Message-ID: <199805190833.KAA09534@berlin.dvs1.tu-darmstadt.de>

Charles Frankston wrote:

> rbourret@dvs1.informatik.tu-darmstadt.de wrote:
>
> > One possibility is that something in your XML document, such as an attribute at
> > the root, would refer to the XML document containing the XML-Data definition of
> > your grammar:
> >
> >
> > ...
> >
> > Another (uglier) possibility is that you use namespaces: the XML-Data namespace
> > and the namespace your XML-Data data defines. I haven't looked enough at either
> > the namespaces or XML-Data specs to be sure how this would work, but it seems
> > the object structure might be something like:
> >
> > ...
> >
>
> Namespaces are intended to do what you're asking for. I.e.:
>
> prefix="myschema" src="http://something/MyRootSchema.xml"?>
>
> I don't see why using a standardized solution is uglier than inventing your
> own namespace tag. This does require you to namespace qualify your instance
> information. I.e.
> the tags that come from the urn:mycompany:MyRootSchema
> would be something like .

It looked ugly to me not because of using namespaces, but because I couldn't figure out which namespace owned the root tag. My first guess was something like this, which I think is a bit ugly, not to mention invalid:

Root object XML-Data root XML-data data... Your data root Your data...

I hadn't thought of your solution because I assumed namespace declarations pointed to DTDs, not to any general schema mechanism.

Several other questions / comments about namespaces:

1) How do you use multiple namespaces in a valid document? That is, if you have two separate DTDs (schemas), neither of which references elements in the other, how do you build a single valid document with both of them? Elements from the first DTD can't nest inside elements from the second DTD (because they aren't in the second DTD's grammar) and vice versa. The example in section 3.1 of the namespaces spec is well-formed, but the spec doesn't explain how it can be valid. Presumably, it doesn't match any of the DTDs presented as namespaces.

2) The src attribute in your namespace declaration does not point to a DTD; it points to an XML-Data file. While the namespace spec does not prohibit this, I had simply assumed that the schema would be a DTD. It would be nice if the namespace spec clarified that it does not impose any rules on the format of a namespace schema. This is important for validating parsers, as it means that namespace declarations are dependent on the parser's ability to read the particular schema format that is used. (And if a parser can read multiple schema formats, how does it know which one to use?)

3) Why is production [1] in the namespace spec:

[1] NamespacePI ::= ''

instead of:

[1] NamespacePI ::= ''

Is the ambiguity of the production, which needs to be qualified with the Required Parts constraint, worth the flexibility in the order of PrefixDef, NSDef, and SrcDef? My opinion is no.
-- Ron Bourret

From jjaakkol at cs.Helsinki.FI Tue May 19 16:46:14 1998
From: jjaakkol at cs.Helsinki.FI (Jani Jaakkola)
Date: Mon Jun 7 17:01:25 2004
Subject: Report on the elimination of SGML AND groups (fwd)
Message-ID:

I'm sure that there are people on this list who might find this report interesting. However, be prepared for some theoretically oriented computer science. You have been warned.

---------- Forwarded message ----------
Date: Tue, 19 May 1998 12:44:54 +0300 (EET DST)
From: Pekka Kilpelainen
To: Anne.Brueggemann-Klein@informatik.tu-muenchen.de, dwood@cs.ust.hk, cmsmcq@tigger.cc.uic.edu, marcy@world.std.com, ak117@freenet.carleton.ca
Cc: Helena.Ahonen@cs.helsinki.fi, Barbara.Heikkinen@cs.helsinki.fi, Oskari.Heinonen@cs.helsinki.fi, Jani.Jaakkola@cs.helsinki.fi, Pekka.Kilpelainen@cs.helsinki.fi, Greger.Linden@cs.helsinki.fi, Jyrki.Niemi@cs.helsinki.fi, Kimmo.Paasiala@cs.helsinki.fi
Subject: Report on the elimination of SGML AND groups

Dear colleagues,

FYI: I have written a report in which I analyze the possibilities and the lengthening effect of replacing AND groups of SGML content models by equivalent XML model groups. The report is available as gnu-zipped Postscript through its abstract page at the address http://www.cs.helsinki.fi/~kilpelai/C-1998-12.html . I include at the bottom the abstract of the report.

I would be thankful for any comments.

Yours, Pekka Kilpelainen

Pekka Kilpelainen, University of Helsinki, Dept.
of Computer Science
Email: Pekka.Kilpelainen@cs.helsinki.fi
phone: +358 9 7084 4227, fax: +358 9 7084 4441
http://www.cs.helsinki.fi/~kilpelai

---------------------------

SGML & XML Content Models
Pekka Kilpeläinen
University of Helsinki, Department of Computer Science
Report C-1998-12, May 1998
16 pages
http://www.cs.helsinki.fi/TR/C-1998-12/

The SGML and XML standards use a variation of regular expressions called content models for modeling the markup structures of document elements. SGML content models may include so called AND groups, which are excluded from XML. An AND group, which is a sequence of subexpressions separated by an &-operator, denotes the sequential catenation of its subexpressions in any possible order. If one wants to shift from SGML to XML in document production, one has to translate SGML content models to corresponding XML content models.

The allowed content models in both SGML and XML are restricted by a requirement of determinism, which means that a parser recognizing document element contents has to be able to decide without lookahead, which content model token to match with the current input token, while processing the document from left to right. It is known that not all SGML content models can be expressed as an equivalent XML content model. It is also known that transforming an SGML content model into an equivalent XML content model may cause an exponential growth in the length of the content model.

We discuss methods of eliminating AND groups and analyze the circumstances where they can be applied. We derive a tight bound of $e n!$ on the number of symbols in the result of eliminating an AND group of $n$ symbols, where $e = 2.71828...$ is the base of natural logarithms. We present the analysis in a pedagogical manner, emphasizing mathematical methods which are typical to the analysis of algorithms.
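To get a feel for the growth the abstract describes, here is a minimal Python sketch. It is not the report's construction: it uses the straightforward expansion in which every alternative begins with a different element name (which keeps the model deterministic), and it counts only element-name occurrences as "symbols". The function name `expand_and_group` and the element names are made up for illustration.

```python
import math


def expand_and_group(names):
    """Expand the AND group n1 & n2 & ... into an XML-style content model.

    Returns (model_string, number_of_element_name_occurrences).
    Each alternative starts with a distinct name, so a parser can pick the
    branch without lookahead. This is an illustrative expansion, not
    necessarily the one analyzed in the report.
    """
    if len(names) == 1:
        return names[0], 1
    branches = []
    total = 0
    for i, first in enumerate(names):
        rest = names[:i] + names[i + 1:]
        sub_model, sub_count = expand_and_group(rest)
        branches.append(f"({first},{sub_model})")
        total += 1 + sub_count  # one occurrence of `first` plus the tail
    return "(" + "|".join(branches) + ")", total


if __name__ == "__main__":
    for n in range(1, 7):
        names = [chr(ord("a") + k) for k in range(n)]
        _, count = expand_and_group(names)
        # Compare the measured size with e * n! from the abstract.
        print(n, count, round(math.e * math.factorial(n), 2))
```

Under these assumptions the count satisfies T(n) = n (1 + T(n-1)) with T(1) = 1, i.e. T(n) = n! (1/0! + 1/1! + ... + 1/(n-1)!), which stays just below $e n!$ and approaches it as $n$ grows. For three names the model already contains 15 name occurrences.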
We also show that minimal deterministic automata for recognizing an AND group of $n$ distinct element names contain $2^{n}$ states and $n 2^{n-1}$ transitions, excluding the failure state and transitions leading to it.

From jcupp at essc.psu.edu Tue May 19 17:03:21 1998
From: jcupp at essc.psu.edu (Jason R. Cupp)
Date: Mon Jun 7 17:01:25 2004
Subject: XML & Postmodernism, was: Separation of formatting...
Message-ID: <35618CB3.DA3035C8@essc.psu.edu>

Gregg Reynolds wrote:
>
> Personally I've come around to a pragmatist position on this, after some
> time as a radical Free The Text purist. It's just not possible to
> encode information without also encoding something about what we're
> supposed to do with it, any more than it's possible to draw a "real"
> line segment. But we can certainly do some very useful things by
> trying.
>
> An interesting paper on a similar topic is at
> http://www.sil.org/sgml/ohco1.html, "Refining Our Notion of What Text
> Really Is: The Problem of Overlapping Hierarchies."
> --

That's a great paper!

Was/is there any attempt to address these issues in XML? -- the idea that there is no unique logical privileged perspective as it relates to the encoding of a text (or at least there shouldn't be); that multiple hierarchical sub perspectives can exist orthogonally within a non-hierarchical perspective. The work of any parser would be to deconstruct whatever perspective he/she wished (through a stylesheet perhaps).

What immediately came to mind were namespaces:

Mr. Thurston J. Howell , III

Where P2,P1 are sub perspectives of P.
With a perspective aware parser, this no longer becomes mixed content. If you follow the logic through then there should really be no unique logical API for XML such as SAX or DOM...

-- Jason R. Cupp (jcupp@essc.psu.edu)
The Pennsylvania State University

From crism at ora.com Tue May 19 17:20:29 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun 7 17:01:25 2004
Subject: Empty End Tags Considered Confusing
Message-ID: <199805191516.LAA07380@ruby.ora.com>

I think it was xml-dev that was discussing this; the archive is down (Henry, are you listening?).

I was pretty agnostic on the matter; I won't use them, but can cope with them. But I got this message from O'Reilly's German office today. They're a pretty clued group of people, but a book rife with empty end-tags had them stumped:

------- Start of forwarded message -------
Date: Tue, 19 May 1998 12:27:56 +0200 (MET DST)
To: Chris Maden
From: ... <...@ora.de>
Subject: (SGML) end tags in Linux Device Drivers

Hi Chris,

in "Linux Device Drivers" the sgml files contain </> as end tags for everything instead of specific end tags (like , , etc.) Is this only because this book was written in SGML by the author or will we have this kind of end tag in future sgml files from FrameMaker->SGML conversion.

I ask because we have a simple Perl script which produces readable HTML files (for reviewers, proof readers etc.). It has worked fine for us during the last projects. With </> as an unspecific end tag we cannot use this script.
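As an aside to the forwarded message, the following Python sketch shows what a converter has to do once end tags no longer name their element: it must keep a stack of open elements and fill each `</>` from it. This is a hypothetical illustration, not the Perl script being discussed. It assumes every element is closed explicitly (no omitted end tags, no DTD-declared empty elements), that no attribute value contains `>`, and the element names in the usage note are invented.

```python
import re

# Matches start tags, named end tags, the empty end tag "</>", and anything
# else between angle brackets (comments, declarations), which is passed through.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)?([^<>]*)>")


def expand_empty_end_tags(text):
    """Replace each empty end tag </> with the name of the innermost open element."""
    stack = []   # names of currently open elements
    out = []
    pos = 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])
        pos = m.end()
        closing, name, rest = m.group(1), m.group(2), m.group(3)
        if closing:
            if not stack:
                raise ValueError("end tag with no open element")
            open_name = stack.pop()
            if name and name != open_name:
                raise ValueError(f"</{name}> does not close <{open_name}>")
            out.append(f"</{open_name}>")
        elif name:
            # XML-style <name/> opens and closes in one tag; do not push it.
            if not rest.rstrip().endswith("/"):
                stack.append(name)
            out.append(m.group(0))
        else:
            out.append(m.group(0))  # not an element tag; copy unchanged
    if stack:
        raise ValueError(f"unclosed elements: {stack}")
    out.append(text[pos:])
    return "".join(out)
```

For example, `expand_empty_end_tags("<chapter><title>Intro</><para>Hello <emphasis>world</></></>")` yields the same text with `</title>`, `</emphasis>`, `</para>` and `</chapter>` filled in. The stack is trivial here only because of the stated assumptions; real SGML with tag omission needs the DTD to know which elements are open.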
It's not a problem for our translation of Device Drivers (the typesetter can handle it) - I just would like to know which end tags will be used for the next projects (and if it would be worth the effort to update our Perl script).

Regards, ...
------- End of forwarded message -------

Now, the Perl script could have been updated, but to handle the general case (even in an XML+empty-end-tag world) would have added considerable complexity to what's currently a pretty simple script. They can use spam to get around this, or hack up a custom solution for this simpler case (which is what one of their people did). But it's confusing and even potentially dangerous, without a pretty large chunk of SGML knowledge (ref. _The SGML FAQ Book_; you need to know a lot to avoid *accidentally* using a feature).

I have to side with Jon Bosak et al. that empty end-tags are potentially a sizable amount of trouble, and that the perceived value is negligible in a compressed-transfer world.

-Chris
-- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>

From cfranks at microsoft.com Tue May 19 20:20:16 1998
From: cfranks at microsoft.com (Charles Frankston)
Date: Mon Jun 7 17:01:25 2004
Subject: parser for xml-data?
Message-ID:

> -----Original Message-----
> From: rbourret@dvs1.informatik.tu-darmstadt.de
> [mailto:rbourret@dvs1.informatik.tu-darmstadt.de]
> Sent: Tuesday, May 19, 1998 1:34 AM
> To: xml-dev@ic.ac.uk; Charles Frankston
> Subject: RE: parser for xml-data?

> 1) How do you use multiple namespaces in a valid document?
> That is, if you have two separate DTDs (schemas), neither of which references
> elements in the other, how do you build a single valid document with both of them?
> Elements from the first DTD can't nest inside elements from the second DTD
> (because they aren't in the second DTD's grammar) and vice versa. The example in
> section 3.1 of the namespaces spec is well-formed, but the spec doesn't explain
> how it can be valid. Presumably, it doesn't match any of the DTDs
> presented as namespaces.

DTDs are not well-equipped to handle namespaces. It can technically be done: for example, you could allow your outer DTD to have 'ANY' content. XML-Data schemas are designed to integrate with namespaces:

>
> 2) The src attribute in your namespace declaration does not point to a DTD; it
> points to an XML-Data file. While the namespace spec does not prohibit this, I
> had simply assumed that the schema would be a DTD. It would be nice if the
> namespace spec clarified that it does not impose any rules on the format of a
> namespace schema. This is important for validating parsers, as it means that
> namespace declarations are dependent on the parser's ability to read the
> particular schema format that is used. (And if a parser can read multiple
> schema formats, how does it know which one to use?)
>

Most of the XML-Data spec describes the rules of the XML-Data schema file. The schema happens to use XML instance syntax, rather than the separate grammar approach of DTDs. We think this is a big advantage -- you can use all the tools you have for editing XML instance data to edit XML-Data schemas.

> 3) Why is production [1] in the namespace spec:
>
> [1] NamespacePI ::= ' | SrcDef))+ '?>'
>
> instead of:
>
> [1] NamespacePI ::= ' (S SrcDef)? '?>'
>
> Is the ambiguity of the production, which needs to be qualified with the
> Required Parts constraint, worth the flexibility in the order of PrefixDef,
> NSDef, and SrcDef? My opinion is no.
The ns, prefix, and src parameters to a namespace PI look a lot like attributes (although they are not in a formal sense). Since attributes in XML do not have to be in a particular order, it would certainly be surprising for people to discover that attributes in a namespace have to be in a particular order. You're suggesting that the syntax be made harder to use in order to make the productions easier to author. I think this is a bad tradeoff.

From simpson at polaris.net Wed May 20 18:43:50 1998
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:01:25 2004
Subject: Comments, parsers, XPointers
In-Reply-To:
Message-ID: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net>

From previous (sometimes contentious) discussions on this list, I gather that there's no provision in SAX for passing document comments to a downstream application.

The XPointer WD, production 12, allows for relative addressing by node type (element, PI, and so on), *including* comments.

Does the downstream invisibility of comments imply that a URL like this will always fail?

somedoc.xml#child(1,#comment)

Thanks for any insights,
John

John E. Simpson      | It's no disgrace t'be poor,
simpson@polaris.net  | but it might as well be.
                     | -- "Kin" Hubbard

From crism at ora.com Wed May 20 19:27:41 1998
From: crism at ora.com (Chris Maden)
Date: Mon Jun 7 17:01:25 2004
Subject: Comments, parsers, XPointers
In-Reply-To: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net> (simpson@polaris.net)
Message-ID: <199805201726.NAA16393@ruby.ora.com>

[John E. Simpson]
> From previous (sometimes contentious) discussions on this list, I
> gather that there's no provision in SAX for passing document comments
> to a downstream application.
>
> The XPointer WD, production 12, allows for relative addressing by
> node type (element, PI, and so on), *including* comments.
>
> Does the downstream invisibility of comments imply that a URL like
> this will always fail?
> somedoc.xml#child(1,#comment)
>
> Thanks for any insights,

Insight: XML != SAX.

A purely SAX implementation will not see comments, so an XPointer implementation based on it will fail with that pointer. However, a different XML parser may still have the comments, and an XPointer based on it may well find them. A complete XPointer implementation would need to have access to the comments, though I'm not entirely sure what the purpose would be (except possibly for editing applications).

-Chris
-- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>

From ak117 at freenet.carleton.ca Wed May 20 19:32:37 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:01:26 2004
Subject: Comments, parsers, XPointers
In-Reply-To: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net>
References: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net>
Message-ID: <198001011834.NAA00512@unready.megginson.com>

John E. Simpson writes:
> From previous (sometimes contentious) discussions on this list, I gather
> that there's no provision in SAX for passing document comments to
> a downstream application.
>
> The XPointer WD, production 12, allows for relative addressing by node type
> (element, PI, and so on), *including* comments.
>
> Does the downstream invisibility of comments imply that a URL like this
> will always fail?
> somedoc.xml#child(1,#comment)

With SAX level 1, it does. I had a very interesting e-mail discussion with Eve and Steve about the whole issue of pointing at comments, and it will be interesting to see how that develops in the next draft.

The DOM includes the comments to support authoring tools, but I imagine that many (most?) DOM builders will simply discard the comments before constructing the tree (comments have no purpose on the production/browsing side).

In the longer term, what we need is an official definition of an XML information set, specifying (for example) that reporting comments is optional, while reporting the start and end of elements is required. Once such a beastie exists, many vexing questions about (and inconsistencies among) the DOM, XPointers, SAX, XSL, etc. will disappear.
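The difference between an event stream that drops comments and a tree that keeps them can be seen in a small Python sketch. It is only an analogy: Python's `xml.sax` is a SAX2-style binding rather than the Java SAX level 1 discussed in this thread, and `xml.dom.minidom` stands in for "a different XML parser" that retains comments. The sample document and the helper names `sax_events` and `dom_comments` are made up for illustration, and no XPointer resolution is attempted.

```python
import xml.sax
from xml.dom import minidom, Node

# Hypothetical sample document with one comment as the first child of the root.
DOC = b"<doc><!-- a note --><para>text</para></doc>"


class EventLog(xml.sax.ContentHandler):
    """Record the events a plain ContentHandler receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def sax_events(data):
    """Parse with SAX and return the recorded events (no comment events exist)."""
    log = EventLog()
    xml.sax.parseString(data, log)
    return log.events


def dom_comments(data):
    """Build a DOM tree and collect the text of every comment node in it."""
    found = []

    def walk(node):
        if node.nodeType == Node.COMMENT_NODE:
            found.append(node.data)
        for child in node.childNodes:
            walk(child)

    walk(minidom.parseString(data))
    return found


if __name__ == "__main__":
    print(sax_events(DOC))    # element and character events only
    print(dom_comments(DOC))  # the comment text survives in the tree
```

A pointer such as `child(1,#comment)` could only be resolved against the second kind of structure; a tool layered on the first never learns the comment was there.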
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From M.H.Kay at eng.icl.co.uk Wed May 20 19:52:43 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:01:26 2004
Subject: Comments, parsers, XPointers
Message-ID: <017601bd8418$40dabb00$1e09e391@mhklaptop.bra01.icl.co.uk>

>The XPointer WD, production 12, allows for relative addressing by node type
>(element, PI, and so on), *including* comments.
>

Even more contentious is that it allows access to chunks of CDATA.

XPointer (and to some extent the DOM as well) in my view fails to recognise that XML defines two object models, a logical model and a physical model. SAX quite clearly and explicitly gives you access to the logical model only, whereas XPointer and DOM are rather ambiguous about the distinction. In my view, if XPointer is intended as a mechanism for defining relationships and underpinning hyperlinks, then it should allow reference only to objects in the logical view.

Mike Kay

From SimonStL at classic.msn.com Wed May 20 20:38:08 1998
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 17:01:26 2004
Subject: Proposal Announcement - XML DTDs to XML docs
Message-ID:

I'd like to announce the posting of an unofficial and incomplete proposal for the representation of XML DTDs as XML documents at:

http://members.aol.com/simonstl/xml

While it lacks _any_ official standing, I hope folks will take a look at it and consider its possibilities. It has some overlap (and I think eventual compatibility) with XML-Data, but it aims at a much smaller target. This proposal is more interested in making XML a friendlier and more self-consistent environment than in creating complex schemas for XML content.

All comments, positive and negative, are welcome. Unlike my other essays, this one is explicitly _not_ copyrighted; I'd really like to see these ideas flower, even as part of someone else's proposal (as is probably necessary). If anyone is interested in contributing, I'd like very much to expand this.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From simpson at polaris.net Wed May 20 21:23:31 1998
From: simpson at polaris.net (John E.
Simpson)
Date: Mon Jun 7 17:01:26 2004
Subject: Comments, parsers, XPointers
In-Reply-To: <198001011834.NAA00512@unready.megginson.com>
References: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net> <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net>
Message-ID: <3.0.3.32.19980520152214.00bd52d4@nexus.polaris.net>

At 01:34 PM 1/1/80 -0500, you wrote:
> [various helpful stuff]
>(comments have no purpose on the production/browsing side).

I was wondering about that. Specifically, I was wondering about a "view source"-type feature in XML browsers, perhaps with a show/hide comments toggle. I've no experience with SGML, but I (and a lot of -- most? -- other HTMLites) have frequently made use of other developers' comments for learning purposes, outside the context of formal training.

Assume there were some convention for associating a comment with a particular element or other component of the logical (even physical) model (probably a big assumption!); some facility for locating element X's comment, if any, would be extremely helpful for such purposes. A docname.xml#child(1,elementname).(1,#comment) sort of XPointer seems like a natural construct in this case.

Of course excessive comments add to network load, invite abuse and so on (I still get the willies when I see scripting embedded in HTML comments); but they aren't always noise, and aren't always useful only to their author.

>In the longer term, what we need is an official definition of an XML
>information set, specifying (for example) that reporting comments is
>optional, while reporting the start and end of elements is required.
>Once such a beastie exists, many vexing questions about (and
>inconsistencies among) the DOM, XPointers, SAX, XSL, etc. will
>disappear.

Yes to all! Thanks, David.

John

John E. Simpson      | It's no disgrace t'be poor,
simpson@polaris.net  | but it might as well be.
                     | -- "Kin" Hubbard

From simpson at polaris.net Wed May 20 21:28:57 1998
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:01:26 2004
Subject: Comments, parsers, XPointers
In-Reply-To: <199805201726.NAA16393@ruby.ora.com>
References: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net>
Message-ID: <3.0.3.32.19980520152809.00bd1580@nexus.polaris.net>

At 01:26 PM 5/20/98 -0400, Chris Maden wrote:
>[John E. Simpson]
>> Thanks for any insights,
>
>Insight: XML != SAX.

Heh. Thanks.

>A complete XPointer
>implementation would need to have access to the comments, though I'm
>not entirely sure what the purpose would be (except possibly for
>editing applications).

Well, as I just mentioned in a reply to DavidM, it's certainly not *critical* that comments be made available to downstream apps other than editors. But they seem to serve some valid purposes as well -- not necessarily just of the "remind the author what this next thing does."

John

John E. Simpson      | It's no disgrace t'be poor,
simpson@polaris.net  | but it might as well be.
                     | -- "Kin" Hubbard

From ak117 at freenet.carleton.ca Wed May 20 21:52:20 1998
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 17:01:26 2004
Subject: Comments, parsers, XPointers
In-Reply-To: <3.0.3.32.19980520152214.00bd52d4@nexus.polaris.net>
References: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net> <198001011834.NAA00512@unready.megginson.com> <3.0.3.32.19980520152214.00bd52d4@nexus.polaris.net>
Message-ID: <198001012046.PAA01198@unready.megginson.com>

John E. Simpson writes:
> At 01:34 PM 1/1/80 -0500, you wrote:
> >(comments have no purpose on the production/browsing side).
> I was wondering about that. Specifically, I was wondering about a
> "view source"-type feature in XML browsers, perhaps with a
> show/hide comments toggle. I've no experience with SGML, but I
> (and a lot of -- most? -- other HTMLites) have frequently made use
> of other developers' comments for learning purposes, outside the
> context of formal training.

I can imagine as many as three "view" features for XML browsing beyond the default formatted presentation:

1) View Source: see the raw, unparsed source of the document entity, possibly with options for viewing the source of other entities.

2) View Logical Structure: see a tree control showing the element structure of the document, probably based on a DOM.

3) View Physical Structure: see a tree control showing the physical structure of the document, starting at the document entity; 'embed' XLinks might also be included here. It should be possible to invoke "View Source" for any tree node.
For the first, there is no need to use an XML parser at all: just show the original entity, comments, whitespace, and all. For the second, comments (and CDATA section boundaries s, etc.) would simply confuse the view by adding too many tree nodes -- I'd want them filtered out even if the underlying DOM builder supported them. For the third, I'd probably be interested only in external entities or XLinks.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From papresco at technologist.com Wed May 20 22:12:17 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:26 2004
Subject: Proposal Announcement - XML DTDs to XML docs
References:
Message-ID: <3563395C.8C348359@technologist.com>

Simon St.Laurent wrote:
>
> I'd like to announce the posting of an unofficial and incomplete proposal for
> the representation of XML DTDs as XML documents at:

We discussed this in the XML working group. The idea does have the benefits you list in your proposal. But there are more disadvantages than you have listed, and some hard problems to be solved.
Nobody has ever done the work to flesh it out to the point where an XML-document notation is as expressive as the current DTD notation. I believe that people who have not been using the DTD notation for a long time underestimate the extent to which common use depends on weird stuff like parameter entities specifying partial content models, element types in element type declarations and so forth. It might be possible to build an XML-document notation that was as expressive as XML DTDs (or at least expressive enough) and didn't use any text substitution wizardry, but nobody has done that yet. This is not a problem if you are content having your XML-syntax used only by those who build simple DTDs. There are other, more subtle problems. Having a single notation appeals mathematically to those who like consistent, recursive structures, but it is not clear that end-users fall into that category. A user interface (markup languages *are* user interfaces) can be too consistent, if it obscures the differences between things. In the case of documents and DTDs, I expect many users would get confused about the distinction between documents and DTDs if DTDs *were* documents. There is also the issue of compatibility. If the DTD for DTDs is extensible and open, as most proponents argue it should be, then Microsoft, Netscape and Sun can all take shots at "extending" it in the way that they "extended" HTML. If the DTD DTD was specifically designed to be extensible, then we could not complain about that. Depending on the level of extensibility, XML documents could actually parse differently depending on which browser you were using. If it was designed NOT to be extensible, then we have to cross one of the benefits of this alternate notation off of the list. Another problem is "specification encapsulation." XML 1.0 is specifically designed NOT to depend on XLink or XPointer. Your proposal depends on them. It seems to me that there is some sort of circularity problem there. 
Several of my complaints about your proposal stem from the fact that DTDs both change the parse and validate the document. In other words, they are both schemata and "parse information providers". If your XML-instance DTDs only validated, then many of the complaints would go away. But if they only validated, I don't think it would be accurate to call them DTDs anymore. Then they would be just "schemas" since they would accomplish only one of the DTD's two functions. I suspect that these functions will, in fact, become more and more distinct as time goes by. So proposals that keep them together should probably not succeed. Our current DTD syntax is quite reasonable and efficient for the task of declaring entities, setting default attributes and so forth. If we want extensible, XML-instance notation schemata, then we should probably forget about replacing DTDs and just define extensible, XML-instance schemata (i.e. XML-Data) and leave DTDs to do the other tasks. I would propose the following levels: XML parser (works with document and standard XML DTD) XML Link/XML Pointer engine (annotates parse tree with linking info) XML Schema Engine 1 (validates and modifies annotated parse tree) XML Schema Engine 2 (validates and modifies annotated parse tree) ... XML Schema Engine N (validates and modifies annotated parse tree) XML Application There is no level circularity here. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed May 20 22:25:17 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:01:26 2004 Subject: Comments, parsers, XPointers References: <017601bd8418$40dabb00$1e09e391@mhklaptop.bra01.icl.co.uk> Message-ID: <35633C33.B60CF1CD@technologist.com> Michael Kay wrote: > > XPointer (and to some extent the DOM as well) in my view > fails to recognise that XML defines two object models, a > logical model and a physical model. SAX quite clearly and > explicitly gives you access to the logical model only, > whereas XPointer and DOM are rather ambiguous about the > distinction. XML has no semantics and thus no object models. :) I am only half kidding. People are confused because the XML REC is confused. Where does it say that CDATA sections and comments are part of the physical and not logical structure? I agree that they *should be*, but where does it say that? Until XML's semantics are defined precisely, we must guess at them. John E. Simpson wrote: > > Well, as I just mentioned in a reply to DavidM, it's certainly not > *critical* that comments be made available to downstream apps other than > editors. But they seem to serve some valid purposes as well -- not > necessarily just of the "remind the author what this next thing does." I think that the semantics of comments should be precisely "remind the author what this next thing means." No more and no less. Any use for machine processing is abuse because it is bound to cause the problems you are complaining about. Some parsers will strip them out (as is their right). 
If you depend on that comment for anything more than source file maintenance, then you are in trouble. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Kenneth.J.Meltsner at jci.com Wed May 20 22:47:08 1998 From: Kenneth.J.Meltsner at jci.com (Meltsner, Kenneth J) Date: Mon Jun 7 17:01:27 2004 Subject: Alternative syntaxes (was RE: Proposal Announcement) Message-ID: <8625660A.0070EF73.00@Corpnotes.JCI.Com> There's a great prototype for "Adaptive Forms" at: http://www.isi.edu/~mm-proj/ The developers put together a Java program that takes a grammar for a "natural language-ish" statement, form, rule, query, etc. and provides a form-based interface that dynamically changes depending on the terms entered by the user. It seems like this would be a good way to enter information for some applications with complex DTDs. I suspect the grammar could be automatically generated (in many cases) directly from the DTD (or vice-versa?). In one of their papers, the authors briefly mention the Texas Instruments natural language shell from more than a decade ago that allowed users to frame complicated queries by selecting appropriate text fragments from a set of lists. This was an exceptionally powerful shell that never reached widespread popularity; perhaps this will be more successful. 
Ken Meltsner >From the first page: "Adaptive Forms is a tool for producing context-sensitive form-based interfaces. The system initially displays an overview of the main sections of a form, and an initial set of fields for the user to fill in. Depending on the values that the user enters, Adaptive Forms progressively adds new fields to the form. For example, a form for entering household information would show the user fields for entering the spouse's name only if the user had entered "married" in the "marital status" field. "The main design goal for Adaptive Forms is in entering structured information rapidly and without errors. One of our target applications was the specification of air campaign objectives, which are structured objects consisting of a verb (e.g., deny, gain), an aspect (e.g., what to deny or gain), an actor (e.g., country, a branch of armed forces), a location (e.g., a country or a region) and a time period. Each of the parts is itself a structured object whose substructure and possible values depend on the values specified for the other parts. For example, the aspects that can be gained are different from the aspects that can be denied, so the interface needs to compute the menus for the "aspect" field dynamically based on the fillers of other fields. Similar requirements arise in virtually any other application domain. " xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Wed May 20 22:59:49 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:27 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: It's good to get feedback. 
Paul Prescod stated:

>Another problem is "specification encapsulation." XML 1.0 is specifically
>designed NOT to depend on XLink or XPointer. Your proposal depends on
>them. It seems to me that there is some sort of circularity problem there.

I think this circularity problem is not too difficult to cope with in practice. It could be reduced, if not disposed of, by decomposing XML 1.0 into three different specs:

1) A document syntax specification (a simplified version of well-formed documents)
2) A syntax for linking to DTDs (and perhaps schemas) internal or external (which would depend on XLink)
3) A syntax for DTDs providing rules for validation.
(Schema definitions could rest on top of #3 or beside it.)

Even if this seems inadequate, I think it would be useful to reduce the number of ways users and developers reference external resources from XLinks, general entities, parameter entities, notations, and assorted other goodies to a single URI-based spec with consistent notation, preferably XLink.

>Nobody has ever done the work to flesh it out to the point where an
>XML-document notation is as expressive as the current DTD notation. I
>believe that people who have not been using the DTD notation for a long
>time underestimate the extent to which common use depends on weird stuff
>like parameter entities specifying partial content models, element types
>in element type declarations and so forth.

I realize that these are significant challenges, and that conversion to an XML document format for DTDs doesn't whisk them away. What I'd like to see is a determined effort to map the 'weird stuff' to XML DTDs. Note, for instance, that I kept the SGML-like content model in my examples. It's sort of a half-way step, leaving in some of the old that the new may prosper. As for parameter entities, I think they can be expressed as easily through the model I presented. Fleshing this out is admittedly a large task that I don't think I can manage alone.
>A user interface >(markup languages *are* user interfaces) can be too consistent, if it >obscures the differences between things. In the case of documents and >DTDs, I expect many users would get confused about the distinction between >documents and DTDs if DTDs *were* documents. This is a significant consideration I hadn't noticed, and I'll address it more directly in the next revision. I don't think it's insuperable. >There is also the issue of compatibility. If the DTD for DTDs is >extensible and open, as most proponents argue it should be, then >Microsoft, Netscape and Sun can all take shots at "extending" it in the >way that they "extended" HTML. If the DTD DTD was specifically designed to >be extensible, then we could not complain about that. Depending on the >level of extensibility, XML documents could actually parse differently >depending on which browser you were using. If it was designed NOT to be >extensible, then we have to cross one of the benefits of this alternate >notation off of the list. I would certainly want this to be extensible; parsers that didn't understand an extended portion of this DTD could simply ignore that portion, provided the document met the basic rules. I don't see this as a significant problem, especially after looking at several of the schemas other people are proposing for XML documents. XML is going to fragment to a certain extent; I'd like to see the core made more extensible to provide an orderly framework for such extensions rather than letting them run off on their own. >Several of my complaints about your proposal stem from the fact that DTDs >both change the parse and validate the document. In other words, they are >both schemata and "parse information providers". If your XML-instance DTDs >only validated, then many of the complaints would go away. But if they >only validated, I don't think it would be accurate to call them DTDs >anymore. 
Then they would be just "schemas" since they would accomplish >only one of the DTD's two functions. I think you miss part of the point - that this is a representation intended to completely represent the same data that is provided currently by XML DTDs. You seem to assume that there will be data lost in the transition, without pointing out where it would be lost. I think you could build the same schemas and parse information with this system as you could with the current XML DTD structure, while providing the extensibility that several other proposals seem to need. I would rather _not_ provide full schema information in this proposal - moving XML DTDs to a new format seems like enough of a task to start with. In the long run, of course, DTD would be an inadequate term to describe these. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mrc at allette.com.au Thu May 21 00:24:20 1998 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:01:27 2004 Subject: Proposal Announcement - XML DTDs to XML docs References: Message-ID: <356357DE.47BA453E@allette.com.au> Simon St.Laurent wrote: > What I'd like to see is a > determined effort to map the 'weird stuff' to XML DTDs. Note, for instance, > that I kept the SGML-like content model in my examples. It's sort of a > half-way step, leaving in some of the old that the new may prosper. As for > parameter entities, I think they can be expressed as easily through the model > I presented. What about issues such as redefining a parameter entity in the external subset? Or multiple identically named parameter entities within the DTD? 
These seem like issues where you would either have to lose information from the DTD or define behaviour for the XML DTD that doesn't apply to other XML documents. If that were the case, I would question the benefit of the proposed syntax. Sticking with parameter entities, what if the same entities behaved as a content model and an attribute name? How and why would you make the distinction with the proposed syntax? I think that the syntax that describes the structure of documents can validly be different from the syntax that frames data because they're trying to accomplish very different things. -- Regards Marcus Carr email: mrc@allette.com.au _______________________________________________________________ Allette Systems (Australia) email: info@allette.com.au Level 10, 91 York Street www: http://www.allette.com.au Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774 _______________________________________________________________ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Thu May 21 01:45:21 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:27 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: >What about issues such as redefining a parameter entity in the external subset? Or >multiple identically named parameter entities within the DTD? These could both be defined by specifying the behavior for the DTD-as-document, just as they are now for DTDs. I see no difficulty here. >Sticking with >parameter entities, what if the same entities behaved as a content model and an >attribute name? 
How and why would you make the distinction with the proposed >syntax? How and why would you make the distinction with the current syntax? Why would this be so difficult to do in a document rather than the current syntax? Are you saying that it would be crossing element boundaries and therefore break the well-formedness requirements? The syntax is currently (obviously) incomplete; I'll see what I can do to address this issue. I've obviously done an inadequate job here. Parameter entities are admittedly my least favorite part of XML, a necessary evil and a powerful tool. There may well be limits on how well they can map to this model - but is that a significantly worse limitation than the abolition of the & content model? I think the manageability you'd gain with this representation of XML DTDs would more than compensate for any loss incurred by the enforced simplification of parameter entities. >I think that the syntax that describes the structure of documents can validly be >different from the syntax that frames data because they're trying to accomplish >very different things. To a certain extent, this is certainly true. However, I think there's a strong case to be made for using a single syntax - see the advantages listed in the Rationale. I'm very happy with the document syntax XML inherited from SGML, particularly as XML made that syntax much more strictly enforced. I'm not as happy with the DTD syntax - and this seems like a good way to take advantage of the power of XML's document syntax. The current DTD syntax is workable - but not very extensible. I see a lot of effort being put into schemas and other projects that seem to add additional layers of complexity, and require applications to implement all kinds of extra linkages. By standardizing the linkage mechanism and the format for these extensions (as XLink or a derivative and XML documents, respectively), I hope to see a lot less EBNF and a lot more XML. 
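
The question of identically named entities discussed above can be made concrete with a small sketch. XML 1.0 states that when the same entity is declared more than once, the first declaration encountered is binding, and that rule covers parameter entities as well as general ones. The sketch below uses a general entity in an internal subset because that is easy to observe from Python's standard library parser; the element name `note` and the entity name `greeting` are made up for illustration and come from nobody's proposal. A DTD-as-document notation would presumably have to restate this same rule for its own elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity "greeting" is declared twice in the
# internal subset. Per XML 1.0, the first declaration is the binding one.
DOC = """<!DOCTYPE note [
  <!ENTITY greeting "first declaration">
  <!ENTITY greeting "second declaration">
]>
<note>&greeting;</note>"""


def expanded_text(xml_text: str) -> str:
    """Parse the document and return the root element's text after the
    parser has expanded internal entity references."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    # Shows which of the two declarations the parser actually used.
    print(expanded_text(DOC))
```

The point of the sketch is only that "which declaration wins" is behaviour defined for DTDs, not something an ordinary element-and-attribute document gets for free.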
Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From papresco at technologist.com Thu May 21 02:26:32 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:27 2004
Subject: Proposal Announcement - XML DTDs to XML docs
References:
Message-ID: <35637063.FE3EF5A7@technologist.com>

Simon St.Laurent wrote:
>
> It's good to get feedback.
>
> 1) A document syntax specification (a simplified version of well-formed
> documents)
> 2) A syntax for linking to DTDs (and perhaps schemas) internal or external
> (which would depend on XLink)
> 3) A syntax for DTDs providing rules for validation.
> (Schema definitions could rest on top of #3 or beside it.)

So at what level do I get the equivalent of internal entities and defaulted attributes? And what levels are required of all XML processors vs. optional?

> I would certainly want this to be extensible; parsers that didn't understand
> an extended portion of this DTD could simply ignore that portion, provided the
> document met the basic rules.

The important point is that you aren't talking about an alternate notation for XML DTDs, but a complete change in the relationship between DTDs and documents. In XML, DTDs can affect the interpretation (not just validation) of a document, through entities, defaulted attributes, element-content vs. mixed-content and so forth. If DTDs can both change the way a document is parsed *and* be extensible, then two parsers could get completely different information out of the same document.

For example: One company's DTD extension could add in SGML tag omission.
The start- and end-tag of an element could be implied, without violating well-formedness. So then you could use that company's parser through SAX and get a completely different set of events than if you used someone else's parser. After all, changing the parse is one of the responsibilities of the DTD. > I would rather _not_ provide full schema information in this > proposal - moving XML DTDs to a new format seems like enough of a task to > start with. I don't know what you mean by full schema information. DTDs serve as schemas (in addition to changing the parse). If you propose to replace DTDs, then you are in part designing a new schema language. My suggestion is to develop a new schema language *without* changing DTDs. In other words, I am suggesting you make your project smaller, not larger. I would suggest you forget about entities, defaulted attributes, etc. Leave those to DTDs. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Thu May 21 03:06:29 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:27 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: Paul Prescod wrote: >> 1) A document syntax specification (a simplified version of well-formed >> documents) >> 2) A syntax for linking to DTDs (and perhaps schemas) internal or external >> (which would depend on XLink) >> 3) A syntax for DTDs providing rules for validation. > >So at what level do I get the equivalent of internal entities and >defaulted attributes? And what levels are required of all XML processors >vs. optional? At what level do you get defaulted attributes now? Do you get defaulted attributes in a well-formed document without a DTD? Right now, it doesn't look like it. This could be in level 1, if default attributes were deemed necessary to document syntax, but I'd expect to see it in level 3. Internal entities could be defined much as they are now, at the start of a document, within a structure set aside for that purpose using (or whatever develops) instead of . This would indeed need to be covered in level 1, unless you could live without internal entities. In the past you seemed quite happy about forcing scripts to be external to a document, so I can't see why it would be so terrible to exile entities - and DTDs as well - to separate documents either. I don't think it would be necessary, though, any more than it's necessary now. As for requiring levels, level 1 would serve a similar purpose to well-formed documents today. 2 would be a prerequisite for 3, of course. 
>For example: One company's DTD extension could add in SGML tag omission.
>The start- and end-tag of an element could be implied, without violating
>well-formedness. So then you could use that company's parser through SAX
>and get a completely different set of events than if you used someone
>else's parser. After all, changing the parse is one of the
>responsibilities of the DTD.

I think this is overstating your case rather dramatically. I could do something similarly brutal by creating a PI at the start of a regular XML document and using the implied tags. No one else could read my documents, but I sure could. Not only that, but I already proposed separating the document syntax - which includes full start- and end-tags - from the DTD. There's no reason this proposal would allow the DTD to modify the basic document syntax and markup, period.

>I don't know what you mean by full schema information. DTDs serve as
>schemas (in addition to changing the parse). If you propose to replace
>DTDs, then you are in part designing a new schema language.

We can argue about the meaning of the word schema all you like; it's not that exciting for me. XML-Data performs similar mapping, but attempts to add a lot more, parts which I see more as data schemas. If this is a schema, then so be it.

>My suggestion
>is to develop a new schema language *without* changing DTDs. In other
>words, I am suggesting you make your project smaller, not larger. I would
>suggest you forget about entities, defaulted attributes, etc. Leave those
>to DTDs.

My suggestion is that DTDs present a significant problem in their current format, and that they could be improved significantly. I would enjoy being able to focus on elements and attributes, the core of XML (and SGML) document syntax, and worry less about the rest. This project already is an attempt to be smaller, but to provide a place for new things to grow.
Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From mrc at allette.com.au Thu May 21 04:20:25 1998
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 17:01:27 2004
Subject: Proposal Announcement - XML DTDs to XML docs
References:
Message-ID: <35638F2E.502A50D@allette.com.au>

Simon St.Laurent wrote:

> > What about issues such as redefining a parameter entity in the
> > external subset? Or multiple identically named parameter
> > entities within the DTD?
>
> These could both be defined by specifying the behavior for the
> DTD-as-document, just as they are now for DTDs. I see no difficulty here.

It seems that you are proposing ascribing characteristics to an entire class of XML documents, that is, when you encounter a particular element you ignore it based on the existence of a previous element's characteristics. That's essentially the behaviour of parameter entities, but is a semantic not automatically imposed on other XML documents. In my opinion, this difference is significant grounds for not making DTDs XML.

> > Sticking with parameter entities, what if the same entities
> > behaved as a content model and an attribute name?
> > How and why would you make the distinction with the proposed
> > syntax?
>
> How and why would you make the distinction with the current syntax? Why would
> this be so difficult to do in a document rather than the current syntax?

You wouldn't currently make the distinction, but you don't need to. In your example you indicated that the parameter entity resolved to a content model (sorry, I'm working from memory).
Assuming that a parameter entity is just a straight text replacement, you probably shouldn't make any determination about its use at the time of declaration, though this may just be a syntactic issue.

> Parameter entities are admittedly my least favorite part of XML, a necessary
> evil and a powerful tool. There may well be limits on how well they can map
> to this model - but is that a significantly worse limitation than the
> abolition of the & content model? I think the manageability you'd gain with
> this representation of XML DTDs would more than compensate for any loss
> incurred by the enforced simplification of parameter entities.

Having done some work with a Docbook derivative lately, I would argue that yes, messing with parameter entities could be more damaging for less gain than the abolition of the & model. I suppose I come from the camp that prefers no change, but I don't really find manageability a major issue - if I want a different view of the DTD, I could always convert it to HTML or use any number of applications to present it nicely. If you mean to extend the information stored in a DTD, I would also prefer to see this done within the existing framework as much as is feasible.

> To a certain extent, this is certainly true. However, I think there's a
> strong case to be made for using a single syntax - see the advantages listed
> in the Rationale.

I did read it, but I accidentally deleted the mail before recording the URL - could you send it to me again? Thanks.

--
Regards
Marcus Carr email: mrc@allette.com.au
_______________________________________________________________
Allette Systems (Australia) email: info@allette.com.au
Level 10, 91 York Street www: http://www.allette.com.au
Sydney 2000 NSW Australia phone: +61 2 9262 4777 fax: +61 2 9262 4774
_______________________________________________________________

From papresco at technologist.com Thu May 21 12:48:54 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:27 2004
Subject: Proposal Announcement - XML DTDs to XML docs
References:
Message-ID: <356406D3.F69A123F@technologist.com>

Simon St.Laurent wrote:
>
> At what level do you get defaulted attributes now? Do you get defaulted
> attributes in a well-formed document without a DTD?

No. But if you are talking about replacing the DTD, then I don't see how a comparison to documents without a DTD is relevant.

> Internal
> entities could be defined much as they are now, at the start of a document,
> within a structure set aside for that purpose using (or whatever
> develops) instead of . This would indeed need to be covered in level
> 1, unless you could live without internal entities.

Okay. So is in level 1 -- the document level, just like with DTDs. But presumably can use XLink. So now you've dragged XLink into level 1. Now we are back to specification circularity. Am I missing something here?

> In the past you seemed
> quite happy about forcing scripts to be external to a document, so I can't see
> why it would be so terrible to exile entities - and DTDs as well - to separate
> documents either. I don't think it would be necessary, though, any more than
> it's necessary now.

I have never argued in favour of forcing scripts to be external to a document. I argued that everything I know about text processing says that putting scripts in textual documents is a bad idea -- and is in fact a regression to the technique that SGML was invented to replace.
But not everybody uses XML or SGML for text processing, so I do not believe that the *language* should restrict them from embedding scripts. XSL is a perfect example of an appropriate mix of scripts and markup... but you'll notice that there is essentially no text in an XSL stylesheet. Anyhow, even if we exile entities, you still have the circularity problem. How can a level 1 parser process entities (as they do now) if the syntax for declaring entities depends on XLink, XPointer, and other specifications that are supposed to be separate from XML itself? Let me make this concrete (using a random DTD in XML notation, with old-syntax comments for clarity): foo.xdtd: TEST2 foo.xml: &foo; Does the processor have to go and fetch foo.xdtd, read it and understand it before it can know the contents of this document? > As for requiring levels, level 1 would serve a similar purpose to well-formed > documents today. 2 would be a prerequisite for 3, of course. Well-formed documents can have entities. In fact, all XML documents that have entities are well-formed. > >For example: One company's DTD extension could add in SGML tag omission. > >The start- and end-tag of an element could be implied, without violating > >well-formedness. So then you could use that company's parser through SAX > >and get a completely different set of events than if you used someone > >else's parser. After all, changing the parse is one of the > >responsibilities of the DTD. > > I think this is overstating your case rather dramatically. I could do > something similarly brutal by creating a PI at the start of a > regular XML document and using the implied tags. No, you could not. The semantics of XML DTDs are *fixed*, not extensible. Any parser that interpreted processing instructions as commands to change the parse would be *wrong*. But you propose that DTDs should become extensible. 
Since DTDs can change the parse (radically, in some cases), your proposal would allow DTD extensions to make documents specific to particular processors, unless an amended proposal explicitly disallows that. > No one else could read my > documents, but I sure could. Not only that, but I already proposed separating > the document syntax - which includes full start- and end-tags - from the DTD. > There's no reason this proposal would allow the DTD to modify the basic > document syntax and markup, period. DTD's don't modify the document syntax and markup, but they do modify the parse tree created by the document. In other words, they modify its semantics. If you replace DTDs with something "extensible", you must expect them to be able to modify the parse tree in extensible ways, unless you explicitly disallow this in your proposal. Just as today's DTDs can have "implied attributes", Microsoft could invent one with "implied elements". Netscape could go in the opposite direction and give us "transparent elements" that do not show up in the parse tree at all. You could use the two parsers through SAX and get completely different parse trees. > My suggestion is that DTD's present a significant problem in their current > format, and that they could be improved significantly. I would enjoy being > able to focus on elements and attributes, the core of XML (and SGML) document > syntax, and worry less about the rest. This project already is an attempt to > be smaller, but to provide a place for new things to grow. If all you are interested in is elements and attributes then you are proposing a new schema language for XML, not a replacement for DTDs (which do more). It is clear that you associate the word schema with complexity, and I can't force you to use it. Schemata constrain the structure of data models (databases, documents, etc.) DTDs are schemata. They also do more. They can change the parse. That's what makes them complex and part of what makes *XML* complex. 
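A minimal sketch of the "DTDs modify the parse tree" point, in Python with the standard library's xml.etree.ElementTree (built on expat, a non-validating parser). The element name memo and the attribute status are made up for illustration; only &foo; and the value TEST2 echo the example given earlier in this message. The same document instance is parsed under two different internal subsets, and both the text and the attributes of the resulting tree change.

```python
import xml.etree.ElementTree as ET

# One document instance; "memo" is a hypothetical element name.
INSTANCE = "<memo>&foo;</memo>"

# Two different internal DTD subsets for that same instance.
DTD_ONE = '<!DOCTYPE memo [<!ENTITY foo "TEST1">]>'
DTD_TWO = (
    "<!DOCTYPE memo ["
    '<!ENTITY foo "TEST2">'
    '<!ATTLIST memo status CDATA "draft">'  # hypothetical defaulted attribute
    "]>"
)


def parse_with(doctype):
    """Parse INSTANCE under the given DOCTYPE; return (text, attributes)."""
    root = ET.fromstring(doctype + INSTANCE)
    return root.text, dict(root.attrib)


if __name__ == "__main__":
    print(parse_with(DTD_ONE))
    print(parse_with(DTD_TWO))
```

Nothing in the instance itself changed between the two calls; the entity replacement text and the defaulted attribute both come from the declarations. That is the sense in which a DTD contributes to the parse and does not merely check it.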
If you try to do everything that DTDs do, then your new language will also be needlessly complex. If you do not, then you are not replacing DTDs but rather inventing something new. It sounds like a new schema language to me. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu May 21 12:54:41 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:01:27 2004 Subject: Proposal Announcement - XML DTDs to XML docs References: Message-ID: <35640818.9251DC46@technologist.com> Simon St.Laurent wrote: > Parameter entities are admittedly my least favorite part of XML, a necessary > evil and a powerful tool. There may well be limits on how well they can map > to this model - but is that a significantly worse limitation than the > abolition of the & content model? I don't believe that I have ever written a DTD that *did* use the & occurrence indicator in a content model. I also don't believe that I have ever written a DTD (to be used more than once) that did NOT use parameter entities. If you invent enough subclassing/inheritance/content model reuse features, you may be able to get away without them. But you can't just take them out if you expect people to do serious document modelling in XML. 
Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu May 21 13:33:58 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:01:27 2004 Subject: parser for xml-data? References: Message-ID: <35640BD5.F781F0BF@technologist.com> Charles Frankston wrote: > > DTDs are not well-equipped to handle namespaces. It can technically be > done: for example, you could allow your outer DTD to have 'ANY' content. This example can be done in DTD syntax in the same way that you do it with namespaces. In fact, namespaces were specifically designed to not break validation. (also, note that the ns pseudo-attribute is supposed to be a URL) > XML-Data schemas are designed to integrate with namespaces: > > > > > > > > > > > > > > > > > > xyz.dtd: zyx.dtd: instance.xml: You don't need "ANY" to use namespaces. > The ns, prefix, and src parameters to a namespace PI look a lot like > attributes (although they are not in a formal sense). Since attributes in > XML do not have to be in a particular order, it would certainly be > surprising for people to discover that attributes in a namespace have to be > a particular order. You're suggesting that the syntax be made harder to use > in order to make the productions easier to author. I think this is a bad > tradeoff. 
Note that the XML declaration has a required order of pseudo-attribute occurrence. It would be best if the XML family of languages were consistent. [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>' Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate From SimonStL at classic.msn.com Thu May 21 15:27:07 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: >Okay. So is in level 1 -- the document level, just like with >DTDs. But presumably can use XLink. So now you've dragged XLink >into level 1. Now we are back to specification circularity. > >Am I missing something here? There are several possible answers. You could allow XML Level 1 parsers to ignore external entities if they choose - something similar is already in the spec right now (in a more limited case, section 4.1) for well-formed documents. You could hard-wire the href attribute's interpretation in a DTD - parsers are already dealing with references in the context of DTDs, and it doesn't seem that hard to make sense of href. Another option is to allow the circularity. This message was brought to you (at least partway) by the Internet Protocol, IP, defined in RFC 791. IP includes, and indeed requires, the services of ICMP (defined in RFC 792). ICMP uses IP to get from one place to another. Circular? Yep. 
Workable? Certainly. IP isn't allowed to generate extra ICMP messages about the delivery of an ICMP message. There is no circle in practice. Nor would there be a circle in _practice_ by allowing the level 1 spec to refer to the hrefs described in XLink, or to simply use href without further consideration. >foo.xdtd: > >TEST2 > > > > > > > > > >foo.xml: > > >&foo; > > >Does the processor have to go and fetch foo.xdtd, read it and understand >it before it can know the contents of this document? No more than it needs to in the current system, as stated in section 4.1: X>Note that if entities are declared in the external subset X>or in external parameter entities, a non-validating processor X>is _not_ _obligated_ _to_ read and process their declarations; X>for such documents, the rule that an entity must be declared X>is a well-formedness constraint only if _standalone='yes'_. >Well-formed documents can have entities. In fact, all XML documents that >have entities are well-formed. In fact, technically, X>A data object is an XML document if it is _well-formed_, X>as defined in this specification >The semantics of XML DTDs are *fixed*, not extensible. >Any parser that interpreted processing instructions as commands to change >the parse would be *wrong*. But you propose that DTDs should become >extensible. Since DTDs can change the parse (radically, in some cases), >your proposal would allow DTD extensions to make documents specific to >particular processors, unless an amended proposal explicitly disallows >that. I think you're dramatically misreading my argument, deliberately making this a bogeyman when it isn't. I see no reason why malicious DTDs would be allowed to 'change the parse' any more than current DTDs would be. Extensible DTDs do _not_ mean that anything goes. Behavior can be proscribed, rules can be set. A DTD in this proposal would be allowed to add things to the parse, not change the fundamental rules set in level 1. 
Perhaps I should make this more explicit in the proposal - since the proposal is to 'map' XML DTD syntax to XML document syntax, it seemed reasonable to me that the same strictures demanded for processing an XML DTD would apply here. >DTD's don't modify the document syntax and markup, but they do modify the >parse tree created by the document. In other words, they modify its >semantics. If you replace DTDs with something "extensible", you must >expect them to be able to modify the parse tree in extensible ways, unless >you explicitly disallow this in your proposal. I don't think this is difficult; the types of extensions allowed can be limited to a reasonable set (data types, for instance) and expanded through the standards process when it appears necessary. Not everyone may want to wait, of course, but they'd find a way to get around DTDs anyway. I think you're going to see plenty of ersatz XML in practice anyway - one of the great things about SAX is that people can put _any_ kind of parser underneath it and watch it spit out nice-looking XML on top. As Chris Maden pointed out on another topic, CM>Insight: XML != SAX. >DTDs are schemata. They also do more. They can change the parse. That's >what makes them complex and part of what makes *XML* complex. If you try >to do everything that DTDs do, then your new language will also be >needlessly complex. If you do not, then you are not replacing DTDs but >rather inventing something new. It sounds like a new schema language to >me. Well, we'll see what happens. This proposal is only starting out, and complexities always look simpler at the beginning. Anyone who would like to help me figure out ways of expressing DTDs in XML document syntax is welcome to join this project - and suggestions for making sure that these DTDs don't change the parse in violent ways are also welcome. Even if this solution isn't perfect, it opens up a lot of questions that are worth asking about the current way of doing things. 
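To give a rough feel for what "DTDs in XML document syntax" would buy, here is a small Python sketch that reads a DTD-like description with an ordinary XML parser (the standard library's xml.etree.ElementTree). The vocabulary used (dtd, element, attribute, entity and their attribute names) is invented purely for illustration and is not the syntax of the proposal; the memo, to and body names are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary, invented for this sketch; NOT the proposal's syntax.
DTD_AS_XML = """
<dtd>
  <element name="memo" content="(to, body)">
    <attribute name="status" type="CDATA" default="draft"/>
  </element>
  <element name="to" content="(#PCDATA)"/>
  <element name="body" content="(#PCDATA)"/>
  <entity name="foo" value="TEST2"/>
</dtd>
"""


def summarize(dtd_xml):
    """Collect content models, attribute defaults and entities from the tree."""
    root = ET.fromstring(dtd_xml)
    elements = {e.get("name"): e.get("content") for e in root.iter("element")}
    defaults = {
        (e.get("name"), a.get("name")): a.get("default")
        for e in root.iter("element")
        for a in e.iter("attribute")
        if a.get("default") is not None
    }
    entities = {e.get("name"): e.get("value") for e in root.iter("entity")}
    return {"elements": elements, "defaults": defaults, "entities": entities}


if __name__ == "__main__":
    for section, values in summarize(DTD_AS_XML).items():
        print(section, values)
```

No DTD-specific parser is involved: the declarations are just elements and attributes, so any tool that can walk an XML tree can list, query or transform them.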
Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Thu May 21 15:30:40 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: >I also don't believe that I have ever written a DTD (to be used more than >once) that did NOT use parameter entities. > >If you invent enough subclassing/inheritance/content model reuse features, >you may be able to get away without them. But you can't just take them out >if you expect people to do serious document modelling in XML. And once again, you're making a much stronger claim than I am. Simply because _I_ don't like parameter entities _doesn't_ mean they're being abolished. Making them as capable in this model as they are in the current DTDs will take some doing, but a basic model is already presented in the proposal. Suggestions for expanding that model - from anyone - are welcome. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Fri May 22 00:02:15 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Critique - XML DTDs to XML docs References: Message-ID: <3564A44A.FB0A686A@technologist.com> Simon St.Laurent wrote: > > There are several possible answers. You could allow XML Level 1 parsers to > ignore external entities if they choose - something similar is already in the > spec right now (in a more limited case, section 4.1) for well-formed > documents. Well, it's a mistake now, and not something I personally think should be perpetuated. Parsers should not choose what they look at or what they do not. This badly violates XML's goal of having no optional features. "External entity parsing" is an optional feature in XML. It doesn't make any sense to me that parsers should be able to decide what parts of an author's document to process, and I would not encourage you to perpetuate it into a new DTD replacement. > You could hard-wire the href attribute's interpretation in a DTD - > parsers are already dealing with references in the context of DTDs, and it > doesn't seem that hard to make sense of href. I thought that you wanted to use XLink and XPointer? > Another option is to allow the circularity. This message was brought to you > (at least partway) by the Internet Protocol, IP, defined in RFC 791. IP > includes, and indeed requires, the services of ICMP (defined in RFC 792). > ICMP uses IP to get from one place to another. Circular? Yep. Workable? > Certainly. IP isn't allowed to generate extra ICMP messages about the > delivery of an ICMP message. There is no circle in practice. 
Nor would there > be a circle in _practice_ by allowing the level 1 spec to refer to the hrefs > described in XLink, or to simply use href without further consideration. There is a reason that we usually choose not to have circular specifications. First, reading and writing them is often a pain. Second, the two become interdependent. As it is now, we could invent "XLink-Em" in five years, and deprecate XLink without affecting XML. This makes progress much easier. In fact, it is the primary reason that we split these things into different specifications in the first place -- so that they can grow separately. > I think you're dramatically misreading my argument, deliberately making this a > bogeyman when it isn't. I see no reason why malicious DTDs would be allowed > to 'change the parse' any more than current DTDs would be. Extensible DTDs do > _not_ mean that anything goes. Behavior can be proscribed, rules can be set. What would the rules be? What would extensions be allowed to do and not do? > A DTD in this proposal would be allowed to add things to the parse, not > change the fundamental rules set in level 1. I guess I don't understand the difference between adding things and changing the fundamental rules of the "level 1" parse. DTDs DO change the fundamental rules of the fundamental parse. What could be more fundamental than this: ]> &foo; Now if DTDs were extensible, then I would expect to be able to do something like this: ]> &foo; And I would expect to be able to provide the behaviour for MY-ENTITY-DECLARATION (somehow). We could restrict DTD extension to data typing, but that strikes me as a step backwards. Verification is going to be (and should be) increasingly the job of non-DTD schemata. There is no good reason, in my mind, that verification of data types, or even element and attribute types, should be the responsibility of the parser. 
XML makes them the responsibility of the parser for historical reasons (but goes as far in separating the responsibility out as was possible). I would encourage you not to perpetuate that confusing conflation of responsibility in a DTD replacement. Verification should be handled at a different level and by a different piece of software than the parser. In other words, I think that we should be reducing the responsibilities of the DTD, rather than expanding them. A whole new syntax for a core part of the language would make XML much more complicated than it is now. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jdg at midl.co.jp Fri May 22 00:56:44 1998 From: jdg at midl.co.jp (Joel de Guzman) Date: Mon Jun 7 17:01:28 2004 Subject: unsubscribe Message-ID: <199805212256.HAA26883@balthasar.hsnt.or.jp> unsubscribe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tommybranch at mindspring.com Fri May 22 01:02:03 1998 From: tommybranch at mindspring.com (Tommy) Date: Mon Jun 7 17:01:28 2004 Subject: unsubscribe Message-ID: <000c01bd850b$79d24600$102d56d1@ymmot> unsubscribe -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980522/e9d9ab98/attachment.htm From SimonStL at classic.msn.com Fri May 22 02:27:18 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Critique - XML DTDs to XML docs Message-ID: >There is a reason that we usually choose not to have circular >specifications. First, reading and writing them is often a pain. Second, >the two become interdependent. Yes, there is the 'ingrown toenail' metaphor for standards that rely on each other too closely and turn into a mess. > [re: using href declarations for references] >I thought that you wanted to use XLink and XPointer? Of course I want to use XLink and XPointer. The href declaration is the tiniest piece of the XLink standard, and seems fairly well established, if not indeed set in stone. I'd be happy to use the full XLink spec, but realize that not everyone needs it. Fine. Make href a part of the 'Level 1' spec and pray that XLink doesn't migrate to entirely different terminology. It's no worse than SYSTEM and PUBLIC are now, certainly. >What would the rules be? What would extensions be allowed to do and not >do? For now, because this is simply a 'representation', I expected the same rules to hold for these DTDs with regard to document syntax as apply now. 
Maybe I should have written a complete section on behavior; maybe I will. >I guess I don't understand the difference between adding things and >changing the fundamental rules of the "level 1" parse. DTDs DO change the >fundamental rules of the fundamental parse. What could be more fundamental >than this: Here we begin to see where the communications breakdown has set in, and maybe we can unravel it. You see entities as modifying the rules of the 'fundamental parse'. I see entities as riding along on the rules of the 'fundamental parse' to make their changes. To me, the basic rules for parsing establish a syntax for documents, including a set of rules for including entities. Using an entity is just taking advantage of those rules, _not_ modifying them in any way. I see the distinction between expanding an entity and including (or transcluding) information from a link as a minor technical skirmish that should have been settled long ago, not a major battle over the fundamental shape of documents. Maybe that's what I get for working in HyperCard and HTML all these years... >We could restrict DTD extension to data typing, but that strikes me as a >step backwards. Verification is going to be (and should be) increasingly >the job of non-DTD schemata. >... >Verification should be handled at a different level and by a different >piece of software than the parser. I think this philosophy reflects SGML's heritage in document management. Developers who'd like to apply XML to other tasks may find this heritage distracting or indeed disturbing, given the DTD's current lack of extensibility. It's not hard to imagine database developers who need to use XML coming up with a really simple schema like: Then they could just use a PI to tell their application to check their well-formed document ("Who the hell needs a DTD anyway? Like who came up with _that_?") against this schema. 
Something like: This doesn't really do any harm; part of the joy of well-formed documents is that you can chuck all the rest of the goodies in XML and build it yourself. Still, to me, this loses a lot. I'd like to see developers use DTDs, and I think that describing the structure of these documents is important for many reasons: easier use with editors, easier-built storage systems, and, of course, error-checking. Making DTDs extensible in clearly defined ways (and not your critter) seems like a good way to bring these folks in. By providing a structure that developers can use to ensure interoperability of their documents, as well as extend to include data-type verification, I think we'd be able to keep more developers in the habit of using DTDs. Which brings us to the core of the issue: >In other words, I think that we should be reducing the responsibilities of >the DTD, rather than expanding them. A whole new syntax for a core part >of the language would make XML much more complicated than it is now. Right now, the options for including verification on top of the DTD structure look pretty ugly. Namespaces, schemas, and PIs pile on top of each other to drive documents into the ground. These sorts of extensions are going to sprout. I'd like to give them a good place to grow, a single document that provides a complete picture of a document model's content. Do you really want stacks of schemas floating around as well as the style sheets, scripts, link group documents, and the DTD? I don't feel the need to put _everything_ in one place - style sheets, scripts, and link information seem better managed outside this framework and don't cause endless repetition of the document structure. Does it really make sense to define the DTD once for XML 1.0 validation and define an entirely separate but redundant structure for data type validation? If SGML compatibility is your highest aspiration, it certainly may. To me, it doesn't make sense. 
Maybe the XML-Data crew will get their ubercombination to work. I'd rather start by getting DTDs made extensible and more easily managed first, and then add the schemas later, without requiring redundant structures. This doesn't seem like that bizarre a goal. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies From SimonStL at classic.msn.com Fri May 22 02:46:01 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:28 2004 Subject: W3C XML activities Message-ID: Would it be possible for someone involved to give a brief rundown on XML projects the W3C is actually working on? From what I can gather, XML, XLink, XPointer, and XML Namespaces are all projects of the XML Working Group under the XML Activity. XSL appears to have its own Working Group under the Style activity. RDF appears to have its own Working Group under the Metadata activity. XML-Data, WIDL, and a number of other proposals are just that - proposals. Is this an accurate picture? Is there a roadmap on the W3C site that I simply haven't found? Thanks. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at dvs1.informatik.tu-darmstadt.de Fri May 22 16:03:31 1998 From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: <199805220931.LAA21896@berlin.dvs1.tu-darmstadt.de> Simon St. Laurent wrote: > I'd like to announce the posting of an unofficial and incomplete proposal for > the representation of XML DTDs as XML documents at: > > http://members.aol.com/simonstl/xml > I think one thing that has gotten lost in the discussion as to whether Simon's proposal is even implementable is the usefulness of it if it is. It will always be the case that more people are interested in the data in XML documents than in the DTDs. Hence, the number of tools available for exploring the data and the standards relating to these tools (e.g. SAX and DOM) will always be greater than those relating to DTDs. Simon's proposal, assuming it can be realized, erases this difference, which is exceedingly useful to those of us who want to explore DTDs. While I think something like XML-Data is needed in the long run for data type description and so on, I agree with Simon that a less ambitious subset is very useful in the short run. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at dvs1.informatik.tu-darmstadt.de Fri May 22 16:14:21 1998 From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret) Date: Mon Jun 7 17:01:28 2004 Subject: parser for xml-data? Message-ID: <199805220905.LAA21599@berlin.dvs1.tu-darmstadt.de> Paul Prescod wrote: > Charles Frankston wrote: > > The ns, prefix, and src parameters to a namespace PI look a lot like > > attributes (although they are not in a formal sense). Since attributes in > > XML do not have to be in a particular order, it would certainly be > > surprising for people to discover that attributes in a namespace have to be > > a particular order. You're suggesting that the syntax be made harder to use > > in order to make the productions easier to author. I think this is a bad > > tradeoff. > > Note that the XML declaration has a required order of pseudo-attribute > occurrence. It would be best if the XML-family of language were > consistent. > > [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>' Chris has a point, but my intention was to make the production easier to read (benefits the user), not write (benefits the writer). I've been wading through a lot of specs recently and I tend to appreciate anything that simplifies them. Maybe I'll change my tune when I'm further along the learning curve, but right now, the added flexibility on an instruction I won't use much (even if all my XML documents use namespaces) isn't worth the added complexity. -- Ron Bourret
From papresco at technologist.com Fri May 22 16:40:05 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Critique - XML DTDs to XML docs References: Message-ID: <35658E59.F0B90F34@technologist.com> Simon St.Laurent wrote: > > For now, because this is simply a 'representation', I expected the same rules > to hold for these DTDs with regard to document syntax as apply now. Maybe I > should have written a complete section on behavior; maybe I will. If I understand this correctly, then you are saying that at first you allow no extensions, just as DTDs allow no extensions. > Here we begin to see where the communications breakdown has set in, and maybe > we can unravel it. You see entities as modifying the rules of the 'fundamental > parse'. I see entities as riding along on the rules of the 'fundamental > parse' to make their changes. To me, the basic rules for parsing establish a > syntax for documents, including a set of rules for including entities. I misspoke. I meant that DTDs change the fundamental parse. This is not a very interesting observation: given the same document instance, and two different DTDs, you can get two radically different document parse trees. Is there any good reason that the ability to change the parse tree should be conflated with the responsibility for verifying schema-compliance, as they are in DTDs? Is there any good reason to perpetuate this conflation in your proposed replacement for DTDs? > Using > an entity is just taking advantage of those rules, _not_ modifying them in any > way. 
I see the distinction between expanding an entity and including (or > transcluding) information from a link as a minor technical skirmish that > should have been settled long ago, not a major battle over the fundamental > shape of documents. It may or may not be a major battle over the fundamental shape of documents, but it *is* a major battle over the fundamental shape of XML. The differences between textual inclusion and structural inclusion are quite deep and subtle. They affect hyperlinking, well-formedness, validity, character set issues and almost everything else in the XML specification. At some level of abstraction the distinction may be minor, but in the details of the specification, it is humungous. > Maybe that's what I get for working in HyperCard and HTML all these years... HTML doesn't really support either. I don't know HyperCard. > >Verification should be handled at a different level and by a different > >piece of software than the parser. > > I think this philosophy reflects SGML's heritage in document management. I'm not sure if you understand that I am suggesting a model that is fundamentally different from SGML's. > Then they could just use a PI to tell their application to check their > well-formed document ("Who the hell needs a DTD anyway? Like who came up with > _that_?") against this schema. Something like: > > > > This doesn't really do any harm; part of the joy of well-formed documents is > that you can chuck all the rest of the goodies in XML and build it yourself. > > Still, to me, this loses a lot. I'd like to see developers use DTDs, and I > think that describing the structure of these documents is important for many > reasons: easier use with editors, easier-built storage systems, and, of > course, error-checking. Like anything else, I think that people should use the "standard mechanisms" when that makes sense, and avoid them otherwise. 
I believe that DTDs are inappropriate for some types of data, and would rather not see them used in those cases. Some data types are not supposed to be edited in editors. Sometimes the storage system and error-checking are better driven by non-DTD schema languages. > Making DTDs extensible in clearly defined ways (and not your > critter) seems like a good way to bring these folks in. By providing a > structure that developers can use to ensure interoperability of their > documents, as well as extend to include data-type verification, I think we'd be > able to keep more developers in the habit of using DTDs. Data type verification is only one way that DTDs fall short for some types of applications. Also, I don't believe that data type verification requires "extensible DTDs". > Does it really make sense to define the DTD once for XML 1.0 validation and > define an entirely separate but redundant structure for data type validation? > If SGML compatibility is your highest aspiration, it certainly may. To me, > it doesn't make sense. The element and attribute type verification provided by DTDs *is* data type validation. I am not arguing that data type validation should be separate from it. I am arguing that both element type validation and all other types of verification should be completely separate from issues of parsing, entity management and so forth. > Maybe the XML-Data crew will get their ubercombination to work. I'd rather > start by getting DTDs made extensible and more easily managed first, and then > add the schemas later, without requiring redundant structures. This doesn't > seem like that bizarre a goal. I don't know what you mean by "add schemas later." DTDs are schemata. That their verification responsibilities are mixed up with their parsing responsibilities is unfortunate, but over the long term, correctable. But only if we recognize that it is done in the wrong place and choose not to perpetuate it in new schema languages like the one you propose. 
Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Fri May 22 16:59:25 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: Ron Bourret wrote: >Hence, the number of tools available for exploring the data and >the standards relating to these tools (e.g. SAX and DOM) will >always be greater than those relating to DTDs. Simon's proposal, >assuming it can be realized, erases this difference, which is >exceedingly useful to those of us who want to explore DTDs. >While I think something like XML-Data is needed in the long >run for data type description and so on, I agree with Simon that >a less ambitious subset is very useful in the short run. This cuts right to the chase, making the clearest argument for a less ambitious proposal that moves XML DTDs to document syntax. I'm considering rewriting the proposal less as a concrete proposal for syntax (which just seems to generate endless arguments, and which I'm not an expert at anyway) and more as an exploration of the implications of (and possibilities for) using XML document syntax for DTDs. As an outsider, albeit one who's digging through _The SGML Handbook_, I'm fairly concerned that a lot of the specs are attempting to do too much. 
I'm very glad, for instance, that XML-Linking was broken down into a spec for Linking and a spec for XPointers. XML-Data seems to me to do far too much in one place, providing at once an XML syntax for DTDs and a powerful set of data schemas. While I can't see the battles directly, XML-Data seems to have a lot of people rather annoyed for a variety of widely different reasons. Still, in the long run, many projects need its capabilities. Slowing down seems like one answer to this. Build the foundation and then build the skyscraper. I'm not totally delighted with the current foundation - DTD syntax - but I can imagine it being a whole lot worse. Maybe the speed of the standards process is too slow for developers who want it all _now_, but I think we'd do well to be less ambitious but improve on the foundations we have now. Reducing the number of ways to achieve the same result is a good way to do this, as is making the foundation extensible. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Fri May 22 17:15:35 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Critique - XML DTDs to XML docs Message-ID: >If I understand this correctly, then you are saying that at first you >allow no extensions, just as DTDs allow no extensions. DTDs aren't allowed to change document syntax - the use of tags for elements and attributes, the use of '&' for general entities, etc. The same rules apply in this representation, as I will state more explicitly. 
This representation would, however, allow _additional_ rules - with data schemas the first issue to be addressed. This really isn't that difficult. >Is there any good reason that the ability to change the parse tree should >be conflated with the responsibility for verifying schema-compliance as >they are in DTDs. Is there any good reason to perpetuate this conflation >in your proposed replacement for DTDs? I'd like to see a structure that's: a) easily interpreted, edited, and stored, without the need for multiple toolsets b) capable of containing a complete set of information about a document, including structure and data What's so difficult about that? I can't think of any good reason (besides SGML compatibility) to oppose either of those goals. Why on earth would I want to keep multiple sets of document descriptions (schemas, whatever) around that share the task of defining the same document set? It seems like a management mess, a processing mess, a waste of bandwidth and storage because of redundant information, and just generally a nuisance. Making DTDs extensible is a good way, in my view, to address this issue, and several others. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Bryan_Gilbert at pml.com Fri May 22 17:25:11 1998 From: Bryan_Gilbert at pml.com (Bryan Gilbert) Date: Mon Jun 7 17:01:28 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: > -----Original Message----- > From: Simon St.Laurent [SMTP:SimonStL@classic.msn.com] > Ron Bourret wrote: > ... cut ... 
> >While I think something like XML-Data is needed in the long > >run for data type description and so on, I agree with Simon that > >a less ambitious subset is very useful in the short run. > > This cuts right to the chase, making the clearest argument for a less > ambitious proposal that moves XML DTDs to document syntax. I'm > considering > rewriting the proposal less as a concrete proposal for syntax (which > just > seems to generate endless arguments, and which I'm not an expert at > anyway) > and more as an exploration of the implications of (and possibilities > for) > using XML document syntax for DTDs. >... cut .. ---------------- YES! The discussion was losing direction and momentum. Both Ron's and Simon's comments are right on. People will become more familiar with XML syntax than DTD syntax and any tools that work with XML should be capable of working with the document type definition (which may be the DTD or may be the subset of XML-Data that deals with the document structure.) I hope this idea grows. What about a style sheet like transformation to/from DTD syntax to XML? (Sorry that's a naive question.) But if such a transformation could be defined then any XML tool that wished to display or work with the document type definition could do so using XML instead of DTD syntax. Bryan Gilbert email: (mailto:bryan_gilbert@pml.com) xml-dev: A list for W3C XML Developers. 
From mintert at irb.informatik.uni-dortmund.de Fri May 22 17:51:36 1998 From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert) Date: Mon Jun 7 17:01:28 2004 Subject: Q: SW for parsing DTD's Message-ID: <199805221551.RAA01630@brown.informatik.uni-dortmund.de> Hi! For the project I'm currently working on, I need to parse DTDs and query information like this: - which elements are valid in a given context? - which elements can contain an element of a certain type? Now I'm looking for software that does the parsing for me (Java classes are preferred) and provides a high-level interface. The best I've found is IBM's "XML for Java". Everything I need is in the class com.ibm.xml.parser.DTD. But unfortunately the license agreement ends 90 days after downloading :-( Since I'm writing a program for my thesis it must run for more than 90 days ;-) Do you know of any software that has similar functions to XML4J and that is free for unlimited (non-profit) use? Thanks. Bye, Stefan. PS: I've already checked the docs of Ælfred and XP, but that isn't exactly what I'm looking for. XP looks fine if I use the low-level interface and build on top of it; but that's still some work to do... +-----------------------------------------------------------+ Stefan Mintert UniDo: mintert@irb.informatik.uni-dortmund.de private: stefan@mintert.com WWW: http://www.informatik.uni-dortmund.de/~sm/ +-----------------------------------------------------------+ "let the music keep our spirits high..." (Jackson Browne) xml-dev: A list for W3C XML Developers.
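To make the two queries in the message above concrete, here is a rough Python sketch that reads element declarations out of DTD text and answers "which elements may appear inside X" and "which elements can contain X". It is not the XML4J API or any other real library; the function names and the sample DTD are invented for illustration. It also assumes a DTD with no parameter entities, no conditional sections and no comments containing declarations, and it ignores order and occurrence, so it cannot say what is valid at a particular position in a content model.

```python
import re

# Matches <!ELEMENT name content-spec>; assumes no parameter entities.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([\w.:-]+)\s+([^>]+)>")
NAME = re.compile(r"[A-Za-z_:][\w.:-]*")
# Keywords that are not child element names. Caveats: an element literally
# named EMPTY or ANY would be misread, and ANY is recorded as "no children".
KEYWORDS = {"EMPTY", "ANY", "PCDATA"}


def allowed_children(dtd_text):
    """Map each declared element to the set of element names in its content model."""
    table = {}
    for name, model in ELEMENT_DECL.findall(dtd_text):
        table[name] = {n for n in NAME.findall(model) if n not in KEYWORDS}
    return table


def possible_parents(table, child):
    """List the declared elements whose content model mentions `child`."""
    return sorted(parent for parent, kids in table.items() if child in kids)


# Hypothetical DTD, used only to exercise the two functions.
SAMPLE_DTD = """
<!ELEMENT figure (image, caption?)>
<!ELEMENT caption (#PCDATA | em)*>
<!ELEMENT image EMPTY>
<!ELEMENT em (#PCDATA)>
"""

table = allowed_children(SAMPLE_DTD)
print(sorted(table["figure"]))          # children that may appear in <figure>
print(possible_parents(table, "em"))    # elements that can contain <em>
```

Answering the first question properly ("valid in a given context") would mean compiling each content model into an automaton and tracking the current state, which is the part a full validating parser provides.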
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Fri May 22 17:56:42 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:29 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: >YES! The discussion was losing direction and momentum. As a significant participant, I have to agree with you there, and I'm glad to see a spark is returning. >What about a style sheet like transformation to/from DTD syntax to >XML? (Sorry that's a naive question.) But if such a transformation >could be defined then any XML tool that wished to display >or work with the document type definition could do so >using XML instead of DTD syntax. That's actually what I had in mind when I first started writing this proposal, which explains to some extent why I didn't go into detail explaining behavior. I've heard of a few SGML tools that do this - Marcus Carr noted that OmniMark includes something called DTD2DTD. I think an equivalent tool for making such transformations in XML would be a great start. Of course, we'd have to figure out what that transformation would look like, but that's all part of the fun. >I hope this idea grows. Glad to hear it! I certainly hope it grows, whether or not it has anything to do with this particular proposal. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at dvs1.informatik.tu-darmstadt.de Fri May 22 18:58:13 1998 From: rbourret at dvs1.informatik.tu-darmstadt.de (Ron Bourret) Date: Mon Jun 7 17:01:29 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: <199805220945.LAA21915@berlin.dvs1.tu-darmstadt.de> A couple of comments about the example on your Web page: IMAGE,CAPTION? CDATA #IMPLIED Content model should contain sub-elements, such as , not text. You don't want to force applications to parse text. On the other hand, attribute descriptions are probably better stored in attributes: The reason is that the possible choices are limited and work very well as enumerated attributes. Note that this is what XML-Data does. If you are defining some sort of XML-Data-Lite, XML-Data is probably a pretty good starting place. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
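Ron's point about holding the content model in sub-elements, and the occurrence choices in enumerated attributes, can be sketched in Python as follows. The element and attribute names used here (sequence, element, type, occurs) are hypothetical; they come neither from Simon's proposal nor from XML-Data. The sketch only handles a flat comma-separated sequence such as IMAGE,CAPTION? and does not attempt groups or choices.

```python
import xml.etree.ElementTree as ET

# Occurrence suffixes from DTD syntax mapped to enumerated attribute values.
# The attribute vocabulary is an assumption made for this example.
OCCURS = {"?": "optional", "*": "zero-or-more", "+": "one-or-more"}


def sequence_to_xml(model):
    """Turn a flat DTD sequence like 'IMAGE,CAPTION?' into sub-elements."""
    seq = ET.Element("sequence")
    for part in model.split(","):
        part = part.strip()
        occurs = "required"
        if part and part[-1] in OCCURS:
            occurs = OCCURS[part[-1]]
            part = part[:-1]
        ET.SubElement(seq, "element", {"type": part, "occurs": occurs})
    return seq


seq = sequence_to_xml("IMAGE,CAPTION?")
print(ET.tostring(seq, encoding="unicode"))
for child in seq:
    # An application reads structure directly; no text parsing is needed.
    print(child.get("type"), child.get("occurs"))
```

Once the model is stored this way, an application can walk it with ordinary element and attribute access instead of re-parsing a string, which is the argument being made against keeping the content model as text.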
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mcc at arbortext.com Fri May 22 19:03:31 1998 From: mcc at arbortext.com (Mike Champion) Date: Mon Jun 7 17:01:29 2004 Subject: Q: SW for parsing DTD's In-Reply-To: <199805221551.RAA01630@brown.informatik.uni-dortmund.de> Message-ID: <98May22.125851edt.26881@thicket.arbortext.com> At 11:51 AM 5/22/98 -0400, Stefan Mintert wrote: >For the project I'm currently working on, I need to parse DTD's and query >information like this: > >- which elements are valid in a given context? >- which elements can contain an element of a certain type? > >Do you know of any software that has similar functions as XML4J and that is >free for unlimited (non-profit) use? See http://www.docuverse.com/personal/freedom/index.html I don't have personal experience with this package, but from xml-dev and www-dom postings, I know that FREE-DOM uses the SAX interface to support any of several Java XML parsers, and produces a set of data structures that can be accessed with the DOM APIs. So in principle, you should be able to do more or less what you've done with XML4JAVA with FREE-DOM. I'm also excited to hear about your project because the DOM WG and mailing lists have discussed whether/when the DOM should support XML validation APIs. I've thought that this would be a good APPLICATION for the DOM, e.g., a XMLValidate JavaBean that uses the DOM to access the DTD information and the document structure in order to answer the kinds of questions you pose, and more generally, to allow the kinds of dynamic validation that XML editors will need to do. Good luck, Mike Champion xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Fri May 22 20:14:56 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:29 2004 Subject: Proposal Announcement - XML DTDs to XML docs Message-ID: Ron Bourret wrote: >A couple of comments about the example on your Web page: Okay, I admit it. I wrote the examples at the beach. I tend to be the kind of person who prefers to see content stored as text - empty elements as a way of life doesn't fit the way I think about data. But for the most part, you're right, and thanks for the suggestion. I had a difficult time deciding whether to leave the examples in the proposal, because I'm quite aware that the folks working on XML-Data have a lot more experience than I do. I left them in there primarily to make it clear to readers familiar with XML-Data that this proposal was _not_ XML-Data, hoping that they would see that it comes from a different orientation (at least as far as I can gather from the XML-Data proposal.) The syntax is less important to me than the shift to a document format. In the next version, the examples will likely disappear, at least for now, as the document shifts from being a proposal to an examination of the implications of such a model. All suggestions for the next version are still welcome - anyone wanting to participate in the development of this proposal is welcome to contact me. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer / Cookies xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Fri May 22 20:43:50 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:01:29 2004 Subject: Proposal Critique - XML DTDs to XML docs References: Message-ID: <3565C7A5.9BBE99CA@technologist.com> Simon St.Laurent wrote: > > >Is there any good reason that the ability to change the parse tree should > >be conflated with the responsibility for verifying schema-compliance as > >they are in DTDs. Is there any good reason to perpetuate this conflation > >in your proposed replacement for DTDs? > > I'd like to see a structure that's: > a) easily interpreted, edited, and stored, without the need for multiple > toolsets > b) capable of containing a complete set of information about a document, > including structure and data The word "structure" is too vague for me to be able to argue for or against. Are you talking about a single *language* (or specification) that incorporates a) instance syntax b) textual replacement c) external text embedding d) extensible validation XML 1.0 incorporates all of them. I think that that made sense for XML 1.0, in order to be SGML compatible, but for future versions I would rather see the first three completely separate from the fourth. The reason I feel that the last should be separated is that the types of validation (or "verification") that people have to do can be quite varied. XML made the DTD optional for this reason. I don't see that making the XML specification substantially larger with an alternative encoding for DTDs can really make that specification simpler. 
> Why on earth would I > want to keep multiple sets of document descriptions (schemas, whatever) around > that share the task of defining the same document set? It seems like a > management mess, a processing mess, a waste of bandwidth and storage because > of redundant information, and just generally a nuisance. > > Making DTDs extensible is a good way, in my view, to address this issue, and > several others. That sounds attractive, and I encourage you to try and make it work. If you succeed, I will be happy to use it. But, to be honest, I don't think it will succeed. It's like in the early days of computer programming when people thought that it was possible to invent a single, "extensible" programming language (or "meta programming language") that would serve all needs. Every attempt didn't quite do everything that everybody needed, and the harder people worked to make languages "extensible", the more complex (C++) or merely unpopular (Lisp) the language became. I personally don't believe that one extensible schema/DTD language can serve all of our diverse validation needs. The set of "extensions" will be unlimited and approach the complexity of a full programming language. Look at RDF schemata. They are miles and miles away from DTDs. I've had document types where I was modeling OO systems and wanted to verify things like "base class is not inherited more than once." Some OO-modeling schema language would handle that, but DTDs (even extensible ones) could never do so. I tend to think that a strategy that is more likely to be successful is one that layers schema languages. At the bottom level you have something like XML DTDs without all of the stuff related to entities and notations (in XML element notation). That layer might include data type validation. In levels above that you have RDF and other schemata that are more interested in relationships than in positional occurrence. It seems like you are interested in that bottom layer schema. 
I think that it would be good to formalize an XML element notation for the bottom layer. But if you try to make it a replacement for DTDs, then it must do everything that DTDs do and inherit all of the problems that the conflation of features in DTDs causes. > What's so difficult about that? I can't think of any good reason (besides > SGML compatibility) to oppose either of those goals. It is quite likely that SGML will soon be changed to allow you to use whatever notation you want for XML DTDs. SGML compatibility is not a problem. The question is what is the right design. You can make a slightly better version of a bad design, or you can try to start again with a good design. Let me ask this plainly: Does it make sense a) that textual substitutions should be specified in a part of a document called a "document type definition". b) that the "document type definition" should also be responsible for declaring media types and attaching them to non-XML entities. c) that the language for verifying element and attribute occurrence must be in the same specification (XML 1.x) as that for creating elements and attributes themselves? I don't think that those three things (among others) make sense anymore. Hence, I don't think that inventing a new notation for this inappropriate concept is a good idea. If we are to replace DTDs, let us replace them with something simpler and more specific to the task of validation, instead of transliterating them into another syntax, warts and all. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "A writer is also a citizen, a political animal, whether he likes it or not. But I do not accept that a writer has a greater obligation to society than a musician or a mason or a teacher. Everyone has a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate xml-dev: A list for W3C XML Developers. 
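As a small illustration of the layering argued for in the message above, here is a Python sketch of a check that sits above DTD-style structural validation: it reports any class that lists the same base class more than once. The document vocabulary (model, class, base, the name and ref attributes) is invented for this example, and reading "not inherited more than once" as "no repeated direct base" is an assumption about what Paul meant.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def repeated_bases(xml_text):
    """Return {class name: [base refs listed more than once]} for offending classes."""
    root = ET.fromstring(xml_text)
    problems = {}
    for cls in root.iter("class"):
        counts = Counter(base.get("ref") for base in cls.findall("base"))
        dupes = sorted(ref for ref, n in counts.items() if n > 1)
        if dupes:
            problems[cls.get("name")] = dupes
    return problems


# Hypothetical instance: structurally fine for a content model such as
# (base*), yet it breaks the "no repeated base" rule for Square.
SAMPLE = """
<model>
  <class name="Shape"/>
  <class name="Square"><base ref="Shape"/><base ref="Shape"/></class>
  <class name="Circle"><base ref="Shape"/></class>
</model>
"""

print(repeated_bases(SAMPLE))
```

A content model can say that a class holds zero or more base elements, but it has no way to compare their attribute values with one another, so a rule like this ends up in a separate layer whatever notation the bottom layer uses.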
From bckman at ix.netcom.com Fri May 22 20:58:28 1998 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:01:29 2004 Subject: Proposal Critique - XML DTDs to XML docs Message-ID: <01bd85b4$43645be0$LocalHost@uspppBckman> >>Every attempt didn't quite do everything that everybody needed, and >>the harder people worked to make languages "extensible", the more complex >>(C++) or merely unpopular (Lisp) the language became. I think this is a good point and an excellent argument for Simon's proposal. This language will be a very useful tool for a subset of users who dont want the hassel of learning a DTD language. There are always going to be things that a DTD is necessary for, but that doesnt mean that this is the only tool one should use. On the other hand if a language tries to be something to every one it is in danger of becoming like a swiss army knife. It will perhaps do the job, but not as well as a specialized tool. Frank -----Original Message----- From: Paul Prescod To: Xml-Dev (E-mail) Date: Friday, May 22, 1998 2:46 PM Subject: Re: Proposal Critique - XML DTDs to XML docs xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Fri May 22 22:51:34 1998 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:01:29 2004 Subject: Q: SW for parsing DTD's Message-ID: <005701bd85c2$53c43880$2ee044c6@arcot-main> >>For the project I'm currently working on, I need to parse DTD's and query >>information like this: >> >>- which elements are valid in a given context? >>- which elements can contain an element of a certain type? >> >>Do you know of any software that has similar functions as XML4J and that is >>free for unlimited (non-profit) use? > >See http://www.docuverse.com/personal/freedom/index.html I am afraid that Free-DOM currently does not provide access to DTD information primarily because SAX currently does not. DTD support will be provided in the near future (< 2 months). Don Park http://www.docuverse.com/personal/index.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri May 22 23:01:47 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:01:29 2004 Subject: Comments, parsers, XPointers In-Reply-To: <198001011834.NAA00512@unready.megginson.com> References: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net> <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net> Message-ID: <3.0.1.16.19980522215632.3f3f7e10@pop3.demon.co.uk> At 13:34 01/01/80 -0500, David Megginson wrote: > >In the longer term, what we need is an official definition of an XML >information set, specifying (for example) that reporting comments is >optional, while reporting the start and end of elements is required. >Once such a beastie exists, many vexing questions about (and >inconsistencies among) the DOM, XPointers, SAX, XSL, etc. will >disappear. > Agreed. But I think we need it in the shorter term :-) Over the last day Eliot Kimber has spent a lot of time (thanks :-) helping to clear up some of my ideas about addressing and linking. (I have been much concerned about how to build software that is unambiguous). I believe I'm quoting Eliot correctly in saying that it is impossible to implement Xpointer rigorously until we agree on the abstract model of an XML document. I suspect we don't, all, yet... P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri May 22 23:04:15 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:01:29 2004 Subject: LISTRIVIA (was Re: Empty End Tags Considered Confusing) In-Reply-To: <199805191516.LAA07380@ruby.ora.com> Message-ID: <3.0.1.16.19980522215452.34af4846@pop3.demon.co.uk> At 11:16 19/05/98 -0400, Chris Maden wrote: >I think it was xml-dev that was discussing this; the archive is down >(Henry, are you listening?). I have been away in Paris...so I haven't made contact recently with Henry. Do you mean the hypermail server on which the HTML resides is down? or that the hypermailer isn't adding to the archive? In any case poor Henry can't do anything himself as he doesn't actually run the machine - even though he looks after the large amount of work the list generates. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri May 22 23:06:48 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:01:29 2004 Subject: W3C XML activities In-Reply-To: Message-ID: <3.0.1.16.19980522215513.328f4328@pop3.demon.co.uk> At 00:45 22/05/98 UT, Simon St.Laurent wrote: >Would it be possible for someone involved to give a brief rundown on XML >projects the W3C is actually working on? > >From what I can gather, XML, XLink, XPointer, and XML Namespaces are all >projects of the XML Working Group under the XML Activity. XSL appears to have >its own Working Group under the Style activity. RDF appears to have its own >Working Group under the Metadata activity. XML-Data, WIDL, and a number of >other proposals are just that - proposals. > >Is this an accurate picture? Is there a roadmap on the W3C site that I simply >haven't found? > >Thanks. I think this would be very valuable. I have just come back from Paris where John Bosak gave exactly such a report. I intend to comment on some of Paris, but specifically excluded Jon's report in case I got something wrong. As always it was very carefully presented. There were no surprises to those who talk regularly about these things but it formed up a good deal I was unclear about. I know Jon is very busy - I don't know whether others are prepared to summarise? P. > >Simon St.Laurent >Dynamic HTML: A Primer / XML: A Primer / Cookies > > >xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Fri May 22 23:26:43 1998 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 17:01:30 2004 Subject: Proposal Critique - XML DTDs to XML docs Message-ID: I thought I was arguing with a conservative, but now I see that you're a radical! Very impressive, and I think we're reaching some points where we (at last) both agree. > a) instance syntax > b) textual replacement > c) external text embedding > d) extensible validation > >XML 1.0 incorporates all of them. I think that that made sense for XML >1.0, in order to be SGML compatible, but for future versions I would >rather see the first three completely separate from the fourth. The reason >I feel that the last should be separated is that the types of validation >(or "verification") that people have to do can be quite varied. Actually, I'd like to see all four of these separated, except for b and c, which I feel should at least use common syntax. a is the basic XML document syntax, b and c provide entity and linking-type services, and d is today's point of contention. 
Still, this breakdown of the standard is a very good start.

>XML made
>the DTD optional for this reason. I don't see that making the XML
>specification substantially larger with an alternative encoding for DTDs
>can really make that specification simpler.

You're right here as well. I'd recommend disconnecting d, the validation, from the core standard (which I really think is just a). Making validation a separate standard would open the way to a re-examination of validation that wasn't deeply intertwined with existing syntax, and could well lead the way to a syntax better than either the current standard or the XML document syntax. At least it would keep the core standard from growing warts.

>I tend to think that a strategy that is more likely to be successful is
>one that layers schema languages. At the bottom level you have something
>like XML DTDs without all of the stuff related to entities and notations
>(in XML element notation). That layer might include data type validation.
>In levels above that you have RDF and other schemata that are more
>interested in relationships than in positional occurrence.

I'm very cautious right now about adding too many layers of schemas, mostly because they seem to require very high levels of redundancy. (XML-Data, a DTD, and an RDF representation of a single document? And they all have to stay consistent? Whoa...) Still, a layered model like this might make sense in the long run, provided that we can stay away from truckloads of redundant information.

>But if you try to make it a replacement for DTDs, then it must do
>everything that DTDs do and inherit all of the problems that the
>conflation of features in DTDs causes.
>...
>The question is what is the right design. You can make a slightly
>better version of a bad design, or you can try to start again with a good
>design.

I fear this is a symptom of the conservatism of the proposal, which began merely as an effort to map DTD syntax to an XML document syntax.
It inherits the warts along with the beauty. A more radical solution would excise many of the warts, but didn't seem like a wise idea given the conservatism of many in the XML community. Smaller steps on the way to a radical change seemed more appropriate in this environment, but that may not be the best way to go.

>Does it make sense
>
>a) that textual substitutions should be specified in a part of a document
>called a "document type definition".

No. Entities were what bothered me most about XML DTDs in the first place. I'd move them elsewhere quite happily.

>b) that the "document type definition" should also be responsible for
>declaring media types and attaching them to non-XML entities.

No. This seemed like a significant departure from current practice on the Web (which I support strongly) - MIME types.

>c) that the language for verifying element and attribute occurrence must
>be in the same specification (XML 1.x) as that for creating elements and
>attributes themselves?

Must be? No. Could be, and would offer significant advantages? Yes. I still think XML document syntax has significant advantages over current syntax. Could there be a better syntax? Of course.

>If we are to replace DTDs, let us replace them
>with something simpler and more specific to the task of validation,
>instead of transliterating them into another syntax, warts and all.

Good idea. Who, where, when? I'll buy beer...

I think I'll continue the proposal as an implications document, while looking out for something more radical. In the meantime, I've got a lot of XML to build!

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From papresco at technologist.com Sat May 23 06:53:52 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:30 2004
Subject: Comments, parsers, XPointers
References: <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net> <3.0.3.32.19980520124217.00bd928c@nexus.polaris.net> <3.0.1.16.19980522215632.3f3f7e10@pop3.demon.co.uk>
Message-ID: <356656A8.DABB51C3@technologist.com>

Peter Murray-Rust wrote:
>
> Agreed. But I think we need it in the shorter term :-) Over the last day
> Eliot Kimber has spent a lot of time (thanks :-) helping to clear up some
> of my ideas about addressing and linking. (I have been much concerned about
> how to build software that is unambiguous). I believe I'm quoting Eliot
> correctly in saying that it is impossible to implement XPointer rigorously
> until we agree on the abstract model of an XML document. I suspect we
> don't, all, yet...

I think that Eliot understands this better than most people, and it's a lesson I'm glad I didn't have to learn for myself. HyTime was mostly rewritten from scratch between versions 1 and 2 because without the formalism underlying the markup (the "low-level semantics"), the first version was built on a base of sand. By the time DSSSL and HyTime 2 came around, the sand was starting to crumble, and the grove is the concrete basis that the new family of standards used.

I really hope that we can get consensus on this issue so that the W3C will move quickly on it, before the Web standards follow the same path and must all be rewritten (and unified) for version 2.0. I personally believe that this should be a higher priority than XSL, the DOM or anything else.
What's the use in rushing to build an apartment subdivision before laying the foundation?

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or
not. But I do not accept that a writer has a greater obligation
to society than a musician or a mason or a teacher. Everyone has
a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate

From papresco at technologist.com Sat May 23 07:01:20 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:30 2004
Subject: Proposal Critique - XML DTDs to XML docs
References:
Message-ID: <35665863.ED94D88B@technologist.com>

Simon St.Laurent wrote:
>
> Actually, I'd like to see all four of these separated, except for b and c,
> which I feel should at least use common syntax.

Well, I'm not that much of a radical yet. I am not yet convinced that structural inclusion (e.g. get that tree from that document and place it *here*) can completely and conveniently replace XML/SGML's text substitution model. I'm ready to be convinced, however. Once XLink is done and widely implemented, we will have an opportunity to pit the two methods head to head against each other, and the market will decide.

It's because the jury is still out on that issue that I argue(d) for entities etc. in XML. I wasn't willing to bet XML's usability on the faith that XLink would come along and would do everything that entities do. Presumably the people who could actually vote felt that way also.
> I'm very cautious right now about adding too many layers of schemas, mostly
> because they seem to require very high levels of redundancy. (XML-Data, a
> DTD, and an RDF representation of a single document? And they all have to
> stay consistent? Whoa...)

Well, XML allows you to skip the DTD. XML-Data is supposed to verify everything the DTD does, so you can skip the verification (schema) parts of the DTD if you use XML-Data. You may still need the DTD for entities and notations. (XML-Data does allow you to declare entities, which makes no sense to me, so I'll just ignore it and hope it goes away.)

I believe that by the time RDF and XML-Data are done, there should be little overlap between them. RDF schemata constrain relationships between elements of particular types. XML-Data constrains where they can occur. So there is still a question about managing multiple files and layers, but it should NOT be a question of duplication of services.

I try to think of the situation as analogous to stylesheets. You probably wouldn't have one RDF schema for every XML-Data schema you have, and one XML-Data schema for every document you have. (You can't help but have one DTD per document...that's the way XML is defined.) Instead, you would use a single XML-Data schema for dozens of documents, and perhaps a single RDF schema for dozens of document *types*.

There is some conceptual overlap here with architectural forms, as they are also meant to be layered in the way I describe. But archforms allow multiple layers of purely positional verification. You need some other kind of schema language to do link/relationship verification.

> I fear this is a symptom of the conservatism of the proposal, which began
> merely as an effort to map DTD syntax to an XML document syntax. It inherits
> the warts along with the beauty. A more radical solution would excise many of
> the warts, but didn't seem like a wise idea given the conservatism of many in
> the XML community. Smaller steps on the way to a radical change seemed more
> appropriate in this environment, but that may not be the best way to go.

Well, there are no more conservatives anymore. ISO seems willing to go far beyond what you or I am proposing. As I understand it, they are moving to a situation where DTDs can be in *any notation* whatsoever. I could have understood it wrong, because I was not at the meeting, but as I understand it, you could invent a new binary notation for DTDs, and it could turn on tag omission and shortref features that would make your SGML documents unreadable to anyone without a parser for your binary DTD notation.

If you approach it the way I am suggesting, then that isn't a problem. If schemata are separated from syntax, then Microsoft may not be able to validate documents created with Netscape's editor, but at least they will be able to *read* them. Having a proprietary schema language/engine would be just like having a proprietary style language/engine -- not ideal, but sometimes necessary.

I should mention that I have mostly cribbed this "ignore DTDs and work directly on schemata" approach from various people. The only one whose name I can think of right now is Eliot Kimber, who wants to move away from DTDs for different reasons than you do (syntax) or I do (conflation of features). He dislikes the fact that they are controlled by, and can be overridden by, the document. Although this also bothers me, it bothers him a lot more. See:

http://www.sil.org/sgml/n1957Note.html

Of course the non-DTD schemata that Eliot is most interested in are architectures, which look a lot like DTDs.

Also, http://www.sil.org/sgml/thompsonSchemata.html is useful. Henry uses the phrase "document structure definition" to avoid getting stuck in the "DTD" rut.

> >c) that the language for verifying element and attribute occurrence must
> >be in the same specification (XML 1.x) as that for creating elements and
> >attributes themselves?
>
> Must be? No. Could be, and would offer significant advantages? Yes. I
> still think XML document syntax has significant advantages over current
> syntax. Could there be a better syntax? Of course.

I wasn't asking about syntax, but about actual specifications. Should the validation language be specified in the same standards document as the language syntax? I think that we agree that it should not.

> >If we are to replace DTDs, let us replace them
> >with something simpler and more specific to the task of validation,
> >instead of transliterating them into another syntax, warts and all.
>
> Good idea. Who, where, when? I'll buy beer...

Well, I think that this is what XML-Data is about, but it is only a rough sketch. I also think that the W3C is supposed to create a working group that will address these sorts of issues. We could work out a concrete proposal in this mailing list, or offline, but I'm not sure if it would move us beyond all of the other DTDs for DTDs.

In other words, I think that your basic idea will eventually get implemented, hopefully as modified according to my comments. I don't know how to expedite that, however.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or
not. But I do not accept that a writer has a greater obligation
to society than a musician or a mason or a teacher. Everyone has
a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate

From papresco at technologist.com Sat May 23 07:13:24 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:01:30 2004
Subject: Q: SW for parsing DTD's
References: <005701bd85c2$53c43880$2ee044c6@arcot-main>
Message-ID: <35665B36.AFC3BD9A@technologist.com>

Don Park wrote:
>
> I am afraid that Free-DOM currently does not provide access to DTD
> information primarily because SAX currently does not. DTD support will be
> provided in the near future (< 2 months).

I would like to take this opportunity to stress the point that there are two ways of unifying DTD and XML document instance processing (or *any* processing): at the syntactic level and at the API level. The SGML grove formalism does the latter: given a grove with information about both the DTD and the instance (such as that provided by a version of Jade that should be coming out early next month), you can use the same functions or methods to navigate that information. Free-DOM will also provide this service in a couple of months. I believe that Jumbo already converts DTDs into something XML instance-ish internally.

If every tool did this, then it wouldn't matter whether DTDs were in instance syntax or DTD syntax. Yes, I recognize that this means that every tool should have an "extra" parser and grove/DOM builder. Yes, there is room for argument as to whether that is an acceptable price to pay for DTD's compact and distinctive syntax. But the important thing to recognize is that this is an option.
If your grove/DOM API server doesn't provide access to DTDs and instances through the same API, then its creator can correct that in that API server without the W3C changing a single byte of the XML specification.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"A writer is also a citizen, a political animal, whether he likes it or
not. But I do not accept that a writer has a greater obligation
to society than a musician or a mason or a teacher. Everyone has
a citizen's commitment." - Wole Soyinka, Africa's first Nobel Laureate

From SimonStL at classic.msn.com Sat May 23 15:40:24 1998
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 17:01:30 2004
Subject: Proposal Critique - XML DTDs to XML docs
Message-ID:

>I am not yet convinced that
>structural inclusion (e.g. get that tree from that document and place it
>*here*) can completely and conveniently replace XML/SGML's text
>substitution model. I'm ready to be convinced, however. Once XLink is done
>and widely implemented, we will have an opportunity to pit the two methods
>head to head against each other, and the market will decide.

I think letting the market decide here is a wise idea, but I remain concerned that XLink hasn't yet stepped up to the challenge. Given the variety of interpretations surrounding the behavior of EMBED, and the number of people who told me last time I asked that EMBED _wasn't_ about text substitution, I'm not sure the market will get a chance to decide. I hope this is made explicit by the time the standard reaches stability.
>I believe that by the time RDF and XML-Data are done, there should be
>little overlap between them. RDF schemata constrain relationships between
>elements of particular types. XML-Data constrains where they can occur. So
>there is still a question about managing multiple files and layers, but it
>should NOT be a question of duplication of services.
>
>I try to think of the situation as analogous to stylesheets.

I must admit I remain skeptical; RDF and XML-Data seem to want to do too much of the same thing at this point. It doesn't help that they look radically different. I don't mind telling people to use one or the other, but using both seems like a lot. Again, I hope this is cleared up by the time the standards reach stability.

The analogy to stylesheets unfortunately makes me wonder if we're going to see documents using both CSS and XSL, leaving applications to puzzle out which to use and how/if they should interact. A uniform standard defining how resources (schemas, DTDs, stylesheets, extended link documents, etc.) should be linked to documents and with what priority would go a long way toward easing my concerns. The current soup of PIs, DOCTYPE, and XLink elements is messy at best.

>Well, there are no more conservatives anymore. ISO seems willing to go far
>beyond what you or I am proposing. As I understand it, they are moving to
>a situation where DTDs can be in *any notation* whatsoever.

Looks like I'm turning into the conservative. *Any notation* sounds like an expansion into the world of 'how many options can you conceivably overload this system with?' One of the most important things I liked about XML was its insistence on single mechanisms with no options. It might make sense to break validation and schema issues into a 'family' of standards, connected by the uniform linking standard mentioned above, but at this point I think we have plenty of chaos.

>I wasn't asking about syntax, but about actual specifications. Should the
>validation language be specified in the same standards document as the
>language syntax? I think that we agree that it should not.

Completely right. This opens the way to a family of standards and hopefully will reduce the number of bullets whizzing by.

>Well, I think that this is what XML-Data is about, but it is only a rough
>sketch. I also think that the W3C is supposed to create a working group
>that will address these sorts of issues. We could work out a concrete
>proposal in this mailing list, or offline

I'd like to see that rough sketch grow into a usable set of standards. I wrote the 'Representation' proposal to get some public discussion started, and that seems to have succeeded. I'm hoping this weekend to modify the proposal, making clear that the syntax presented is only for illustration of implications. Perhaps it can serve as one of many springboards for this discussion in the more formal standards process, which I'll trust to work out the concrete proposal.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies

From peter at ursus.demon.co.uk Sun May 24 00:40:01 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:30 2004
Subject: Proposed process for DTDs in XML (was Re: Q: SW for parsing DTD's, etc.)
In-Reply-To: <35665B36.AFC3BD9A@technologist.com>
References: <005701bd85c2$53c43880$2ee044c6@arcot-main>
Message-ID: <3.0.1.16.19980523224835.0d570982@pop3.demon.co.uk>

At 01:14 23/05/98 -0400, Paul Prescod wrote:
>service in a couple of months. I believe that Jumbo already converts DTDs
>into something XML instance-ish internally.
Yes. And it's because I'd like this to be *syntactically* compatible with any other similar tool that I'm keen on exploring this idea.

>
>If every tool did this, then it wouldn't matter whether DTDs were in
>instance syntax or DTD syntax. Yes, I recognize that this means that every

Agreed. BUT we would have to agree on a set of elementTypes (DTD-speak) or property-set (Grove-speak). As a very simple example, do we use 'ATTRIBUTE' or 'Attribute' or AttName or whatever. Agreement here would go a long way towards interoperability.

I have raised this idea periodically and SimonStL has persevered over the last 2 weeks, and my feeling is that there is a critical mass of people who would like to see if something could be formalised out of this. I encouraged Simon to keep posting, so I take responsibility for the continued discussion. This posting includes a proposal as to how we go forward.

NOTE: Objections have been raised on the basis that:

- such a proposal is impossible and we are bound to fail. In my mind the operations we are proposing are simply syntactic transformations with an agreed vocabulary and therefore almost trivial and automatable. I think that these objectors think we propose something far more ambitious than we actually do.

- the proposal is not worthwhile, because the DOM/WG/XML-data/RDF/etc. are working on this and we are simply duplicating work that they are/will_be doing. This is a potentially valid objection, but I suspect that our effort will be valuable in any case. Even if later subsumed by other efforts, experience gained will be valuable and may help those efforts.

- the proposal is irresponsible because it will encourage people to do things they didn't ought to be doing. If we create a DTD syntax that is potentially extensible, it will actually be extended. I think that this community has shown itself responsible, and I shall suggest that our proposal is outlined in such a way as to encourage responsibility.
Motivation
----------

There seems to be a feeling that the current DTD syntax does not meet a number of needs. *** In all discussion that follows I am NOT suggesting that XML DTD syntax should be replaced ***. I hope that the suggestion will enhance it. Current limitations seem to be:

1. There is no mechanism for adding human-readable semantics to a DTD. The point has been made strongly on XML-DEV that DTDs must be documented, but the method of documentation is undefined. Even in a simple example like: it is impossible to know whether the comment is associated with the element, the attribute, both or neither. [I am in the last stages of releasing the next VHG DTD and feel very strongly the lack of a mechanism for documenting it.] This problem alone is enough to convince me that a mechanism would be desirable. We all agree that a DTD per se cannot carry semantic information but there is an urgent need to be able to associate semantic information with a DTD.

2. There is no mechanism for associating machine-readable semantics with a DTD. This is also a serious problem for me. I need to be able to link elements and attributes to behavior (at present through Java classes). I am not suggesting that we develop universal mechanisms for doing this, but that we choose a syntax which allows it.

3. There are no defined tools or other processes for analysing a DTD in XML format. The attraction of a DTD-in-XML is that we can use the very large number of XML tools for manipulating, filtering, rendering, etc. For example, JUMBO1 can create a tree from the DTD and can therefore express it as a JUMBO object just like a document. XSL and CSS could apply to DTDs as well as documents. Help could be created from DTDs, etc. It is possible that the XML property set (if it is anywhere defined) might meet part of this need. If so, and if it can be expressed in a tree structure, then it could be isomorphous with an XML representation of a DTD.

4. There is a need for an additional level of semantic validation using concepts not expressible in a DTD [cardinality, data typing, etc.] I think this is one of the more sensitive areas of this proposal because it encourages the creation of schemas as opposed to DTDs.

Proposal
--------

I think we can proceed by the methodology that David Megginson developed so successfully for SAX. The Megginsonic 'dialogue' consists of posting succinct questions and gathering feedback - as a result of the answers gained, a new round of questions is proposed. I personally do not intend to play this central role although I am happy to try to provide a subsidiary role as before.

It seems that we:

- need to agree on goals (and especially to limit their scope)
- define the limits of use of the resulting document.
- define timescales.

For myself I would suggest the following criteria and hope that others would be added:

- the DTDXML DTD should be algorithmically derivable from the XML 1.0 spec
- a DTDXML for a given DTD should represent the DTD after normalisation (i.e. no support for PEs and other lexical operations). It should correspond to information potentially available after parsing and should correspond roughly to the goals of SAX (i.e. be simple and not try to represent everything in a DTD)
- the DTDXML DTD must support Help and documentation. Individual attributes should be documentable.
- a DTDXML should be internally addressable by Xlink (unlike DTDs). The granularity should allow addressing of attributes and elements.
- the DTDXML should NOT devise methods for extending the range of semantic validation.
- from a DTDXML it should be possible to recreate the corresponding (normalised) DTD without loss.

I do not know whether Simon can take on the central role himself or whether volunteers are needed. If he were to devise some procedural questions we could get a feel of whether and how the process could proceed further.

P.
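The criteria above (documentable elements and attributes, addressable granularity, and lossless recreation of the normalised DTD) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the thread had not agreed on a vocabulary, so the names `DTD`, `ElementType`, `Attribute` and `Doc`, the `id` scheme, and the sample `term` element are all hypothetical and not part of any proposal.

```python
import xml.etree.ElementTree as ET

# A tiny in-memory model of a *normalised* DTD (parameter entities already
# expanded). The sample element and its attributes are invented.
DTD_MODEL = [
    {
        "name": "term",
        "content": "(#PCDATA)",
        "doc": "A single glossary term.",
        "attributes": [
            {"name": "id", "type": "ID", "default": "#REQUIRED",
             "doc": "Unique identifier for the term."},
            {"name": "lang", "type": "CDATA", "default": "#IMPLIED",
             "doc": ""},
        ],
    },
]


def model_to_xml(model):
    """Express the DTD model as an XML tree (hypothetical vocabulary)."""
    root = ET.Element("DTD")
    for el in model:
        # The 'id' values give each declaration an addressable handle.
        e = ET.SubElement(root, "ElementType", {
            "id": el["name"],
            "name": el["name"],
            "content": el["content"],
        })
        if el["doc"]:
            ET.SubElement(e, "Doc").text = el["doc"]
        for att in el["attributes"]:
            a = ET.SubElement(e, "Attribute", {
                "id": el["name"] + "." + att["name"],
                "name": att["name"],
                "type": att["type"],
                "default": att["default"],
            })
            if att["doc"]:
                # Documentation is attached to exactly one attribute.
                ET.SubElement(a, "Doc").text = att["doc"]
    return root


def xml_to_dtd(root):
    """Recreate DTD declarations from the XML form (Doc elements are skipped)."""
    lines = []
    for e in root.findall("ElementType"):
        lines.append("<!ELEMENT %s %s>" % (e.get("name"), e.get("content")))
        for a in e.findall("Attribute"):
            lines.append("<!ATTLIST %s %s %s %s>" % (
                e.get("name"), a.get("name"), a.get("type"), a.get("default")))
    return "\n".join(lines)


if __name__ == "__main__":
    xml_text = ET.tostring(model_to_xml(DTD_MODEL), encoding="unicode")
    print(xml_text)
    # Round trip: serialise, re-parse, and regenerate the declarations.
    print(xml_to_dtd(ET.fromstring(xml_text)))
```

The sketch covers only element and attribute-list declarations. Entities, notations and mixed-content details would need further agreed element types.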
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Sun May 24 00:40:05 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:01:30 2004
Subject: SGML/XML 98 Paris
Message-ID: <3.0.1.16.19980523233912.1b9738a2@pop3.demon.co.uk>

A fairy godmother made it possible for me to attend the latter half of the Paris meeting and this is a brief report for other XML-DEVers who weren't so fortunate. I only get about one chance a year to meet real-life XML people - a year ago in Barcelona, and occasionally some visiting London. This report is not comprehensive - I missed anything before Wed a.m.

The conf was wound up by Tim Bray who asked 'What is a document?' This is important as Tim now categorises himself - and most of us - as indulging in 'document computing'. Tim threw up slides of objects that may or may not be documents - music, a book with no words, a signed baseball, etc. The essential message was that in working with XML we are working with the material of human culture - in many forms - and that we can both enjoy it and have a responsibility. The responsibility is the trust that is put in us to manage information for the benefit of humankind. There is no doubt that 'XML has arrived' and is here to stay.

Unfortunately I missed Jon Bosak's plenary (and I missed Jon).
His plenary was highly praised and - I believe - again stressed the responsibility that we have to make XML work by always bearing in mind that we are part of a greater community. In a later session Jon outlined the next stages of the XML process. I dare not attempt my transcription here as Jon chooses his phrases very carefully and with great meaning. He explained how the namespace proposal had come from the requirement of many W3-members and the narrow path that the WG had to tread. It had been essential to be simple at the outset to avoid committing to something that later might not be found to be workable. He stressed the need for the different W3 groups to work together (e.g. he did not feel a separate - and therefore potentially isolated - group for XML-data would be a good idea.)

For many people the joint plenaries on Wednesday (Jean Paoli, Microsoft, and Bernard Feinmann, Netscape) were the key piece of take-home evangelism. Both represented their companies as committed wholeheartedly to XML. JP summarised some of the themes:

- 'Data should be free' (Charles Goldfarb)
- 'Give the user [power]' (Jon Bosak) [I forget the exact words]

and emphasized the critical power of text-lovers and free speech on the WWW. He summarised the many initiatives of vendors and others as showing that XML had really arrived (including CML :-). He talked about XML and HTML interoperating (see a URL at w3.org/TR/NOTE-xh-19980511.html if I got it right) and how