From tbray at textuality.com Sun Nov 1 03:21:14 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:08 2004
Subject: CDATA by any other name... (was The raw and the cooked)
Message-ID: <3.0.32.19981031192043.00b1b200@pop.intergate.bc.ca>

At 10:36 AM 10/30/98 -0500, david@megginson.com wrote:
>So, Henry's asking whether this is valid:
>
>  [example document not preserved by the archive; only "]>" survives]
>
>I'd like to hear Tim Bray's opinion, unless I've missed it already in
>this thread (are you reading this, Tim, or alternatively, do you have
>an e-mail filter that looks for your name?).

Yes and no, respectively. I've been lurking, hoping that someone would post something definitive.

The more I think about it, the more I think it's valid, because white space between child elements is OK, and the fact that the white space is in a CDATA section doesn't mean it's not white space. Chris Lovett argued that it would be OK if the white space were in an entity reference, which I think is a strongly linked problem (although I couldn't follow Chris' reasoning about why MSXML thinks the CDATA section is invalid). Larval agrees with me, by the way, because the CDATA recognizer does its work first and the validator only ever sees white space.

However, the rule that applies is section 3., validity constraint "Element Valid", list item 2, which I quote:

  2. The declaration matches children and the sequence of child elements belongs to the language generated by the regular expression in the content model, with optional white space (characters matching the nonterminal S) between each pair of child elements.

Of course, the interpolation "(characters matching the nonterminal S)" could lead a pedant to claim that "

Message-ID: <363BDAB1.26A13201@eng.sun.com>

As another data point -- Sun's validating parser accepts Henry's original example, no problems. (And it does so very quickly, but you knew that!
;-)

A pragmatic answer "why": it uses the data model implied by SAX, which treats characters "quoted" by "<![CDATA[...]]>" like any other characters (but without using '&' and '<' as markup delimiters).

I think that's the right model. It's clear from 2.7 that the text inside a CDATA section is character data, not markup; the example is clear, even if the text could be misunderstood.

Since 2.4 makes clear (sentence 1!) that the _only_ two sorts of stuff in XML are "character data" and "markup", there's no way I could justify treating space inside a CDATA section differently from other characters (in terms of data model).

Hence it's not possible to distinguish whitespace characters that are the content of a CDATA section from the same text that's outside of a CDATA section.

- Dave

p.s. Yes, if there's confusion, the spec probably needs to be clarified. Not a crime in any 1.0 spec.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From db at Eng.Sun.COM Sun Nov 1 04:15:17 1998
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:06:08 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <000001be0498$180e4100$b3e887cb@NT.JELLIFFE.COM.AU>
Message-ID: <363BDF62.3B59998D@eng.sun.com>

Rick Jelliffe wrote:
>
> marked sections actually mark up
> notations: at ISO there has been discussion of whether to allow something
> like (for example)
>

While I applaud the ongoing proliferation of real Java(tm), I admit I don't like that either ... has worked just as well, and does no damage to XML. (Not as pretty though!)
> This is not something that I would expect to make its way into XML (and I
> think the ISO people are now more keen to help XML/WebSGML than on tidying
> up SGML) but I think the idea that a marked section

... but XML has only "CDATA" sections. There's no such thing as a "marked" section, and "CDATA" is specified to be character data terminated by a "]]>" sequence. No notion of marking/labeling.

> not only alters
> delimiter recognition but also labels the data can be seen (in embryo or
> residually) in DOM's elevation of CDATAsection to node-worthiness, which has
> so perplexed Henry.

Keep in mind that DOM implementations are not required to import an XML document using CDATASection nodes ... Sun's just uses it to determine _how to write out_ the text, if someone adds such a node to a DOM tree they've constructed. The "<" and "&" markup delimiters don't get quoted like they must be for normal text, and "]]>" gets funkified differently.

> I think the answer is clear from the spec:
> [43] content ::= (element | CharData | Reference | CDSect | PI | Comment)*
> so a CDSect is not CharData. Therefore a CDSect is only valid in mixed
> content, even though it is well-formed to have it in element content.

I can't buy that conclusion. Among other things, that production has no constraints relating to "mixed content". Is this an argument that cosmetic whitespace, comments, and PIs likewise must exist only inside mixed content?

- Dave

From ricko at allette.com.au Sun Nov 1 06:15:14 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:06:08 2004
Subject: CDATA by any other name...
 (was The raw and the cooked)
In-Reply-To: <363BDAB1.26A13201@eng.sun.com>
Message-ID: <000001be055f$2a480eb0$abe887cb@NT.JELLIFFE.COM.AU>

> From: David Brownell

> A pragmatic answer "why": it uses the data model implied by
> SAX, which treats characters "quoted" by "<![CDATA[...]]>"
> like any other characters (but without using '&' and '<' as
> markup delimiters).

Aha! I think this is the big difference in approach. The Davids are saying that CDATAsects are tags which switch in and out an effect, while I am saying that the CDATAsect is markup which delimits a range and labels it.

Personally, I hope CDATAsects are removed from the mooted XML profile (does it have a code name? EZX?); I have never thought they were a particularly good idea. But I guess they won't be, in that it would make it possible to generate EZX documents which were not WF XML documents.

[I'd guess EZX would remove DTDs (no entities!), make UTF-8 the only charset, allow but deprecate PIs (in particular before and after fragments or root-elements), allow but deprecate CDATAsects, build in the ISO public entity sets with HTMLsymbols, and build in namespaces. I suppose this would then become the syntax used for HTML 5, or whatever HTML+XML is called. That would be a nice little language.]

Rick Jelliffe

From lauren at sqwest.bc.ca Sun Nov 1 16:58:56 1998
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:06:08 2004
Subject: CDATA by any other name...
 (was The raw and the cooked)
In-Reply-To: <000001be0498$180e4100$b3e887cb@NT.JELLIFFE.COM.AU>
References:
Message-ID: <199811011654.IAA02758@sqwest.bc.ca>

On 31 Oct 98, at 17:31, Rick Jelliffe wrote:

> Henry Thompson wrote:
>
> > The DOM made a serious mistake here in my opinion: it's
> > stranded in no-person's-land between raw and cooked, without being
> > either. It's not cooked, because it gives you EntityReference and
> > CDATA nodes. It's not raw, because it DOESN'T give you character
> > entity references.
>
> CHARACTER REFERENCES
> I think Henry means "numeric character reference", and this is the heart
> of the matter. A numeric character is not an entity, any more than a
> directly-entered character is. It is just an alternative encoding of the
> character, and should be of no more interest to a general API than the
> charset encoding of the document was. (I am putting words into his mouth:
> or does Henry mean the [XMLs4.6] predefined entities?)

This is the reason that the DOM doesn't give you access to numeric character references. It's perfectly acceptable for the application to give access if it's necessary for that application, but the DOM WG, after a *lot* of discussion, decided that the alternative encodings of a document were not up to the DOM to decide.

As for CDATA sections and the DOM - we decided that the DOM could not, in and of itself, decide whether the CDATA section was purely an escaping mechanism that the application (such as an editor) could use or not as it chose or whether the CDATA section had deeper significance. Making CDATA sections nodes means that the application can choose which is true. If the CDATA section is simply an escaping mechanism, then the data can be transformed before being passed to the DOM, in which case the DOM will never see a CDATA section. Should the CDATA section have some other significance, the parser can leave it as a CDATA section and pass it to the DOM, which will respect it.
Lauren

From papresco at technologist.com Mon Nov 2 00:10:58 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:06:08 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <199811011654.IAA02758@sqwest.bc.ca>
Message-ID: <363CF738.C389AB26@technologist.com>

Lauren Wood wrote:
>
> If the CDATA section
> is simply an escaping mechanism, then the data can be
> transformed before being passed to the DOM, in which case the
> DOM will never see a CDATA section. Should the CDATA section
> have some other significance, the parser can leave it as a CDATA
> section and pass it to the DOM, which will respect it.

Is there a standard way for the DOM client software to say whether it wants access to CDATA sections or not?

--
Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"I don't want you to describe to me -- not ever -- what you were doing to that poor boy to make him sound like that; but if you ever do it again, please cover his mouth with your hand," Grandmother said.
-- John Irving, "A Prayer for Owen Meany"
From david at megginson.com Mon Nov 2 00:48:58 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:08 2004
Subject: Optional Nodes in the DOM
In-Reply-To: <363CF738.C389AB26@technologist.com>
References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com>
Message-ID: <13884.65089.320319.709774@localhost.localdomain>

Paul Prescod writes:

> Is there a standard way for the DOM client software to say whether it
> wants access to CDATA sections or not?

I don't think so -- I'd imagine that that would have to be an option to the DOM builder, which is left unspecified. I could imagine something like this:

  import java.io.IOException;
  import org.w3c.dom.Document;

  public abstract class DOMFactory {
    public final static int NONE = 0;

    // Optional node types. Each is a distinct bit so that the values
    // can be OR-ed together (CDATA is 4, not 3, which would equal
    // COMMENTS | ENTITYREFS).
    public final static int COMMENTS = 1;
    public final static int ENTITYREFS = 2;
    public final static int CDATA = 4;
    public final static int ALL = COMMENTS | ENTITYREFS | CDATA;

    // DOMFactoryException is part of this imagined API.
    public abstract Document createDocument (int flags)
      throws DOMFactoryException, IOException;
    // etc.
  }

and then

  Document doc = factory.createDocument(DOMFactory.NONE);

I can also imagine many DOM builders that always leave these node types out, since that information is not needed for most XML applications (authoring and repository tools excepted).

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
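
As a hedged illustration of the kind of builder option imagined above: JAXP, a Java API that postdates this thread, offers comparable switches on `DocumentBuilderFactory` (`setCoalescing`, `setIgnoringComments`, `setExpandEntityReferences`). The sketch below is not the DOMFactory proposal itself; the class name `CoalescingDemo` and its helper method are made up for the example. It only shows that one parser setting decides whether a CDATA section reaches the client as a CDATASection node or as plain Text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; only the javax.xml / org.w3c.dom calls are real APIs.
class CoalescingDemo {

    /** Parses xml and returns the DOM node type of the root element's first child. */
    static short firstChildType(String xml, boolean coalescing) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // true: CDATA sections are converted to Text and merged with neighbours
            factory.setCoalescing(coalescing);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getFirstChild().getNodeType();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><![CDATA[x < y]]></a>";
        // Compare against Node.CDATA_SECTION_NODE and Node.TEXT_NODE
        System.out.println("not coalescing: node type " + firstChildType(xml, false));
        System.out.println("coalescing:     node type " + firstChildType(xml, true));
    }
}
```

The design point is the same as in the message: the choice is made when the tree is built, so client code that never asks for CDATA nodes does not have to special-case them.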
From tbray at textuality.com Mon Nov 2 01:07:27 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:08 2004
Subject: Optional Nodes in the DOM
Message-ID: <3.0.32.19981101170427.00b28950@pop.intergate.bc.ca>

At 07:47 PM 11/1/98 -0500, david@megginson.com wrote:
>I can also imagine many DOM builders that always leave these node
>types out, since that information is not needed for most XML
>applications (authoring and repository tools excepted).

Well, authoring anyhow. -Tim

From donpark at quake.net Mon Nov 2 02:17:01 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:06:08 2004
Subject: ANN: XML-APP Mailing List
Message-ID: <007c01be0606$b0af9520$2ee044c6@arcot-main>

I would like to announce the opening of the XML-APP mailing list.

WHAT: XML-APP is a mailing list specifically for those interested in applying the XML technology to real world applications. It is not a place to discuss general XML issues nor is it a place for the naive.

WHY: I have felt that the quality of messages on the XML-DEV mailing list was too high and too esoteric to encourage sharing of information among those who are experienced enough to see the value of XML at first glance yet don't give a hoot about things like architectural forms.
In other words, it is difficult for carpenters to talk shop while architects are about.

HOW: You can subscribe by sending a blank message to:

  mailto:xml-app-subscribe@sunsite.auc.dk

The mailing list itself is at:

  mailto:xml-app@sunsite.auc.dk

BUT: It is my opinion that XML-DEV is the center of all XML activities. I have been and will continue to be an active member of the XML-DEV community. XML-APP should be considered a subgroup of the XML-DEV community and never as a competing mailing list.

FYI: XML-APP is being hosted by SunSITE Denmark. List owners are Don Park of Docuverse and Jørgen Nielsen of SunSITE Denmark.

Best,

Don Park
Docuverse

From lauren at sqwest.bc.ca Mon Nov 2 02:17:38 1998
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:06:08 2004
Subject: CDATA by any other name... (was The raw and the cooked)
In-Reply-To: <363CF738.C389AB26@technologist.com>
Message-ID: <199811020212.SAA03758@sqwest.bc.ca>

On 1 Nov 98, at 18:05, Paul Prescod wrote:

> Lauren Wood wrote:
> >
> > If the CDATA section
> > is simply an escaping mechanism, then the data can be
> > transformed before being passed to the DOM, in which case the
> > DOM will never see a CDATA section. Should the CDATA section
> > have some other significance, the parser can leave it as a CDATA
> > section and pass it to the DOM, which will respect it.
>
> Is there a standard way for the DOM client software to say whether it
> wants access to CDATA sections or not?

No. If people think it would be useful, we could potentially add some sort of "turn CDATA section into Text" method (and/or vice versa) in Level 2.
Then the DOM client could run that before accessing the data. I'm not sure how else we could determine access without losing information (I assume you don't mean that information in the CDATA sections should be invisible to the client application).

Lauren

From jborden at mediaone.net Mon Nov 2 02:41:48 1998
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:06:09 2004
Subject: CDATA by any other name... (was The raw and the cooked)
In-Reply-To: <199811020212.SAA03758@sqwest.bc.ca>
Message-ID: <001c01be060a$35c316d0$d3228018@jabr.ne.mediaone.net>

Lauren Wood wrote:
>
> No. If people think it would be useful, we could potentially add
> some sort of "turn CDATA section into Text" method (and/or vice
> versa) in Level 2. Then the DOM client could run that before
> accessing the data. I'm not sure how else we could determine
> access without losing information (I assume you don't mean that
> information in the CDATA sections should be invisible to the client
> application).
>

Alternatively, you can expose both CDATA as an element of type CDATA and as text within the TEXT element. This would preserve the intended behavior w.r.t. text as well as allowing the option of iterating over CDATA elements for interested parties.

Jonathan Borden
JABR Technology
http://jabr.ne.mediaone.net
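
The "turn CDATA section into Text" idea mentioned above can be approximated by client code using only standard DOM calls. What follows is a minimal sketch under that assumption, not a method defined by any DOM level; the class name `CDataToText` and its two helpers are invented for the example, and the document is parsed with JAXP purely to have a tree to work on.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; only the DOM and JAXP calls are standard.
class CDataToText {

    /** Recursively replaces every CDATASection below node with an equivalent Text node. */
    static void cdataToText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before replacing
            if (child.getNodeType() == Node.CDATA_SECTION_NODE) {
                Document owner = node.getOwnerDocument();
                node.replaceChild(owner.createTextNode(child.getNodeValue()), child);
            } else {
                cdataToText(child);
            }
            child = next;
        }
    }

    /** Parses a string into a DOM tree, keeping CDATA sections as separate nodes. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e); // keep the sketch free of checked exceptions
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a>one<![CDATA[ <two> ]]>three</a>");
        cdataToText(doc);
        doc.getDocumentElement().normalize(); // merge the now-adjacent Text nodes
        System.out.println(doc.getDocumentElement().getFirstChild().getNodeValue());
    }
}
```

No character data is lost by the conversion; what disappears is only the record that some of it was written inside a CDATA section, which is exactly the information an editor might want to keep.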
From lauren at sqwest.bc.ca Mon Nov 2 04:36:02 1998
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:06:09 2004
Subject: CDATA by any other name... (was The raw and the cooked)
In-Reply-To: <001c01be060a$35c316d0$d3228018@jabr.ne.mediaone.net>
References: <199811020212.SAA03758@sqwest.bc.ca>
Message-ID: <199811020431.UAA03914@sqwest.bc.ca>

On 1 Nov 98, at 21:40, Borden, Jonathan wrote:

> Alternatively, you can expose both CDATA as an element of type CDATA and
> as
> text within the TEXT element. This would preserve the intended behavior
> w.r.t. text as well as allowing the option of iterating over CDATA
> elements for interested parties.

Both CDATASection and Text inherit from CharacterData. Is this what you mean? Alternatively, you could use the flattening properties from Node and just look for nodeValue on each. This returns the content of each node without the need for casting.

Lauren

From rbourret at ito.tu-darmstadt.de Mon Nov 2 09:09:28 1998
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:06:09 2004
Subject: XSchema 1.0 released
Message-ID: <01BE0648.3348E5E0@grappa.ito.tu-darmstadt.de>

XSchema 1.0 is now final. Thanks to everyone on XML-Dev who helped make it a reality, and especially to Simon St.
Laurent for getting the ball rolling.

You can find the general XSchema page at:

  http://purl.oclc.org/NET/xschema

and the final spec at:

  http://www.simonstl.com/xschema/spec/xscspecv5.htm

-- Ron Bourret

From aheitor at ef.pt Mon Nov 2 14:47:31 1998
From: aheitor at ef.pt (Ana Heitor)
Date: Mon Jun 7 17:06:09 2004
Subject: xsl... tables... colspan...
Message-ID:

Hi,

Can someone tell me how to write XSL rules for:

1) having two tables;
2) one table with a variable number of columns.

For example: I want to compute the colspan dynamically as a function of the number of columns.

Thanks

AH

From simpson at polaris.net Mon Nov 2 15:04:57 1998
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:06:09 2004
Subject: xsl... tables... colspan...
Message-ID: <3.0.32.19981102100504.006a6a2c@polaris.net>

Ana --

At 02:43 PM 11/2/98 +0000, Ana Heitor wrote:
> [Question about XSL rules]

You might find a quicker answer to your question on the XSL mailing list. Information on joining it, and the archives, are at:

  http://www.mulberrytech.com/xsl/xsl-list

Good luck!

=============================================================
John E. Simpson          | It's no disgrace t'be poor,
simpson@polaris.net      | but it might as well be.
                         |            -- "Kin" Hubbard

From cowan at locke.ccil.org Mon Nov 2 15:28:47 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:09 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com>
Message-ID: <363DCFB4.36A59E21@locke.ccil.org>

Paul Prescod wrote:

> Is there a standard way for the DOM client software to say whether it
> wants access to CDATA sections or not?

No. In fact, the DOM level 1 does not define any API for the creator, only for the accessor. You can add elements and other things, but you can't create or populate a DOM using only the standard API.

DOM Level 1 is an ugly mess, and the only justification for it is to keep Netscape and Microsoft from implementing even uglier incompatible DOMs.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Mon Nov 2 15:32:00 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:09 2004
Subject: CDATA by any other name...
 (was The raw and the cooked)
References: <001c01be060a$35c316d0$d3228018@jabr.ne.mediaone.net>
Message-ID: <363DD08C.1A11CAE8@locke.ccil.org>

Borden, Jonathan wrote:

> Alternatively, you can expose both CDATA as an element of type CDATA and as
> text within the TEXT element. This would preserve the intended behavior
> w.r.t. text as well as allowing the option of iterating over CDATA elements
> for interested parties.

Actually, the only thing you can do with a Text node that you can't do with a CDATA node is merge it with an adjacent Text node, a very minor capability which can easily be simulated.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From bckman at ix.netcom.com Mon Nov 2 16:10:58 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:06:09 2004
Subject: DTD,s
Message-ID: <008401be067b$4db05b60$3bacdccf@ix.netcom.com>

Does anyone know of a site where XML DTDs are available for general use?

If not:
1. Would there be a need for such a site?
2. Would anyone be prepared to donate some DTDs to such a site?

I have several XML DTDs, including XML DTDs for HTML strict and transitional, that I could make available.

regards,
Frank

Frank Boumphrey
XML and style sheet info at Http://www.hypermedic.com/style/index.htm
Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
CoAuthor: Professional XML applications from Wrox Press, www.wrox.com
From richard at cogsci.ed.ac.uk Mon Nov 2 16:28:51 1998
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:06:09 2004
Subject: CDATA by any other name... (was The raw and the cooked)
Message-ID: <199811021628.QAA09805@cogsci.ed.ac.uk>

One reason to regard a CDATA section as equivalent to the characters in it is that it is perfectly reasonable for a processor to transform a CDATA section into plain character data with character entities where required. A processor that outputs canonical XML will do this.

Suppose we regard <![CDATA[ ]]> as invalid in element-only content. If the processor is non-validating, it will not check this, and will produce valid output from invalid input. You might for example run a document through such a processor merely to change its character encoding. It would be unfortunate if this process changed the document's validity.

Slightly less plausibly, a processor might decide to output all character data as CDATA sections to avoid using character entities. This process could make a valid document invalid.

-- Richard
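
The transformation described above, replacing a CDATA section by ordinary character data with the markup characters escaped, can be sketched in a few lines. This is an illustration only, not how any particular processor does it: the class `CDataRewriter` and its methods are invented for the example, and it works on raw text with a regular expression, so it would also rewrite something that merely looks like a CDATA section inside a comment or processing instruction. A real processor would work from the parsed document.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names are illustrative, not from any XML library.
class CDataRewriter {

    // Matches one CDATA section and captures its content (non-greedy, across lines)
    private static final Pattern CDATA =
            Pattern.compile("<!\\[CDATA\\[(.*?)\\]\\]>", Pattern.DOTALL);

    /** Escapes the characters that would otherwise be read as markup. */
    static String escape(String text) {
        // '&' must be handled first so the escapes themselves are not re-escaped
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Replaces every CDATA section in xml by equivalent escaped character data. */
    static String rewrite(String xml) {
        Matcher m = CDATA.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(escape(m.group(1))));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite("<a><![CDATA[x < y && y > z]]></a>"));
        // A CDATA section holding only white space, between two child elements
        System.out.println(rewrite("<a><b/><![CDATA[ ]]><b/></a>"));
    }
}
```

In the second call the output is simply the two child elements separated by a space, which is the case the thread is arguing about: after rewriting, nothing distinguishes that space from white space typed directly between the elements.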
From jtauber at jtauber.com Mon Nov 2 16:46:47 1998
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:06:09 2004
Subject: DTD,s
Message-ID: <006001be067f$f4ec6960$0300000a@othniel.cygnus.uwa.edu.au>

-----Original Message-----
From: Frank Boumphrey

>Does any one know of a site where XML dtd's are available for general use?

That is what schema.net is for. At present it is a catalogue but will soon house DTDs. (Actually it already has an increasing number of entities, thanks to Rick Jelliffe)

It also has an SGML Open catalog that uses my delegate idea to allow resolution of formal public identifiers. The XBEL DTD developed by the Python XML-SIG has already made use of this.

>I have several xml dtd's including xml dtd's for html strict and
>transitional that I could make available.

Send them my way and I'll add them to schema.net.

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net

From db at Eng.Sun.COM Mon Nov 2 17:25:53 1998
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:06:09 2004
Subject: CDATA by any other name...
 (was The raw and the cooked)
References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com> <363DCFB4.36A59E21@locke.ccil.org>
Message-ID: <363DE9F3.2EDF9979@eng.sun.com>

John Cowan wrote:
>
> DOM Level 1 is an ugly mess, and the only justification for it
> is to keep Netscape and Microsoft from implementing even uglier
> incompatible DOMs.

I think everyone recognizes some of the compromises that went into DOM, and has a list of some mistakes they'd fix. But I don't think there's a good consensus on which things are mistakes rather than features ...

To put it differently: is there really room for another API to represent XML structure? I tend to think that DOM, warts and all, is "good enough" for most purposes. And for those other purposes, I suspect that no standard API could suit.

- Dave

From elharo at sunsite.unc.edu Mon Nov 2 17:43:17 1998
From: elharo at sunsite.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:06:09 2004
Subject: Web pages in non-Roman scripts
In-Reply-To:
Message-ID:

For my next book about XML I am seeking examples of Web pages in non-Roman scripts: Cyrillic, Greek, Chinese, Japanese, etc. The purpose is to include before and after screen shots showing them with and without the proper fonts and encodings.

If you maintain such a web site, and you're willing to sign a permissions form dreamed up by IDG's lawyers, please email me the URL, your snail mail address and FAX number and I'll send you the permissions letter to sign.
To thank you for your trouble, I'll also send you a copy of my current book--XML: Extensible Markup Language--when I get the signed permission agreement back. Finally, if your site is included in the finished book (sites will also have to be approved by my editors) I'll also send you a copy of the next book when it's published.

Pretty much any site in a non-Roman script will do. However, I do have a preference for interesting pages like one discussing China's human rights record in Chinese or the text of War and Peace in Russian, as opposed to corporate home pages. But I'll take whatever I can get.

If you're interested, please send private email to elharo@sunsite.unc.edu. Thanks.

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@sunsite.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML: Extensible Markup Language (IDG Books 1998)          |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/  |
|   Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/    |
+----------------------------------+---------------------------------+
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eddie.sheffield at enterworks.com Mon Nov 2 19:39:47 1998 From: eddie.sheffield at enterworks.com (Eddie Sheffield) Date: Mon Jun 7 17:06:09 2004 Subject: DTD,s References: <006001be067f$f4ec6960$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <363E08D0.C0390CDD@enterworks.com> James Tauber wrote: > -----Original Message----- > From: Frank Boumphrey > > >Does any one know of a site where XML dtd's are available for general use? > > That is what schema.net is for. At present it is a catalogue but will soon > house DTDs. > (Actually it already has an increasing number of entities, thanks to Rick > Jelliffe) There is also the CommerceNet XML Exchange at http://www.xmlx.com that claims to have the same purpose, but I've been checking on them for several months now and there is absolutely no action there. Lack of promotion, I guess. But it is organized around forums (such as Automotive, History, Genealogy, Workflow, etc.), apparently with the notion that people would come together, post DTDs, and actively develop them online. But except for September archives which only contain the forum welcome messages, there is nothing. Does anyone know of anywhere else where one can go to discuss the development of DTDs? XML Exchange sounded perfect, but there are not really any appropriate catagories for my ideas, and there was no response to my request to create a new "Household" or "Consumer" forum. I have several home user ideas I'd like to see fleshed out - recipes, grocery lists, collections (stamps, comics, etc.), TV listings, etc. but I don't really have the experience in DTD design to tackle it. 
Maybe I need to tackle David's (Megginson) book again. ;-) BTW, did anyone ever give any references for "Z" from the recent CDATA thread? My curiosity is perked, but I have a feeling searching for "Z" on Hotbot or Yahoo or wherever would prove frustrating! Thanks. Eddie Sheffield xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From hussain at granularity.com Mon Nov 2 20:02:08 1998 From: hussain at granularity.com (G. Hussain Chinoy) Date: Mon Jun 7 17:06:10 2004 Subject: DTD,s In-Reply-To: <008401be067b$4db05b60$3bacdccf@ix.netcom.com> Message-ID: Are people aware of this site? A repository of public sgml/xml texts http://www.ucc.ie/cgi-bin/PUBLIC which is referenced on... The SGML/XML Web Page (robin cover): XML/SGML Name Registration http://www.oasis-open.org/cover/xml.html#xmlNameRegistry and related to.. The GCA's public identifier registration process http://www.gca.org/publicid/ ----------------------------------------- G. Hussain Chinoy hussain@granularity.com Chief Information Architect, CEO Granularity Information Architecture, Inc. http://www.granularity.com/ On Mon, 2 Nov 1998, Frank Boumphrey wrote: > Does any one know of a site where XML dtd's are available for general use? > > If not > 1.Would there be a need for such a site. > 2.Would anyone be prepared to donate some dtd's to such a site. > > I have several xml dtd's including xml dtd's for html strict and > transitional that I could make available. 
> > regards, > Frank > Frank Boumphrey > > XML and style sheet info at Http://www.hypermedic.com/style/index.htm > Author: - Professional Style Sheets for HTML and XML http://www.wrox.com > CoAuthor: Professional XML applications form Wrox Press, www.wrox.com > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DKACKMAN at agchem.com Mon Nov 2 21:11:08 1998 From: DKACKMAN at agchem.com (Don Kackman) Date: Mon Jun 7 17:06:10 2004 Subject: Retrieving attributes from an internal entity Message-ID: <8424EFA3C1F7D1118B7300A0C9C57C192084D4@mpl_nt9.agchem.com> Hello, I'm using Microsoft's XML parser that comes as part of IE 5 beta 1 as a component of an application that will use XML as its document format. Since IE5 is still a beta I'm having some trouble determining if certain behaviors are bugs in the current version of their parser or correctly reflect the W3C specification. Namely I'm using an internal entity declaration as follows: OM"> as part of the internal part of the DTD. I can load the document into MSXML (thier parser) and traverse the node tree. When I get to the node where I am refering to the &om; entity I get OM back as the value of that node but I cannot retrieve the targetset attribute. 
It is my understanding that internal entities should be parsed in place when they are referred to, which should mean that I can treat that node as I would any other. This does not seem to be the case with the MS parser. Is this a limitation of the MS beta parser or am I misunderstanding how entities are used in XML? Thank you, Don Kackman dkackman@agchem.com From tbray at textuality.com Mon Nov 2 21:17:52 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:10 2004 Subject: Retrieving attributes from an internal entity Message-ID: <3.0.32.19981102131628.00aef4a0@pop.intergate.bc.ca> At 03:10 PM 11/2/98 -0600, Don Kackman wrote: >OM"> > >I can load the document into MSXML (their parser) and traverse the node >tree. When I get to the node where I am referring to the &om; entity I >get OM back as the value of that node but I cannot retrieve the >targetset attribute. > >Is this a limitation of the MS beta parser or am I misunderstanding how >entities are used in XML? You're fine, you should be able to retrieve that attribute. You should report this back to Microsoft ASAP, I'm sure they'll fix it. -T. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From DKACKMAN at agchem.com Mon Nov 2 21:37:28 1998 From: DKACKMAN at agchem.com (Don Kackman) Date: Mon Jun 7 17:06:10 2004 Subject: Retrieving attributes from an internal entity Message-ID: <8424EFA3C1F7D1118B7300A0C9C57C192084D6@mpl_nt9.agchem.com> Thanks for the quick reply Tim. How about this one... I'm declaring the following entity: "> When I try to load this with MSXML I get this error: A name was started with an invalid character. Line 0000003: ...getset='om'>OM"> Pos 0000070: ...------------------------------------------^ If I change % to any other character it loads fine. Can I use the % symbol in an entity declaration? I suspect it thinks I'm trying to insert a parameter entity. Thanks again, Don -----Original Message----- From: Tim Bray [mailto:tbray@textuality.com] Sent: Monday, November 02, 1998 3:18 PM To: Don Kackman; 'XML Dev' Subject: Re: Retrieving attributes from an internal entity At 03:10 PM 11/2/98 -0600, Don Kackman wrote: >OM"> > >I can load the document into MSXML (thier parser) and traverse the node >tree. When I get to the node where I am refering to the &om; entity I >get OM back as the value of that node but I cannot retrieve the >targetset attribute. > >Is this a limitation of the MS beta parser or am I misunderstanding how >entities are used in XML? You're fine, you should be able to retrieve that attribute. You should report this back to Microsoft ASAP, I'm sure they'll fix it. -T. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Philippe.Le_Hegaret at sophia.inria.fr Tue Nov 3 01:23:28 1998 From: Philippe.Le_Hegaret at sophia.inria.fr (Philippe Le Hégaret) Date: Mon Jun 7 17:06:10 2004 Subject: ANN: KOML 1.1 released Message-ID: <363E5B04.FA5AF290@sophia.inria.fr> KOML is an XML application to serialize Java Objects in an XML document. This application is called KOML for Koala Object Markup Language. This new version includes bug fix and a minor change in the language. It is backward compatible with the version 1.0 . values (except transient) have a name attribute. (thanks to Robert Nielsen for his feedback) Bug fix with Class objects. Bug fix in close() methods. (thanks to Raj) Remove File constructors. Now you have: KOMLSerializer(Writer out, boolean buffered) KOMLDeserializer(Reader out, boolean buffered) A new KOML document : -1 -1 /koala www.inria.fr http Regards, Philippe. --------- Philippe Le Hegaret Philippe.Le_Hegaret@sophia.inria.fr -- http://www.inria.fr/koala/plh/ KOALA/DYADE/BULL @ INRIA - Sophia Antipolis xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Tue Nov 3 01:34:23 1998 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:06:10 2004 Subject: Z references (was Re: DTD,s) Message-ID: <00fb01be06c9$b0228cc0$0300000a@othniel.cygnus.uwa.edu.au> -----Original Message----- From: Eddie Sheffield >BTW, did anyone ever give any references for "Z" from the recent CDATA thread? My >curiosity is perked, but I have a feeling searching for "Z" on Hotbot or Yahoo or >wherever would prove frustrating! Yahoo was easy. Try: http://dir.yahoo.com/Computers_and_Internet/Programming_Languages/Z/ which leads to an excellent site: http://www.comlab.ox.ac.uk/archive/z.html James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jieli at cs.umbc.edu Tue Nov 3 02:23:26 1998 From: jieli at cs.umbc.edu (Li Jiefeng) Date: Mon Jun 7 17:06:10 2004 Subject: How to call JS function in .xsl file? Message-ID: Hello, I am wondering how to call a JavaScript function in .xsl file. For instance, ... I tried abc(x); abc(x) "=abc(x)" but all failed. Thx for your help. 
Jiefeng --------------------------------------------------------------------- Jiefeng Li, CSEE, UMBC (410)455-2837(L), (410)455-3094(O), (410)242-9610(H) http://www.cs.umbc.edu/~jieli From ricko at allette.com.au Tue Nov 3 08:02:23 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:06:10 2004 Subject: Web pages in non-Roman scripts In-Reply-To: Message-ID: <000b01be0700$7d00c730$11e887cb@NT.JELLIFFE.COM.AU> > From: Elliotte Rusty Harold > Pretty much any site in a non-Roman script will do. However, I do have a > preference for interesting pages like one discussing China's human rights > record in Chinese or the text of War and Peace in Russian, as opposed to > corporate home pages. Not much Chinese XML here in Taiwan yet, because of technology lag. There are some interesting projects in the pipes though. I don't know about other Chinese countries. Rick Jelliffe From M.H.Kay at eng.icl.co.uk Tue Nov 3 10:39:46 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:06:10 2004 Subject: CDATA by any other name...
(was The raw and the cooked) Message-ID: <002301be0715$a205f680$7008e391@bra01wmhkay.bra01.icl.co.uk> >I do *not* agree that XML won't come into its own until we bypass >all the syntax and think only in terms of abstract data structures. >Having watched this profession for 20 years, I have come to >believe that a truly interoperable API is very nearly an oxymoron; >but syntax is something we know how to interoperate with. Also I >just don't believe that there is One True data model for XML. I agree that defining what is and is not well-formed and valid XML ought to be a readily achievable goal, and it is a little surprising to find an area where the spec is ambiguous on the matter. Hence my suggestion for a formal analysis to discover whether there are other unsuspected problems. I also agree that defining what a conformant XML processor should do with that XML (not to mention what it should do with erroneous XML) is considerably harder, though I think the problem becomes tractable if the behaviour is defined in terms of a concrete API such as SAX or DOM. I agree with those who have pointed out that formalisms like Z are not a good vehicle for communicating a standard to a wide audience. In my own experience, however, the kind of thinking required to produce a formal specification in Z is invaluable when trying to produce an unambiguous one in clear English. I don't believe that precision and readability are incompatible goals. There is information about Z, by the way, on http://www.non.com/news.answers/z-faq.html Mike Kay xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Tue Nov 3 10:47:23 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:06:10 2004 Subject: CDATA by any other name... (was The raw and the cooked) Message-ID: <002901be0716$b5121a50$7008e391@bra01wmhkay.bra01.icl.co.uk> >> marked sections actually mark up >> notations: at ISO there has been discussion of whether to allow something >> like (for example) >> > >While I applaud the ongoing proliferation of real Java(tm), I admit I >don't like that either ... has >worked just as well, and does no damage to XML. (Not as pretty though!) Neither really works well, because "]]>" can legitimately occur in a Java program. For example, it is quite likely to occur in a Java program that generates XML. Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Nov 3 11:52:09 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:06:10 2004 Subject: CDATA by any other name... (was The raw and the cooked) In-Reply-To: <002901be0716$b5121a50$7008e391@bra01wmhkay.bra01.icl.co.uk> Message-ID: <000001be0720$918764f0$aee887cb@NT.JELLIFFE.COM.AU> A CDATA marked section is not only a way to prevent delimiter recognition. 
It is also a way to declare that the characters in that section are limited to ones available in the direct document encoding of the originating system. (SGML has a CDATA keyword you can use instead of content models: XML was felt not to need it because you could use From: Michael Kay > Sent: Tuesday, 3 November 1998 21:43 > To: xml-dev@ic.ac.uk > Subject: Re: CDATA by any other name... (was The raw and the cooked) > > > >> marked sections actually mark up > >> notations: at ISO there has been discussion of whether to > allow something > >> like (for example) > >> > > > >While I applaud the ongoing proliferation of real Java(tm), I admit I > >don't like that either ... has > >worked just as well, and does no damage to XML. (Not as pretty though!) > > > Neither really works well, because "]]>" can legitimately occur in a Java > program. For example, it is quite likely to occur in a Java program that > generates XML. The idea was not that JAVA would be a "CDATA marked section", but an "RCDATA marked section", which means that special character references and entity references would be allowed. XML does not have RCDATA marked sections, in the interests of simplicity. So "]]>" might have been a possibility for SGML, but it is not for XML. Why have anything like this? The primary reason (apart from orthogonality) to me is the contention that if you make element structure do too much, you make the structure difficult to model with simple schema notations. For example, think of a "wrapper" element type. (This is a pattern, by the way.) For example, the RDF elements. Using a foreign wrapper element in a document means that * you will have to rewrite the content models in order to validate the document. 
Or, * you have to create a more complicated schema convention (e.g., ** call the existing DTD an architecture and make it external, then use the RDF DTD as the DTD of the current document and make dummy declarations with ANY content models for all the old document or ** make up schema definition languages that rely on more than one level of context) But if, instead of a wrapper element, you used PIs for the wrappers, then the content model is undisturbed, and the element structure keeps its previous simplicity and the goals of its original authors. It would be nice if W3C allowed this, but the less that a PI can be treated (by XLL or DOM or SAX or whatever) as a kind of element, the less that this kind of simplicity is possible. I have little sympathy for some of the people who say content models are inexpressive, when they deliberately choose to ignore other the markup options available. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Nov 3 12:01:31 1998 From: david at megginson.com (david@megginson.com) Date: Mon Jun 7 17:06:10 2004 Subject: CDATA by any other name... (was The raw and the cooked) In-Reply-To: <000001be0720$918764f0$aee887cb@NT.JELLIFFE.COM.AU> References: <002901be0716$b5121a50$7008e391@bra01wmhkay.bra01.icl.co.uk> <000001be0720$918764f0$aee887cb@NT.JELLIFFE.COM.AU> Message-ID: <13886.61181.97569.898100@localhost.localdomain> Rick Jelliffe writes: > A CDATA marked section is not only a way to prevent delimiter > recognition. It is also a way to declare that the characters in > that section are limited to ones available in the direct document > encoding of the originating system. 
(SGML has a CDATA keyword you > can use instead of content models: XML was felt not to need it > because you could use mind of the XML WG at that time, in that they were down-playing the > need for schemas.) It declares "this section does not use character > references or entities or subelements". So, conceptually, it could > sometimes be markup, not merely delimiter recognition. While I agree that there are always interesting new uses for markup constructions, I think that we're straining here. My basic rule in system design is to keep things as simple and obvious as possible; if I wanted to signal to my application that an element contained only a certain type of information (such as a limited character repertoire), I would use an attribute that made that point clear, either a NOTATION attribute or a simple CDATA attribute named something like "character-encoding". That said, I don't see the usefulness of limiting content to a specific character repertoire arbitrarily; I *do* see the usefulness in combination with an "xml:lang" or "mime-type" attribute, though. An intelligent editor could already act on xml:lang to limit character selection, if such a thing were desirable. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From north at Synopsys.COM Tue Nov 3 13:00:16 1998 From: north at Synopsys.COM (Simon North) Date: Mon Jun 7 17:06:10 2004 Subject: XML in IE5 beta PR2 In-Reply-To: <000001be0720$918764f0$aee887cb@NT.JELLIFFE.COM.AU> References: <002901be0716$b5121a50$7008e391@bra01wmhkay.bra01.icl.co.uk> Message-ID: <199811031257.NAA09118@goofy.gr05.synopsys.com> Hi Gurus, I'm now experimenting with XML in IE5 beta preview release 2. It's nice to be able to parse the XML on load and actually to be able to navigate the structure tree. It does, however, complain about not being able to load the XSL code (though it does seem to do a good job of supporting CSS). Has anyone got any further than this? Thanks, Simon North From papresco at technologist.com Tue Nov 3 13:49:22 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:06:11 2004 Subject: CDATA by any other name... (was The raw and the cooked) References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com> <363DCFB4.36A59E21@locke.ccil.org> <363DE9F3.2EDF9979@eng.sun.com> Message-ID: <363F060D.47FB0E2C@technologist.com> David Brownell wrote: > > To put it differently: is there really room for another API > to represent XML structure?
> > I tend to think that DOM, warts and all, is "good enough" for > most purposes. And for those other purposes, I suspect that > no standard API could suit. I find it odd that we can have "standard APIs" for the full complexity of relational data, and probably eventually for object database data, but it is perceived to be impossible to do the same for the parse tree of XML data. I mean it is just annotated tree structures: it shouldn't be rocket science (but neither is it trivial). No, we don't have such a thing yet, because it is not easy to develop and nobody is willing to stop and think things through. Over time, organizations like TechnoTeacher and ISOGEN *are* thinking it through. I don't claim we've got the problem solved, but our direction is already much more scalable, generalized and rigorous than what we are seeing in the DOM realm. Our approach is, we think, the same as the one taken by the relational database people: first think of a model that supports the range of applications that we want to support (including editing applications, repositories, simple read-only processors) and data types that we want to support (documents, DTDs, schemas, "link maps", vector and bitmap graphics,... all media). Having defined the model, we need a way to customize it for a particular application: a schema, just as they have schemas in the relational and object database worlds. Our schemas are property sets (the schema language needs to be stronger, if it is to support read-write applications...we know that part needs work). Then we develop an API to encapsulate the model. We are working on that API right now. Anyone who wants to follow our thinking can start with the tutorial on groves at http://www.prescod.net/groves/shorttut As you can see from the tutorial, the model is simpler than the relational model and yet seems more or less complete (I know of one suggestion for enhancement). 
As I said before, the schema language and the APIs are the parts that must change now. If there is a reason that this generalized approach *must* fail and cannot be the basis of a variety of applications, then I would like to hear about it sooner than later, so I invite comments from skeptics. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Tue Nov 3 14:16:01 1998 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:06:11 2004 Subject: Retrieving attributes from an internal entity In-Reply-To: Don Kackman's message of Mon, 2 Nov 1998 15:37:17 -0600 Message-ID: <199811031414.OAA12936@cogsci.ed.ac.uk> > "> This is indeed being (correctly) interpreted as a malformed parameter entity. Use a character entity to refer to the percent character: "> -- Richard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Nov 3 14:31:00 1998 From: david at megginson.com (david@megginson.com) Date: Mon Jun 7 17:06:11 2004 Subject: Standard XML APIs (was Re: CDATA by any other name...) 
In-Reply-To: <363F060D.47FB0E2C@technologist.com> References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com> <363DCFB4.36A59E21@locke.ccil.org> <363DE9F3.2EDF9979@eng.sun.com> <363F060D.47FB0E2C@technologist.com> Message-ID: <13887.4266.208216.446955@localhost.localdomain> Paul Prescod writes: > I find it odd that we can have "standard APIs" for the full > complexity of relational data, and probably eventually for object > database data, but it is perceived to be impossible to do the same > for the parse tree of XML data. I mean it is just annotated tree > structures: it shouldn't be rocket science (but neither is it > trivial). Let's divide the use of XML into two fairly arbitrary groups (acknowledging that there's considerable overlap): 1. Documents 2. Data Group #1 (documents) is characterised by long sequences of mixed content inside block-level containers (often paragraphs, but possibly subtasks or steps in technical documentation); group #2 (data) is characterised by fairly rigid hierarchies with plain character data inside named fields (often, but not always, short) appearing in predictable orders. A standard XML-oriented API like the DOM is entirely suitable for group #1, but the DOM is probably overkill for group #2, which requires a domain-specific API (of course, for small or non-speed-critical applications, the domain-specific API could be implemented as an adapter on top of the DOM). That said, you still need some kind of API to get at the XML to populate the domain-specific model. Sometimes, the DOM will be appropriate, but given that many models in group #2 tend to be simple and need to be processed on busy servers, a light-weight, event-based API like XML::Parser or SAX usually makes the most sense for that group. Ideally, most programmers using XML for group #2 will never see an XML-specific API -- we should hide it. 
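To illustrate the group #2 case above: a minimal sketch, assuming Python's standard xml.sax module as a stand-in for SAX, of an event handler that populates a small domain-specific model so that application code never touches an XML API. The Order class, the element names (orders, order, customer, total) and load_orders are hypothetical, invented for this illustration only.

```python
import xml.sax
from dataclasses import dataclass


@dataclass
class Order:
    """Hypothetical domain object; not taken from the thread."""
    customer: str = ""
    total: float = 0.0


class OrderHandler(xml.sax.ContentHandler):
    """Turns SAX events into Order objects without building a tree."""

    def __init__(self):
        super().__init__()
        self.orders = []
        self._current = None   # the Order being filled in, if any
        self._text = []        # character data seen since the last tag

    def startElement(self, name, attrs):
        if name == "order":
            self._current = Order()
        self._text = []

    def characters(self, content):
        # The parser may deliver one text run in several pieces.
        self._text.append(content)

    def endElement(self, name):
        if self._current is None:
            return
        value = "".join(self._text).strip()
        if name == "customer":
            self._current.customer = value
        elif name == "total":
            self._current.total = float(value)
        elif name == "order":
            self.orders.append(self._current)
            self._current = None
        self._text = []


def load_orders(xml_text):
    """Parse the XML text and return a list of Order objects."""
    handler = OrderHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.orders
```

Callers work only with Order objects; replacing the event handler with an adapter on top of the DOM would not change them.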
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Nov 3 15:23:50 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:11 2004 Subject: Retrieving attributes from an internal entity Message-ID: <3.0.32.19981103071922.00ae7820@pop.intergate.bc.ca> At 02:14 PM 11/3/98 GMT, Richard Tobin wrote: >> "> > >This is indeed being (correctly) interpreted as a malformed parameter >entity. Use a character entity to refer to the percent character: Oops. Oh dear, Richard is right. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Nov 3 16:18:02 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:11 2004 Subject: CDATA by any other name... (was The raw and the cooked) References: <002901be0716$b5121a50$7008e391@bra01wmhkay.bra01.icl.co.uk> Message-ID: <363F2CDA.3BF7BB3D@locke.ccil.org> Michael Kay wrote: > Neither really works well, because "]]>" can legitimately occur in a Java > program. For example, it is quite likely to occur in a Java program that > generates XML. If "]]>" is needed as a string literal or part of one, it can easily be replaced by "]]\76" or "]]\u003E". If it appears in program text, then "]] >" will be a sufficient replacement. 
Similar workarounds are used to avoid ETAGOs (" Message-ID: <363F31B6.59F3FFC7@locke.ccil.org> Rick Jelliffe wrote: > A CDATA marked section is not only a way to prevent delimiter recognition. > It is also a way to declare that the characters in that section are limited > to ones available in the direct document encoding of the originating system. True. However, since the standard encodings of XML include all the characters there are (and if they don't include yours, just you wait, 'Enry 'Iggins), that isn't as much of an issue. > (SGML has a CDATA keyword you can use instead of content models: XML was > felt not to need it because you could use shows the mind of the XML WG at that time, in that they were down-playing > the need for schemas.) CDATA elements are eeeeeevil. They terminate at any ETAGO followed by a name-start character, and they make it impossible to change your mind later, if you decide you need an entity or two. See the excellent articles at. They were rightly discarded from XML. > For example, I cannot see why a smart editor could not use the CDATA section > to cofine editing to whatever the repertoire of the character set of the > encoding attribute of the XML header says. IMHO, a *smart* editor would realize that a CDATA section cannot cope, and would terminate it around the problem character. For example, an attempt to insert a dagger (U+2020) into a CDATA section within an 8859-1 document would produce this: ... ]]>† In the case of editing the XML > specification, for example, when there is a CDATA marked section being > edited, and the editor types "<", a smart section should know not to replace > it with "<" or expect it to be a STAGO. XED indeed has this property, although it just feeps if you attempt to type a character that would cause "]]>" to appear in a CDATA section, rather than splitting the section (which admittedly would be painful to undo after a Backspace character). 
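A minimal sketch of the splitting behaviour described above, not of how XED or any other editor actually works: the section is closed around any character the target encoding cannot represent, which is written as a numeric character reference, and it is also split wherever the text itself would otherwise form "]]>". The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class CdataSplitter {
    private static final String OPEN = "<![CDATA[";
    private static final String CLOSE = "]]" + ">";

    /** Wraps text in one or more CDATA sections that are safe for the given encoding. */
    public static String wrap(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        boolean open = false;
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp);
            String piece = text.substring(i, i + len);
            if (!encoder.canEncode(piece)) {
                // Not representable: leave the section and use a character reference.
                if (open) {
                    out.append(CLOSE);
                    open = false;
                }
                out.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
            } else {
                if (!open) {
                    out.append(OPEN);
                    open = true;
                }
                // Split the section rather than let the content spell the end delimiter.
                if (cp == '>' && endsWithTwoBrackets(out)) {
                    out.append(CLOSE).append(OPEN);
                }
                out.append(piece);
            }
            i += len;
        }
        if (open) {
            out.append(CLOSE);
        }
        return out.toString();
    }

    private static boolean endsWithTwoBrackets(StringBuilder sb) {
        int n = sb.length();
        return n >= 2 && sb.charAt(n - 1) == ']' && sb.charAt(n - 2) == ']';
    }
}
```

With ISO 8859-1 as the target, a dagger between two letters ends up outside the CDATA sections as &#x2020;, with one section before it and one after.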
> It would be nice > if W3C allowed this, but the less that a PI can be treated (by XLL or DOM or > SAX or whatever) as a kind of element, The current XPointer draft allows PIs to be referred to on equal terms with elements (except for not having a GI or attributes or sub-elements). The DOM has a ProcessingInstruction node, though pseudo-attribute parsing is not performed. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lauren at sqwest.bc.ca Tue Nov 3 16:50:11 1998 From: lauren at sqwest.bc.ca (Lauren Wood) Date: Mon Jun 7 17:06:11 2004 Subject: CDATA by any other name... (was The raw and the cooked) In-Reply-To: <363F31B6.59F3FFC7@locke.ccil.org> Message-ID: <199811031644.IAA14775@sqwest.bc.ca> On 3 Nov 98, at 11:39, John Cowan wrote: > The current XPointer draft allows PIs to be referred to on equal terms > with elements (except for not having a GI or attributes or sub-elements). > > The DOM has a ProcessingInstruction node, though pseudo-attribute parsing > is not performed. This would be a possibility, but this reading of the content of a PI isn't in the XML spec, so the DOM WG didn't want to add semantics that weren't in the spec. So we stuck to the simple target+data approach, at least for Level 1. Lauren xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Tue Nov 3 17:07:51 1998 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:06:11 2004 Subject: CDATA by any other name... (was The raw and the cooked) In-Reply-To: Rick Jelliffe's message of Tue, 3 Nov 1998 22:53:27 +1100 Message-ID: <199811031707.RAA19890@cogsci.ed.ac.uk> > (SGML has a CDATA keyword you can use instead of content models: XML was > felt not to need it because you could use shows the mind of the XML WG at that time, in that they were down-playing > the need for schemas.) Surely the unanswerable argument against CDATA elements in XML was they prevent you from parsing a document without the DTD. Just like optional start/end tags, and unmarked empty elements. -- Richard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Nov 3 17:10:47 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:11 2004 Subject: CDATA by any other name... (was The raw and the cooked) References: <000001be0720$918764f0$aee887cb@NT.JELLIFFE.COM.AU> <363F31B6.59F3FFC7@locke.ccil.org> Message-ID: <363F38FE.5893FDE9@locke.ccil.org> Blunderingly I wrote: > See the excellent > articles at. That should have been: "at http://www.oasis-open.org/cover/topics.html#CDATA ." 
-- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Nov 3 17:12:40 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:11 2004 Subject: CDATA by any other name... (was The raw and the cooked) References: <199811031644.IAA14775@sqwest.bc.ca> Message-ID: <363F3939.84DD643@locke.ccil.org> Lauren Wood replied to me: > > The DOM has a ProcessingInstruction node, though pseudo-attribute parsing > > is not performed. > > This would be a possibility, but this reading of the content of a PI > isn't in the XML spec, so the DOM WG didn't want to add > semantics that weren't in the spec. So we stuck to the simple > target+data approach, at least for Level 1. In the words of Hyman Kaplan: "I described. I did not condemn." -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Nov 3 17:24:03 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:06:11 2004 Subject: CDATA by any other name... (was The raw and the cooked) In-Reply-To: <199811031707.RAA19890@cogsci.ed.ac.uk> Message-ID: <000001be074e$e2ec19c0$dae887cb@NT.JELLIFFE.COM.AU> > From: Richard Tobin [mailto:richard@cogsci.ed.ac.uk] > Surely the unanswerable argument against CDATA elements in XML was > they prevent you from parsing a document without the DTD. Just like > optional start/end tags, and unmarked empty elements. A good reason, but you could always say that "every CDATA element must have an attribute xml:content-mode='CDATA'". So not unanswerable (though neither nor the attribute commend themselves). And not unthinkable, as xml:lang and xml:space prove. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From greynolds at datalogics.com Tue Nov 3 17:32:13 1998 From: greynolds at datalogics.com (Reynolds, Gregg) Date: Mon Jun 7 17:06:11 2004 Subject: Z references (was Re: DTD,s) Message-ID: <51ED3F5356D8D011A0B1006097C30734014E5B3D@martinique> Since it's kind of hard to find at the site mentioned below, here is a link to some drafts of the ISO Z standard: http://www.cs.york.ac.uk/~ian/zstan/ The Z Reference Manual, by J.M. 
Spivey, is frequently referred to as the de facto standard; lucky for us, it has gone out of print and Mr. (Ms?) Spivey has been kind enough to make it available on the net at: http://spivey.oriel.ox.ac.uk/~mike/zrm/ I've found "The Way of Z" very useful: http://www.radonc.washington.edu/prostaff/jon/z-book/ -----Original Message----- From: James Tauber [mailto:jtauber@jtauber.com] Sent: Monday, November 02, 1998 7:31 PM To: xml mailing list Subject: Z references (was Re: DTD,s) -----Original Message----- From: Eddie Sheffield >BTW, did anyone ever give any references for "Z" from the recent CDATA thread? My >curiosity is perked, but I have a feeling searching for "Z" on Hotbot or Yahoo or >wherever would prove frustrating! Yahoo was easy. Try: http://dir.yahoo.com/Computers_and_Internet/Programming_Languages/Z/ which leads to an excellent site: http://www.comlab.ox.ac.uk/archive/z.html James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Nov 3 17:40:38 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:06:11 2004 Subject: CDATA by any other name... (was The raw and the cooked) In-Reply-To: <363F31B6.59F3FFC7@locke.ccil.org> Message-ID: <000101be0751$1ee370c0$dae887cb@NT.JELLIFFE.COM.AU> > From: John Cowan > True. However, since the standard encodings of XML include all the > characters there are (and if they don't include yours, just you > wait, 'Enry 'Iggins), that isn't as much of an issue. (An optimistic view of ISO10646: there are dozens of new Han ideographs created every day, apart from other scripts.) The situation I am thinking of is, for example, where I am creating an XML document which will be used, after processing by a non-XML Macintosh application that only understands MacRoman. The CDATA marked section is the only constraining/signalling mechanism in XML which could be applied, and it goes without saying that it is a pretty poor one, but I don't want to say it is useless. If the consensus of developers is that they dont want to allow marked sections to be used in this way, I hope that the schema people will look at a solution for constraining strings to use certain repertoires of characters. I believe the Balise parser and SGML/XML processing system has a "sanity checking" option for names in markup for this kind of repertoire-limitation purpose. > The current XPointer draft allows PIs to be referred to on equal > terms with elements (except for not having a GI or attributes or > sub-elements). > The DOM has a ProcessingInstruction node, though pseudo-attribute > parsing is not performed. 
Which is my point. By not even providing some minimal kind of token-locating within a processing instruction for people to use if they need it, PIs are barely useful as far as I can see. People will always try to use what is provided, rather than extend an API, so it is almost the kiss of death to PIs except for the more sophisticated applications. When schemas come along with better lexing of attribute values and PCDATA, I wonder if they will also bother to allow scanning of the PI into tokens too. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Nov 3 17:47:46 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:11 2004 Subject: XML APIs References: <3.0.5.32.19981102121408.00c96c30@pophost.arbortext.com> <363E08B6.2EB84756@locke.ccil.org> <363E2F22.4EE275D9@locke.ccil.org> Message-ID: <363F41DE.A57015FE@locke.ccil.org> Stephen R. Savitzky wrote: > I do not understand your point. They certainly hold as far as I can tell; Provided the tree doesn't get modified while traversing it. > Piling additional complication into the specification in order to ensure > that every node in the tree will continue to be visited no matter what gets > done between calls to "toNext", which I believe is what the last spec that > included iterators attempted to do, is WRONG, because it makes the simple > implementation impossible and because it becomes too complicated for a > programmer looking at the spec to guess how it's going to behave. Well, I disagree with you. 
If iterators are to be useful, they must be robust against changes to the structure being iterated over, or at the very least they must warn that the iterator is no longer valid, like the new Java 1.2 enumerators. > The API is designed to have an obvious model that looks > like a parse tree. Any programmer, looking at that API, will ``see'' the > parse tree in her mind's eye and be able to make intuitive and accurate > predictions about how it will behave. Indeed. But soon after learning about live node lists, this model will have to be changed or her programs will be dreadfully erroneous. > They will then discover that, in > the details of the specification, the intuitive view of the DOM as the API > for tree-structured documents is WRONG, and that a great deal of non-obvious > machinery has to be added in order to make it work. You betcha. > I'm going to go a little further, and define ``natural model.'' The natural > model of an interface is a class in which all attributes are represented by > instance variables, and no other instance variables are present. The trouble with such a "natural model" is that it's dead. It works perfectly for values (which have no state, i.e. are immutable), and for "dart boards" that react to whatever's posted to (thrown at?) them, but not for anything with any liveness. A robot modeled by such a "natural model" would be more like a Barbie doll: poseable, but unable to move by itself. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cowan at locke.ccil.org Tue Nov 3 17:50:27 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:11 2004
Subject: Walking the DOM (was: XML APIs)
References: <3.0.5.32.19981102121408.00c96c30@pophost.arbortext.com> <363E08B6.2EB84756@locke.ccil.org> <363E2F22.4EE275D9@locke.ccil.org>
Message-ID: <363F4252.4F27F6EB@locke.ccil.org>

Stephen R. Savitzky wrote:

> [T]he classic algorithm for traversing a tree is:
>
> traverse(node) {
>     visit(node);
>     if (node.firstChild != null) traverse(node.firstChild);
>     if (node.nextSibling != null) traverse(node.nextSibling);
> }

The trouble with that algorithm is that it is recursive. It will blow up if the tree is sufficiently deep. Indeed, in languages that cannot be relied on to do tail recursion, like Java, it will blow up if the tree is merely sufficiently wide.

Furthermore, if there is any end-of-node processing to do, such as emitting an end tag indication, then the algorithm is no longer even partly tail recursive and will blow up on both depth and width even in safe-tail-recursion languages.

The algorithm I use in DOMParser, therefore, is non-recursive:

    void traverse(Node node) {
        Node currentNode = node;
        while (currentNode != null) {
            visit(currentNode);

            // Move down to first child
            Node nextNode = currentNode.getFirstChild();
            if (nextNode != null) {
                currentNode = nextNode;
                continue;
            }

            // No child nodes, so walk tree
            while (currentNode != null) {
                revisit(currentNode);  // do end-of-node processing, if any

                // Stop once the starting node has been revisited. This test
                // comes before the sibling move so that the walk never
                // leaves the subtree rooted at the starting node.
                if (currentNode == node) {
                    currentNode = null;
                    break;
                }

                // Move to sibling if possible.
                nextNode = currentNode.getNextSibling();
                if (nextNode != null) {
                    currentNode = nextNode;
                    break;
                }

                // Move up
                currentNode = currentNode.getParentNode();
            }
        }
    }

Because of the reliability of this algorithm vis-a-vis the recursive one, I believe it should be the standard way of walking DOM trees, and therefore it is essential that DOM implementations make the structural access methods fast.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From ricko at allette.com.au Tue Nov 3 17:54:58 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:06:11 2004
Subject: Specifying virtual fonts in XML for handling variant characters
In-Reply-To: <000001be074e$e2ec19c0$dae887cb@NT.JELLIFFE.COM.AU>
Message-ID: <000201be0752$f62b2d10$dae887cb@NT.JELLIFFE.COM.AU>

Has anyone come up with a solution for specifying virtual ("synthetic") fonts in XML?

I need more than just saying "Latin block uses font x, greek block uses font y", I need to be able to say "This character should use font x, that character should use font y".

Has anyone come up with a standard way to mark up which characters in the private-use block are being used? If the Maths people are using parts of the block, it probably would be a good idea to have some system whereby when our documents are merged your private-use area does not overlay my private-use area.
Does anyone know what the current status of webfonts is, and what the relation is to Netscape's "Dynamic Fonts"?

Rick Jelliffe

From cowan at locke.ccil.org Tue Nov 3 18:07:28 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:12 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <002901be0716$b5121a50$7008e391@bra01wmhkay.bra01.icl.co.uk> <000001be0720$918764f0$aee887cb@NT.JELLIFFE.COM.AU> <13886.61181.97569.898100@localhost.localdomain>
Message-ID: <363F467F.3D832CF0@locke.ccil.org>

David Megginson wrote:

> I *do* see the usefulness in
> combination with an "xml:lang" or "mime-type" attribute, though. An
> intelligent editor could already act on xml:lang to limit character
> selection, if such a thing were desirable.

Such an editor would have to be a durn sight more intelligent than anything now available, because the repertoire of a language is a sticky wicket. In the domain of "xml:lang='en-US'", am I to be forbidden to write "naïve" or "coöperate"? How about "résumé" or "Québec"?

Harald Alvestrand worked for some years trying to nail down the repertoires (répertoires?) of various European languages. His latest (1995) draft at http://www.alvestrand.no/ietf/lang-chars.txt warns how incomplete the results still are.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Tue Nov 3 18:10:31 1998 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:06:12 2004 Subject: Specifying virtual fonts in XML for handling variant characters Message-ID: <005601be0754$94abffe0$0300000a@othniel.cygnus.uwa.edu.au> -----Original Message----- From: Rick Jelliffe >Has anyone come up with a solution for specifying virtual ("synthetic") >fonts in XML? > >I need more than just saying "Latin block uses font x, greek block uses font >y", I need to be able to say "This character should use font x, that >character should use font y". I actually need this for FOP[1]. FOP is taking in Unicode but outputting PDF using Type 1 fonts with AdobeStandardEncoding. James [1] http://www.jtauber.com/fop/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Tue Nov 3 18:16:56 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:06:12 2004 Subject: CDATA by any other name... (was The raw and the cooked) References: <000001be0720$918764f0$aee887cb@NT.JELLIFFE.COM.AU> Message-ID: <363F0962.C0F19885@technologist.com> Rick Jelliffe wrote: > > For example, I cannot see why a smart editor could not use the CDATA section > to cofine editing to whatever the repertoire of the character set of the > encoding attribute of the XML header says. 
Because it would be redundant. If the XML header says what characters are available then the editor can directly enforce that constraint. Overloading markup in this way is, in my opinion, a bad idea. It can do nothing but bring harm in the long term because no two applications will agree on the overloaded semantics and thus no two applications will treat the data in the same way.

> The idea was not that JAVA would be a "CDATA marked section",
> but an "RCDATA marked section", which means that special character
> references and entity references would be allowed.

This would eliminate the most interesting thing about CDATA sections: character suppression. Making [...] direction. Rather it should be a CDATA-on-steroids that even ignores CDEnd ("]]>") and only looks for something like ]JAVA]> to end the section.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"I always wanted to be somebody, but I should have been more specific." --Lily Tomlin

From cowan at locke.ccil.org Tue Nov 3 18:20:44 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:12 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <000101be0751$1ee370c0$dae887cb@NT.JELLIFFE.COM.AU>
Message-ID: <363F4951.E05D1C5D@locke.ccil.org>

Rick Jelliffe wrote:

> (An optimistic view of ISO10646: there are dozens of new Han ideographs
> created every day, apart from other scripts.)

True but irrelevant, since no specifiable character set can hold these.

> I hope that the schema people will look at
> a solution for constraining strings to use certain repertoires of
> characters.
And I hope that they allow no such thing, except perhaps as a fall-out from some regex or other local syntax mechanism.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Tue Nov 3 18:24:39 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:12 2004
Subject: Apologies for misdirected mail
Message-ID: <363F49FD.9285BD6D@locke.ccil.org>

The two messages "Re: XML APIs" and "Walking the DOM" from me should have gone to the DOM mailing list (where they now have been sent) rather than to XML-DEV. My apologies to those who do not care about the DOM, and especially to those who will now see the messages twice.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tbray at textuality.com Tue Nov 3 18:29:15 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:12 2004
Subject: Walking the DOM (was: XML APIs)
Message-ID: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca>

At 12:50 PM 11/3/98 -0500, John Cowan wrote:
>Stephen R. Savitzky wrote:
>
>> [T]he classic algorithm for traversing a tree is:
>> traverse(node) { ...
>> }
>
>The trouble with that algorithm is that it is recursive. It will
>blow up if the tree is sufficiently deep. Indeed, in
>languages that cannot be relied on to do tail recursion, like
>Java, it will blow up if the tree is merely sufficiently wide.

Wouldn't the effects of recursion be lost in the static, compared to the effects of loading the doc into memory to facilitate tree processing? Even if you are doing some persistent-ancillary-info trick to do a virtual tree, in my experience for very large docs you really have to wrangle memory carefully. It seems really counter-intuitive that the stack & local variables overhead caused by recursion is going to get you before one of these other things. Unless of course you recurse in some huge sloppy badly-written routine with lots of local junk.

BTW, what languages can be relied on to do tail recursion?

Also, shorter algorithms are better. -Tim

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cowan at locke.ccil.org Tue Nov 3 18:32:43 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:12 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <000001be0720$918764f0$aee887cb@NT.JELLIFFE.COM.AU> <363F0962.C0F19885@technologist.com>
Message-ID: <363F4BD1.16B2DC78@locke.ccil.org>

Paul Prescod replied to Rick Jelliffe:

> > The idea was not that JAVA would be a "CDATA marked section",
> > but an "RCDATA marked section", which means that special character
> > references and entity references would be allowed.

I missed this before. Java doesn't need character references, for which it has its own syntax, and entity references would IMHO cause more confusion than they are worth, since & and < have well-known Java semantics utterly distinct from their SGML (reference) semantics.

> This would eliminate the most interesting thing about CDATA sections:
> character suppression. Making [...] direction. Rather it should be a
> CDATA-on-steroids that even ignores CDEnd ("]]>") and only looks for
> something like ]JAVA]> to end the section.

Doesn't help, because the recursive problem remains vivid when Java programs generate SGML. Keep CDEnd and use "]]\76" inside Java strings, "]] >" in ordinary Java source.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From drabin at Adobe.COM Tue Nov 3 19:07:48 1998 From: drabin at Adobe.COM (Dan Rabin) Date: Mon Jun 7 17:06:12 2004 Subject: Walking the DOM (was: XML APIs) In-Reply-To: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981103110605.00f35670@mail-345> At 10:27 AM 11/3/98 -0800, Tim Bray wrote: >BTW, what languages can be relied on to do tail recursion? Scheme can be so relied on (for sure), and Standard ML too (I think). -- Dan Rabin xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Nov 3 19:10:27 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:06:12 2004 Subject: CDATA by any other name... (was The raw and the cooked) In-Reply-To: <363F4951.E05D1C5D@locke.ccil.org> Message-ID: <000301be075d$a1376e30$dae887cb@NT.JELLIFFE.COM.AU> > From: John Cowan > Rick Jelliffe wrote: > > > (An optimistic view of ISO10646: there are dozens of new Han ideographs > > created every day, apart from other scripts.) > > True but irrelevant, since no specifiable character set can hold these. Not so. The additions are use composed of standard radicals and combinations. There are various projects around (such as C.C.Hsieh in Taiwan) to figure out encodings to "spell" Han ideographs by component radicals. 
This would allow any number of characters and even variant forms. But this is not in ISO 10646 yet.

I guess the point is that John thinks that if an XML system can produce characters which a recipient system cannot process, because it does not use ISO 10646, that is not something that CDATA sections should be used to address. I think his reasons are that he cannot see it in the spec. Dave M thinks that xml:lang is appropriate. My point about CDATA elements was that there is no standard mechanism to lock CDATA marked sections. I think a lot of people now think that any non-ISO10646 system is for losers anyway (except for whatever character set they use, probably).

> .. the repertoire of a language is
> a sticky wicket. In the domain of "xml:lang='en-US'", am I to be
> forbidden to write "naïve" or "coöperate"? How about "résumé" or
> "Québec"?

The primary purpose of xml:lang, as far as I am concerned, should be to convey the information lost by ISO 10646 unification: where the Japanese and Chinese glyphs (or Polish and Russian) for a unified character differ, then I think transcoding and unifying the characters into ISO 10646 can lose information unless the xml:lang attribute is set. After that, xml:lang can be used to label text for the purposes of variant character selection, and after that for marking up the natural language.

But I am not trying to fix the repertoire of a language (TEI WSD can declare it, though). I am just thinking about how to constrain XML documents so that they will not contain characters which will break non-ISO10646 target systems.

Rick Jelliffe

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Nov 3 19:13:37 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:12 2004 Subject: Walking the DOM (was: XML APIs) References: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca> Message-ID: <363F55AB.AEA4EB19@locke.ccil.org> Tim Bray wrote: > Wouldn't the effects of recursion will be lost in the static, > compared to the effects of loading the doc into memory to facilitate > tree processing? That produces slow processing, not a hard failure (unless indeed there is simply too much document for even virtual memory). Java, and all other HLLs I know of, provide no way to recover from stack overflow, short of starting the app all over again with a command-line switch for a bigger stack. A general-purpose routine ought not to generate a preventable hard failure no matter what the document looks like, IMHO. > BTW, what languages can be relied on to do tail recursion? Scheme and ML and their descendants. The Scheme version of Stephen's algorithm will detect the tail recursion, and will be recursive down the tree and iterative across it. Indeed, Scheme *has* no (primitive) way to do iteration except with tail recursion (there are macros that syntactically sugar this, if you want). As a result, Scheme compilers can concentrate on making the very few constructs they have to understand (function call, function closure, assignment, IF) very very efficient. > Also, shorter algorithms are better. -Tim But constant-space algorithms are better too. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. 
You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Nov 3 20:40:43 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:12 2004 Subject: Unicode, xml:lang, and variant glyphs References: <000301be075d$a1376e30$dae887cb@NT.JELLIFFE.COM.AU> Message-ID: <363F657F.5D2E1B43@locke.ccil.org> Rick Jelliffe wrote: > Not so. The additions are use composed of standard radicals and > combinations. There are various projects around (such as C.C.Hsieh in > Taiwan) to figure out encodings to "spell" Han ideographs by component > radicals. I'm glad to hear about this; I find the IRG archives utterly impenetrable. > I guess the point is that John thinks that if an XML system can produce > characters which a recipient system cannot process, because it does not use > ISO 10646, that is not something that CDATA sections should be used to > address. I think his reasons are that he cannot see it in the spec. [...] > I think a lot > of people now think that any non-ISO10646 system is for losers anyway > (except for whatever character set they use, probably). Well, actually I would say the latter rationale has more effect on me than the former, if I must choose either. It just seemed to me that using CDATA sections to constrain the behavior of editors was not particularly user-friendly; if the user wants a character, let her have it, using a character reference if possible. In general, transcoding XML documents involves inserting NCRs as needed, unless the target is UTF-8 or UTF-16. 
> The primary purpose of xml:lang, as far as I am concerned, should be to
> convey the information lost by ISO 10646 unification: where the Japanese and
> Chinese glyphs

Actually, the problem isn't that clearcut. As John Jenkins posted to the Unicode list last year:

# FACT. It is true that some Unihan characters are typically written
# differently within the Japanese, Taiwanese, Korean, and Mainland Chinese
# typographic traditions.
#
# FACT. These differences of writing style are within the general range of
# allowable differences within each typographic tradition.
#
# E.g., the official "Taiwanese" glyph for U+8349 ("grass") per ISO/IEC
# 10646 uses four strokes for the "grass" radical, whereas the PRC,
# Japanese, and Korean glyphs use three. As it happens, Apple's LiSung
# Light font for Big Five (which follows the "Taiwanese" typographic
# tradition) uses three strokes.
#
# (This is easily confirmed by accessing
# http://www.unicode.org/unihan/unihan.acgi$8349.)
#
# FACT. Japanese users prefer to see Japanese text written with "Japanese"
# glyphs.
#
# FACT. It is also acceptable to Japanese users to see Chinese text
# written with "Japanese" glyphs.
#
# E.g., I just borrowed from Lee Collins a standard Japanese dictionary
# which quotes Chinese authors (e.g., Mencius) to show how a character is
# used. When doing so, they use "Japanese" glyphs, not Chinese ones.
#
# In particular, it is acceptable within Japanese typography for a small
# stretch of Chinese quoted in a predominantly Japanese text to be written
# with "Japanese" glyphs.
#
# FACT. Han unification allows for the possibility that a Japanese user
# might be required to use a Chinese font to display some Japanese text
# (e.g., if it uses a rare kanji).
#
# FACT. Ditto for JIS or an ISO 2022-based solution.
#
# FACT. Unicode doesn't include all the characters in actual use in Japan
# today, particularly for personal names.
#
# FACT. Neither does JIS or an ISO 2022-based solution. There are vendor
# sets which include many of these characters, and Unicode is working with
# the IRG and East Asian national bodies to add them.

> (or Polish and Russian)

How's that again? Polish uses Latin, Russian uses Cyrillic! What could possibly count as a unification between these two?? *Nobody* thinks that LATIN LETTER A and CYRILLIC LETTER A should be unified....

> for a unified character differ, then
> I think transcoding and unifying the characters into ISO 10646 can lose
> information unless the xml:lang attribute is set.

It doesn't lose information about meaning. It may make characters harder to read, but the distinction is one of typographic tradition, not language, and can cross languages.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From jcw at equi4.com Tue Nov 3 21:40:30 1998
From: jcw at equi4.com (Jean-Claude Wippler)
Date: Mon Jun 7 17:06:12 2004
Subject: [Fwd: Walking the DOM (was: XML APIs)]
References: <363F54B8.6268F15@totten.com>
Message-ID: <363F7816.DF478026@equi4.com>

John Cowan wrote:

> Stephen R. Savitzky wrote:
>
> > [T]he classic algorithm for traversing a tree is:
> >
> > traverse(node) {
> >     visit(node);
> >     if (node.firstChild != null) traverse(node.firstChild);
> >     if (node.nextSibling != null) traverse(node.nextSibling);
> > }
>
> The trouble with that algorithm is that it is recursive. It will
> blow up if the tree is sufficiently deep.
Indeed, in > languages that cannot be relied on to do tail recursion, like > Java, it will blow up if the tree is merely sufficiently wide. > > Furthermore, if there is any end-of-node processing to do, such as > emitting an end tag indication, then the algorithm is no longer > even partly tail recursive and will blow up on both depth and > width even in safe-tail-recursion languages. > > The algorithm I use in DOMParser, therefore, is non-recursive: [...] The way I load an XML document into MetaKit, it uses an explicit stack with exactly one "int" per level. I think you'll agree that this amount of "stack" use makes the approach suitable for any document (once I add some tests - this is just an experiment for now). Source code is at: http://www.equi4.com/metakit/xml/mk4xml.cpp After that, you end up with an on-demand loaded document, which is indexable so there is no scanning at all when accessing this data. Every child node is in an indexable "subview". And when you *do* need traversal, you can again use the same one-int-per-level stack approach. This works equally well in the case of end-node processing, BTW. > Because of the reliability of this algorithm vis-a-vis the recursive > one, I believe it should be the standard way of walking DOM trees, > and therefore it is essential that DOM implementations make the > structural access methods fast. By reliability, do you mean "not blowing up its stack"? As you can see, there are more ways than one to skin this cat. It seems to me that standardizing in the way you propose will prevent the use of other techniques - such as storing XML as a MetaKit datafile and using explicit recursion. -- Jean-Claude ________________________________________________________________________ Jean-Claude Wippler MetaKit home page - http://www.equi4.com/metakit/ Equi4 Software "Portable database software for a changing world" xml-dev: A list for W3C XML Developers. 
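[Editorial sketch] Neither the non-recursive algorithm elided as "[...]" above nor the MetaKit loader is reproduced in this archive. As a rough illustration of the general idea only, the following walks a DOM tree with no call-stack growth and no explicit stack by following `firstChild`, `nextSibling` and `parentNode`. It assumes Python's standard `xml.dom.minidom`; the function name `walk` and the `enter`/`leave` callbacks are hypothetical, with `leave` standing in for the end-of-node processing (such as emitting an end tag) mentioned in the quoted text.

```python
from xml.dom import minidom


def walk(root, enter, leave):
    """Visit every node under root in document order, in constant extra space.

    enter(node) is called before a node's children, leave(node) after them.
    The loop relies only on firstChild, nextSibling and parentNode, so the
    depth and width of the tree do not affect stack usage.
    """
    node = root
    while node is not None:
        enter(node)
        if node.firstChild is not None:
            node = node.firstChild      # descend
            continue
        # No children: finish this node, then climb until a sibling exists.
        while node is not None:
            leave(node)
            if node is root:
                return                  # never wander outside the subtree
            if node.nextSibling is not None:
                node = node.nextSibling
                break
            node = node.parentNode


def event_names(xml_text):
    """Small usage helper: list enter/leave events for a document."""
    events = []
    doc = minidom.parseString(xml_text)
    walk(doc.documentElement,
         lambda n: events.append(("enter", n.nodeName)),
         lambda n: events.append(("leave", n.nodeName)))
    return events
```

The trade-off raised in the thread still applies: this style is only attractive when the implementation makes the structural accessors cheap, because every step goes through them.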
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Tue Nov 3 22:46:59 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:06:12 2004 Subject: DTD,s Message-ID: <199811032243.OAA23029@boethius.eng.sun.com> [Frank Boumphrey:] | Does any one know of a site where XML dtd's are available for general | use? OASIS (http://www.oasis-open.org) is slowly gearing up to provide a DTD registry. This will no doubt be discussed at the OASIS meeting in Chicago November 15. Jon xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Nov 3 23:34:00 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:12 2004 Subject: Last Call issued on initial stylesheet linking draft Message-ID: <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Posting this on behalf the Syntax WG; xml-dev is obviously highly qualified to provide feedback. Cover letter is from Syntax WG co-chair Joel Nava =================================================================== The XML Syntax Working Group of the W3C is issuing a "Last Call" for comments on the specification "Associating stylesheets with XML documents - version 1.0" http://www.w3.org/TR/WD-xml-stylesheet Please review the document and send any comments you have to jjc@jclark.com, tbray@textuality.com, jnava@adobe.com. Comments are due by Friday Nov. 20th. To save bandwidth, I am including some rationale for the specific syntax we are using in this specification. As you will notice, the Working Group has chosen to use a special processing instruction, or PI, to link an XML document to stylesheets. Some wonder whether an element or attribute based solution somewhere along the lines of XLink or the XML Namespace mechanism would be more appropriate. The reasons for our choice are 3-fold: First, for the most part the working group feels that this syntax is the best for the problem. Many argue that this is the proper use for a PI, because it keeps this information out of the document tree. Second, timing is an issue. The mechanism that we have produced is very similar to the HTML link element, and was agreed upon many months prior to the formation of this WG. 
The XML Style Sheet Linking Specification was a partially completed work item from the old XML WG. In the intervening time between the end of that group and the beginning of this group the time to make an impact on the next release from the browser vendors was slipping away. We have tried to move quickly to complete this specification in order to have an impact on what gets implemented. Both Microsoft and Netscape have agreed on this syntax, and as far as we can guess will be shipping products based on it in the near future. The third part of this is the fact that time and resources have already been put aside to produce a Version 2 of this specification. "Associating stylesheets with XML documents - version 2.0" will add other mechanisms for linking style to XML document; see http://www.w3.org/XML/Activity.html#future -- Joel A. Nava (408)536-6209 Adobe Systems, Inc. jnava@adobe.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Nov 3 23:42:49 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:12 2004 Subject: Last call - correction Message-ID: <3.0.32.19981103154228.009fc660@pop.intergate.bc.ca> [oops - date error in previous posting] >> Comments are due by Friday Nov. 20th. should read Comments are due by Tuesday Nov. 17th -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Nov 4 00:42:40 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:06:12 2004 Subject: Walking the DOM (was: XML APIs) References: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca> Message-ID: <363F9E39.D978D55F@technologist.com> Tim Bray wrote: > > >The trouble with that algorithm is that it is recursive. It will > >blow up if the tree is sufficiently deep. Indeed, in > >languages that cannot be relied on to do tail recursion, like > >Java, it will blow up if the tree is merely sufficiently wide. > > Wouldn't the effects of recursion will be lost in the static, > compared to the effects of loading the doc into memory to facilitate > tree processing? Even if you are doing some persistent-ancillary- > info trick to do a virtual tree, in my experience for very large > docs you really have to wrangle memory carefully. But the persistent ancillary-info trick (i.e. "object database") keeps only the data it needs to in memory. If it requires lots of swapping, that slows things down, but the algorithm works nevertheless. If you blow your stack, you blow your stack, and there is no database in the world that will help you. Depending on the algorithm, walking an object database tree for a really huge file may be faster than parsing it and event-processing it. It depends on how many nodes you are actually processing, and how much ancillary info you must keep around to solve the problem you need to solve. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "I always wanted to be somebody, but I should have been more specific." --Lily Tomlin xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From phani at www.hsc.wvu.edu Wed Nov 4 02:42:01 1998 From: phani at www.hsc.wvu.edu (Phani Adabala) Date: Mon Jun 7 17:06:12 2004 Subject: xml parser Message-ID: 1.To develop a search engine for xml documents, can we use the xml parser already developed by microsoft and others or do we need to build our own parser? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Wed Nov 4 09:18:38 1998 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:06:13 2004 Subject: CDATA by any other name... (was The raw and the cooked) Message-ID: <01BE07DB.D4272500@grappa.ito.tu-darmstadt.de> John Cowan wrote: > Rick Jelliffe wrote: > > > I hope that the schema people will look at > > a solution for constraining strings to use certain repertoires of > > characters. > > And I hope that they allow no such thing, except perhaps as a fall-out > from some regex or other local syntax mechanism. Why not? This would be very useful for constraining what can be put into a database, many (most?) of which do not support Unicode. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Wed Nov 4 09:43:40 1998 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:06:13 2004 Subject: Last Call issued on initial stylesheet linking draft Message-ID: <01BE07DF.52B752C0@grappa.ito.tu-darmstadt.de> The second S in StylesheetPI and both S's in PseudoAtt should be optional. Even the examples don't include them. -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Wed Nov 4 09:45:35 1998 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 17:06:13 2004 Subject: Walking the DOM (was: XML APIs) In-Reply-To: <363F9E39.D978D55F@technologist.com> References: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca> Message-ID: <3.0.6.32.19981104093649.0095e800@gpo.iol.ie> [Paul Prescod] >But the persistent ancillary-info trick (i.e. "object database") keeps >only the data it needs to in memory. If it requires lots of swapping, that >slows things down, but the algorithm works nevertheless. If you blow your >stack, you blow your stack, and there is no database in the world that >will help you. > >Depending on the algorithm, walking an object database tree for a really >huge file may be faster than parsing it and event-processing it. 
It >depends on how many nodes you are actually processing, and how much >ancillary info you must keep around to solve the problem you need to >solve. > In my experience, there is a strong "principle of locality" in XML/SGML processing. I find I can get by quite happily with mini-tree structures harvested at suitable points from a larger document processed event-style. In my Python toolkit for SGML/XML processing I added support for sparse tree building some time ago and I find myself using it more and more. This is certainly far easier to do that implement virtual tree access with swapping to disk etc. You don't need no object database either:-) http://www.python.org The "Swiss Army Laser Beam" of programming languages xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Wed Nov 4 09:58:02 1998 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 17:06:13 2004 Subject: Hybrid event/tree interfaces (was: Walking the DOM (was: XML APIs)) In-Reply-To: Sean Mc Grath's message of "Wed, 04 Nov 1998 09:36:49 +0000" References: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca> <3.0.6.32.19981104093649.0095e800@gpo.iol.ie> Message-ID: Sean Mc Grath writes: > In my experience, there is a strong "principle of locality" in > XML/SGML processing. I find I can get by quite happily with > mini-tree structures harvested at suitable points from > a larger document processed event-style. In my > Python toolkit for SGML/XML processing I added support > for sparse tree building some time ago and I find myself > using it more and more. 
> > This is certainly far easier to do that implement virtual > tree access with swapping to disk etc. You don't need > no object database either:-) Our experience is very much in agreement with this. We have been using an API [1] which allows you to switch from event to tree(-fragment) view for the last few years, and it is a very productive way to go. You can think of it as allowing you to loop over nodes in a document which match a query in a restricted query language, restricted in that you can only query properties (e.g. tag name, attribute values) of candidate nodes and their ancestors: no descendents or siblings. If you like what you see, THEN you can ask for the whole subtree rooted in that node, and do whatever you like with it. It should be clear that a simple implementation of this is possible, which only needs to keep a stack of current ancestor start-tags. The result is no upper bound on document size: we regularly push 2GB of XML through a chain of filters implemented in this way. ht [1] http://www.ltg.ed.ac.uk/software/xml/ -- Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk URL: http://www.ltg.ed.ac.uk/~ht/ xml-dev: A list for W3C XML Developers. 
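[Editorial sketch] The hybrid event/tree style described above can be approximated with Python's standard `xml.etree.ElementTree.iterparse`. This is not the API referenced as [1]; the function name `subtrees` and its parameters are hypothetical. Candidates are chosen at their start tag from their own tag name and the names of their open ancestors only, and the full subtree is handed over once its end tag arrives.

```python
import io
import xml.etree.ElementTree as ET


def subtrees(source, tag, ancestor=None):
    """Yield complete subtrees for elements named `tag`.

    If `ancestor` is given, a candidate must have an open ancestor with
    that name. Content outside matches is discarded as soon as its end
    tag is seen. Matches nested inside a match are not yielded separately.
    """
    open_tags = []       # names of currently open elements
    match_depth = None   # depth at which the current match started
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if (match_depth is None and elem.tag == tag
                    and (ancestor is None or ancestor in open_tags)):
                match_depth = len(open_tags)
            open_tags.append(elem.tag)
        else:
            open_tags.pop()
            if match_depth is None:
                elem.clear()             # not inside a match: drop content
            elif len(open_tags) == match_depth:
                yield elem               # the whole subtree is now built
                elem.clear()             # release it after the caller is done
                match_depth = None


def matched_text(xml_bytes, tag, ancestor=None):
    """Usage helper: concatenated text of each matching subtree."""
    stream = io.BytesIO(xml_bytes)
    return ["".join(e.itertext()) for e in subtrees(stream, tag, ancestor)]
```

One caveat of this particular sketch: cleared elements remain attached to their parents as empty shells, so memory still grows slowly with the number of elements. A production filter would also need to detach them.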
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Wed Nov 4 11:00:01 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:06:13 2004 Subject: xml parser Message-ID: <004701be07e1$9290ed00$7008e391@bra01wmhkay.bra01.icl.co.uk> > >1.To develop a search engine for xml documents, can we use the xml parser >already developed by microsoft and others or do we need to build our own >parser? My immediate answer to this is yes, all the information you need for a search engine is available via the SAX or DOM interface offered by many parsers. This is certainly true for the indexing phase; for displaying hit documents I can think of some requirements that a standard parser might not meet, such as displaying the text around a search term without parsing the whole document. So it depends on your detailed design. But in any case many XML parsers are available with source code so you shouldn't need to write a new one from scratch. Of course you don't need to build your own search engine either, all you need to do is write an XML filter for an existing search engine. I'm surprised no-one seems to have done this yet. Mike Kay xml-dev: A list for W3C XML Developers. 
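[Editorial sketch] To illustrate the point above that the indexing phase needs nothing beyond a standard event interface, here is a minimal SAX content handler that builds an inverted index from words to the element paths they occur in. It assumes Python's standard `xml.sax`; the class name `IndexingHandler`, the function `build_index` and the choice of "path of element names" as the hit location are all hypothetical.

```python
import re
import xml.sax
from collections import defaultdict


class IndexingHandler(xml.sax.ContentHandler):
    """Collect a word -> set of element paths index from SAX events."""

    def __init__(self):
        super().__init__()
        self.index = defaultdict(set)
        self._path = []      # names of currently open elements
        self._buffer = []    # text chunks seen since the last tag

    def _flush(self):
        # characters() may deliver one run of text in several calls,
        # so tokenize only at element boundaries.
        text = "".join(self._buffer)
        self._buffer.clear()
        for word in re.findall(r"\w+", text.lower()):
            self.index[word].add("/".join(self._path))

    def startElement(self, name, attrs):
        self._flush()
        self._path.append(name)

    def endElement(self, name):
        self._flush()
        self._path.pop()

    def characters(self, content):
        self._buffer.append(content)


def build_index(xml_bytes):
    """Parse a document and return its word index as a plain dict."""
    handler = IndexingHandler()
    xml.sax.parseString(xml_bytes, handler)
    return dict(handler.index)
```

Showing the text around a hit, the harder requirement mentioned above, is not addressed here; that would need byte offsets or stored snippets in addition to the path.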
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Wed Nov 4 11:14:03 1998 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:06:13 2004 Subject: xml parser Message-ID: <01BE07EB.ED7FB070@GRAPPA> Phani Adabala wrote: > 1.To develop a search engine for xml documents, can we use the xml parser > already developed by microsoft and others or do we need to build our own > parser? You definitely don't need to write your own parser -- there are plenty available, including Microsoft's. See, for example: http://www.xmlsoftware.com/parsers/ http://www.oasis-open.org/cover/xml.html#xmlSoftware It would also be a good idea to write your software in a parser-independent way, using SAX (http://www.megginson.com/SAX/) or DOM (http://www.w3.org/TR/REC-DOM-Level-1/). The former is an event-driven interface, the latter is a tree interface. -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rja at dip.co.uk Wed Nov 4 11:35:55 1998 From: rja at dip.co.uk (Richard James Anderson) Date: Mon Jun 7 17:06:13 2004 Subject: xml parser Message-ID: <000101be07e7$5c3e2910$c5010180@p197> Hi, For those who are interested, I've posted an early version of my ActiveX SAX control up on my website ( URL below ). 
The control still has a long way to go, but it can parse most files that do not contain references to external entities.

The download includes a sample VB6 app for reading and processing XML files. It just loads the XML file into a tree control, and shows the SAX events in a list control. Of course, the control can be used from anything that can act as a COM automation controller.

Enjoy,
Richard.
RJA@DIP.CO.UK
http://www.arpsolutions.demon.co.uk

*** The text contained within this message is of a personal nature that does not reflect the development or opinions of data interchange plc unless specifically stated ***

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Wed Nov 4 11:49:30 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:13 2004
Subject: CDATA by any other name... (was The raw and the cooked)
In-Reply-To: <01BE07DB.D4272500@grappa.ito.tu-darmstadt.de>
References: <01BE07DB.D4272500@grappa.ito.tu-darmstadt.de>
Message-ID: <13888.15941.518550.194555@localhost.localdomain>

Ronald Bourret writes:

> Why not? This would be very useful for constraining what can be
> put into a database, many (most?) of which do not support Unicode.

There are three much better choices for specific problems like this:

1. Have the application throw an error if an out-of-range character appears.
2. Convert the text to UTF-8 before storing it in the database (UTF-8 and ASCII are identical up to 0x7f).
3. Escape non-ASCII characters with character references before storing the text in the database.
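[Editorial sketch] The three choices listed above map onto a few lines of Python's built-in string encoding machinery. The function names are hypothetical, and ASCII is only an assumed stand-in for whatever repertoire the target database accepts.

```python
def require_repertoire(text, encoding="ascii"):
    """Choice 1: raise UnicodeEncodeError on any out-of-range character."""
    text.encode(encoding)
    return text


def to_utf8(text):
    """Choice 2: store UTF-8 bytes; pure ASCII text is unchanged byte for byte."""
    return text.encode("utf-8")


def escape_with_char_refs(text, encoding="ascii"):
    """Choice 3: replace unencodable characters with numeric character
    references such as &#233;, leaving everything else untouched."""
    return text.encode(encoding, "xmlcharrefreplace").decode(encoding)
```

Choice 3 is only reversible if the stored text is later read back as XML content, where `&#233;` is interpreted as a reference; applied to plain strings it is a lossy convention that the reading side has to know about.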
As I mentioned before, it's always better to be explicit about this kind of thing -- syntactic subtlety is a bad thing.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From papresco at technologist.com Wed Nov 4 14:03:02 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:06:13 2004
Subject: Walking the DOM (was: XML APIs)
References: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca> <3.0.6.32.19981104093649.0095e800@gpo.iol.ie>
Message-ID: <364059E7.4E2067AB@technologist.com>

Sean Mc Grath wrote:

>
> In my experience, there is a strong "principle of locality" in
> XML/SGML processing. I find I can get by quite happily with
> mini-tree structures harvested at suitable points from
> a larger document processed event-style. In my
> Python toolkit for SGML/XML processing I added support
> for sparse tree building some time ago and I find myself
> using it more and more.

As long as you "get by", more power to you. But what happens when you hit a document where the first paragraph makes a cross reference to the last paragraph and the last paragraph makes a reference to somewhere in the middle? You can hack around it (after all, some people get away with using Omnimark!), but you will be hacking. You could also hack around local tree access. Given the choice of hacking around one or the other, I would rather hack around local references, because those are more predictable.

> This is certainly far easier to do than implement virtual
> tree access with swapping to disk etc.
> You don't need
> no object database either:-)

That's true, but it doesn't scale to the full generality of problems. As long as you can get away with it, do so, but I know that I have problems that require the Full Monty.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"I always wanted to be somebody, but I should have been more specific." --Lily Tomlin

From cowan at locke.ccil.org Wed Nov 4 14:54:52 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:13 2004
Subject: Walking the DOM (was: XML APIs)
References: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca> <363F9E39.D978D55F@technologist.com>
Message-ID: <36406AF7.4C8A6CD6@locke.ccil.org>

Paul Prescod wrote:

> If it requires lots of swapping, that
> slows things down, but the algorithm works nevertheless. If you blow your
> stack, you blow your stack, and there is no database in the world that
> will help you.

What he said. Hence the desirability of a non-stack-based algorithm.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From msabin at cromwellmedia.co.uk Wed Nov 4 15:17:37 1998
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun 7 17:06:13 2004
Subject: Interface name quandry ...
Message-ID:

Apologies in advance if this is a bit off topic, and apologies to those who get multiple copies.

I'm working on a number of Java APIs which operate on documents and their DOM representations relying on only the intersection of the properties of XML and HTML, and I've been racking my brains for a good name that covers both HTML and XML, but isn't as general as SGML.

Has anybody got any suggestions?

Cheers,

Miles

--
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)181 410 2230               London, W6 0LJ
msabin@cromwellmedia.co.uk        England

From cowan at locke.ccil.org Wed Nov 4 15:26:38 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:13 2004
Subject: Last Call issued on initial stylesheet linking draft
References: <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID: <36407254.FF6816DD@locke.ccil.org>

Joel Nava wrote:

> To save bandwidth, I am including some rationale for the
> specific syntax we are using in this specification.
I believe that a lightly edited version of this rationale should be included as an (informative) appendix to the recommendation. Otherwise I believe people will see the rec as unmotivated and will ignore it.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From sroth at radsys.com Wed Nov 4 15:28:53 1998
From: sroth at radsys.com (Roth, Scott)
Date: Mon Jun 7 17:06:13 2004
Subject: Text file to XML??
Message-ID: <5FAFB2A5D7B2D111ACEA0060972027CE186366@RADSYS_EXCH>

Help....???

I am working on a way to take a delimited text file that has data in it and break that up so that I can make files that hold xml data within it. The text file holds metadata already that points to files and holds certain key information. What I want to do is take that data and have it put the proper xml tags in where the fields are and then take that data and put it also into the file. Then I want to add the proper HTML. Does this make sense???

Please help me out. I want to know if anybody out there has done this already so I don't have to reinvent the wheel. And if anyone has any helpful hints please let me know.

Thanks,

Scott Roth

From cowan at locke.ccil.org Wed Nov 4 15:49:19 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:13 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <01BE07DB.D4272500@grappa.ito.tu-darmstadt.de>
Message-ID: <364077B9.AB1AA3A6@locke.ccil.org>

Ronald Bourret wrote:

> Why not? This would be very useful for constraining what can be put
> into a database, many (most?) of which do not support Unicode.

Because I thought XML (and Unicode) were in the business of enabling, not constraining. Lack of support for anything but Western Europe is an unfortunate misfeature to be worked around. Perhaps the routines that will later read from the database (not in XML, I assume) can be taught to understand HCRs.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Wed Nov 4 16:00:55 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:13 2004
Subject: Interface name quandry ...
References:
Message-ID: <36407A63.DF462A3F@locke.ccil.org>

Miles Sabin wrote:

> I've been racking my brains for a
> good name that covers both HTML and XML, but isn't
> as general as SGML.
>
> Has anybody got any suggestions?

XHTML has been used on this list, and I think is well understood. Some people have also used the term "HTML 5.0" :-)

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From tug at wilson.co.uk Wed Nov 4 16:15:29 1998
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun 7 17:06:14 2004
Subject: Parsing XML for direct use by programs
Message-ID: <020d01be080e$3ebfe3c0$010a0a0a@bach.wilson.co.uk>

Programs that consume XML generally have to use a two stage process:

1/ Parse the XML
2/ Create new objects to represent the data in the document

(Here I'm thinking of things like EDI applications rather than of XML browsers. In these cases I have elements which represent dates, amounts of money, part number, etc. and I need to turn them into the appropriate internal data structures before my program can process them.)

I have lots of support for step 1 but little or no support for step 2 (I'll address Bill la Forge's Coins system later).

What I think I would find helpful is a system which would let me describe how an XML document which corresponds to a given DTD should be converted into instances of particular objects in my particular programming language. Of course, I'd like to describe this in XML!
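
A rough sketch of the kind of mapping layer being asked for here, in Python rather than Java and with the mapping held in a dictionary rather than in XML, both purely for brevity. Every name in it (`INVOICE_MAPPING`, `build_objects`, the `<invoice>` vocabulary) is hypothetical and is not taken from Coins or from any package mentioned in the thread.

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal


def parse_iso_date(elem):
    """Convert <issued>1998-11-04</issued> into a date object."""
    return date.fromisoformat(elem.text.strip())


def parse_dmy_date(elem):
    """Convert <due><day>4</day><month>12</month><year>1998</year></due> into a date."""
    return date(int(elem.findtext("year")),
                int(elem.findtext("month")),
                int(elem.findtext("day")))


def parse_amount(elem):
    """Convert the element's character data into a Decimal."""
    return Decimal(elem.text.strip())


# Hypothetical mapping description: element name -> converter.
# Swapping in a different mapping yields a different object tree for the same document.
INVOICE_MAPPING = {
    "issued": parse_iso_date,
    "due": parse_dmy_date,
    "amount": parse_amount,
}


def build_objects(xml_text, mapping):
    """Return {element name: converted object} for the mapped children of the root."""
    root = ET.fromstring(xml_text)
    return {child.tag: mapping[child.tag](child)
            for child in root if child.tag in mapping}


SAMPLE = """<invoice>
  <issued>1998-11-04</issued>
  <due><day>4</day><month>12</month><year>1998</year></due>
  <amount>125.50</amount>
</invoice>"""

if __name__ == "__main__":
    print(build_objects(SAMPLE, INVOICE_MAPPING))
```

The point of keeping the converters outside the document type is that two differently shaped date elements end up as the same kind of object, and the DTD author never needs to know about the mapping.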
To take a concrete example: There are several ways of expressing a date in various DTDs in use now. In my Java program I want to deal with instances of java.util.Date. I don't want to encumber my program with all the tedious detail of turning the XML element into an instance of java.util.Date by hand. I do want a standard package that reads a DTD, a DTD->Java object mapping description, and an XML document, and spits out the object tree that is understood by my program, not a DOM tree.

Now, as I understand it, Coins can sort of do this but the designer of the DTD has really to take Coins into account at the beginning. This isn't what I want to do at all. I want the same DTD to be combined with different mapping descriptions to produce different object trees and I want different DTDs to be combined with different mapping descriptions to provide the same object tree.

Is anybody working on this? Is it feasible? Is it useful?

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk

From rhanson at blast.net Wed Nov 4 16:20:05 1998
From: rhanson at blast.net (Robert Hanson)
Date: Mon Jun 7 17:06:14 2004
Subject: xml parser
Message-ID: <000e01be080e$d8728220$12b919ce@Bertha>

I downloaded your control, and have some questions...

1. Why can't I get it to work?

Ok, that was only one question, but a serious one. I am trying to add Perl to your list of "Tested with", but ran into some problems. Below is the code I used:

 1. use Win32::OLE;
 2.
 3. $parser = Win32::OLE->new('SAX.SAXParser') or die $!;
 4. $parser->parseFile('c:\winn95\desktop\test.xml') or die $!;
 5. undef $parser;
 6.
 7. sub characters
 8. {
 9.     my ($sCharacter, $iLength) = @_;
10.     print "$sCharacter\n\n";
11. }

It seems to be able to create the SAX object in line 3, but dies on line 4 with the parseFile method. Is there any way to get an error from the parser to see what the problem is... maybe a getLastError method? If I get a chance, I may also try it out with PerlScript (or VBScript) in ASP later this week.

...If you (or anyone else) have any other ideas on getting this to work, please let me know.

Many thanks,

Robert

-----Original Message-----
From: Richard James Anderson
To: XMLDEV
Date: Wednesday, November 04, 1998 6:36 AM
Subject: RE: xml parser

>Hi,
>
>For those who are interested, I've posted an early version of my ActiveX SAX
>control up on my website ( URL below ).
>
>The control still has a long way to go, but it can parse most files that do
>not contain references to external entities.
>
>The download includes a sample VB6 app for reading and processing XML files.
>It just loads the XML file into a tree control, and shows the SAX events in
>a list control. Of course, the control can be used from anything that
>supports COM automation controllers.
>
>Enjoy,
>
>Richard.
>
>RJA@DIP.CO.UK
>http://www.arpsolutions.demon.co.uk
>
>*** The text contained within this message is of a personal nature that does
>not reflect the development of opinions of data interchange plc unless
>specifically stated ***

From tbray at textuality.com Wed Nov 4 16:23:10 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:14 2004
Subject: xml parser
Message-ID: <3.0.32.19981104081951.00b61b10@pop.intergate.bc.ca>

At 10:55 AM 11/4/98 -0000, Michael Kay wrote:

>My immediate answer to this is yes, all the information you need for a
>search engine is available via the SAX or DOM interface offered by many
>parsers.

I disagree. Few parsers track byte offsets or other locational info in the file, and I think you need that to do basic things like proximity and phrase search.

>Of course you don't need to build your own search engine either, all you
>need to do is write an XML filter for an existing search engine. I'm
>surprised no-one seems to have done this yet.

I think you do need to build your own engine. The reason is that most existing search engines have an atomic-document view of the world, and break down completely when asked to model a general recursive hierarchical structure like XML.

-Tim

From rja at dip.co.uk Wed Nov 4 16:39:59 1998
From: rja at dip.co.uk (RJA)
Date: Mon Jun 7 17:06:14 2004
Subject: xml parser
Message-ID: <000101be0811$dbde7c90$c5010180@p197>

>1. Why can't I get it to work?

Let's try to find out.
>Is there any way to
>get an error from the parser to see what the problem is... maybe a
>getLastError method? If I get a chance, I may also try it out with
>PerlScript (or VBScript) in ASP later this week.

The error interface has not been exposed in the control yet. I'll be doing that as soon as I get some spare time ( asap ).

I'll download the python compiler and try your sample. The SAX events are currently being fired using standard COM connection points, but the event interface is IUnknown based, not IDispatch. Maybe that's a problem with Python ? I'll let you know.

Regards,

Richard.

mailto://RJA@DIP.CO.UK
http://www.arpsolutions.demon.co.uk

*** The text contained within this message is of a personal nature that does not reflect the development of opinions of data interchange plc unless specifically stated ***

From papresco at technologist.com Wed Nov 4 16:49:41 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:06:14 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <01BE07DB.D4272500@grappa.ito.tu-darmstadt.de> <364077B9.AB1AA3A6@locke.ccil.org>
Message-ID: <36407FC0.D8919AA7@technologist.com>

John Cowan wrote:

>
> Because I thought XML (and Unicode) were in the business of enabling,
> not constraining.

XML DTDs are in the business of constraining people to the data models and data that the software is expecting/can deal with. I don't see any big difference between saying: "This content must be restricted to this set of characters" and "this content must be a NMTOKEN or base-64 encoded."
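
To make the comparison concrete, here is a small sketch of both kinds of constraint expressed the same way, as a pattern that an element's character data must match in full. It is written in Python for illustration only; the `CONSTRAINTS` table, the element names and the `violations` helper are hypothetical, and the NMTOKEN pattern is an ASCII-only simplification of the real XML production.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical constraint table: element name -> pattern the text must fully match.
CONSTRAINTS = {
    # ASCII-only approximation of NMTOKEN (the XML production allows far more).
    "code": re.compile(r"[A-Za-z0-9._:-]+"),
    # "Restricted to this set of characters": tab, LF, CR and printable ASCII.
    "note": re.compile(r"[\x09\x0A\x0D\x20-\x7E]*"),
}


def violations(xml_text, constraints):
    """Return the names of elements whose character data breaks its constraint.

    Assumes constrained elements are leaves, so elem.text holds all their content.
    """
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter():
        pattern = constraints.get(elem.tag)
        if pattern is not None and not pattern.fullmatch(elem.text or ""):
            bad.append(elem.tag)
    return bad


if __name__ == "__main__":
    doc = "<r><code>abc-1</code><note>caf\u00e9</note></r>"
    print(violations(doc, CONSTRAINTS))
```

Seen this way, a repertoire restriction is just one more lexical constraint on content, checked after parsing rather than through the document's syntax.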
Nevertheless, this is clearly a schema problem and CDATA sections seem to me to be a really bad tool for enforcing this distinction. No editor vendor is going to support that use for them so it is a moot point.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

The United Nations Declaration of Human Rights will be 50 years old on December 10, 1998. These are your fundamental rights:
http://www.udhr.org/history/default.htm

From cowan at locke.ccil.org Wed Nov 4 17:00:00 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:14 2004
Subject: CDATA by any other name... (was The raw and the cooked)
References: <01BE07DB.D4272500@grappa.ito.tu-darmstadt.de> <364077B9.AB1AA3A6@locke.ccil.org> <36407FC0.D8919AA7@technologist.com>
Message-ID: <36408848.12498760@locke.ccil.org>

Paul Prescod wrote:

> XML DTDs are in the business of constraining people to the data models and
> data that the software is expecting/can deal with. I don't see any big
> difference between saying: "This content must be restricted to this set of
> characters" and "this content must be a NMTOKEN or base-64 encoded."

Put that way, I suppose you are right. As I said before, this could and should be handled as a special case of "The character data of this element must conform to the following regular expression."

> Nevertheless, this is clearly a schema problem and CDATA sections seem to
> me to be a really bad tool for enforcing this distinction.
Particularly because it would mean that the charset of an XML document would become part of its schema: a document in US-ASCII can have only ASCII in its CDATA sections, but if it were transcoded to Shift_JIS, then it could have any JIS X 0208 character in the CDATA section.

So this means that transcoding arbitrary XML documents *requires* parsing them, because if you are reducing the repertoire, you may need to break up CDATA sections, and you cannot (?) recognize a CDATA section reliably without parsing. (In particular, what looks like a CDATA section start/end could appear as an attribute value, PI data, or comment.)

An interesting side effect!

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From clovett at microsoft.com Wed Nov 4 17:40:52 1998
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 17:06:14 2004
Subject: CDATA by any other name... (was The raw and the cooked)
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C08743F15@RED-MSG-56>

I like Rick's idea of xml:content-mode="CDATA". This definitely disambiguates this whitespace case for the validating parser. So

]>

would become:

]>

It's true that a non-validating parser will have difficulty with the latter example, but that is solved by putting the xml:content-mode attribute on the instance.

From avirr at LanMinds.Com Wed Nov 4 18:30:48 1998
From: avirr at LanMinds.Com (Avi Rappoport)
Date: Mon Jun 7 17:06:14 2004
Subject: xml parser
In-Reply-To: <3.0.32.19981104081951.00b61b10@pop.intergate.bc.ca>
Message-ID:

At 8:22 AM -0800 11/4/98, Tim Bray wrote:

> At 10:55 AM 11/4/98 -0000, Michael Kay wrote:
> >My immediate answer to this is yes, all the information you need for a
> >search engine is available via the SAX or DOM interface offered by many
> >parsers.
>
> I disagree. Few parsers track byte offsets or other locational info in
> the file, and I think you need that to do basic things like proximity
> and phrase search.

What Tim said. Most search engines do not have database storage, they have a fairly simple inverted index. Trying to put all the XML info in there would overload them. The point of having an XML search is to have metadata and context, so you probably need to use some of the more sophisticated text retrieval and library systems.

BTW, I'm trying to collect information on XML and search, so please keep me posted if you are working on something. I post everything I hear about at

Avi

________________________________________________________________
Avi Rappoport, Web Site Search Tools Maven:
Guide to Site Indexing and Local Search Engines:

From bckman at ix.netcom.com Wed Nov 4 20:18:13 1998
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:06:14 2004
Subject: W3 DOM tutorial.
Message-ID: <000701be0830$133a68c0$16afdccf@ix.netcom.com>

(cross posted to xml-dev)

For those who may be interested, now that Microsoft have released the new version of their IE5 beta, I have posted a new tutorial on the DOM at www.hypermedic.com. Follow the DOM links.

Regards

Frank

Frank Boumphrey
XML and style sheet info at Http://www.hypermedic.com/style/index.htm
Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
CoAuthor: Professional XML applications from Wrox Press, www.wrox.com

From Suli.Ding at geis.ge.com Wed Nov 4 22:23:17 1998
From: Suli.Ding at geis.ge.com (Ding, Suli (GEIS))
Date: Mon Jun 7 17:06:14 2004
Subject: Text file to XML??
Message-ID:

Scott,

Have you checked out this URL?

http://www.geocities.com/SiliconValley/Platform/4871/

Regards,

Suli

> ----------
> From: Roth, Scott[SMTP:sroth@radsys.com]
> Reply To: Roth, Scott
> Sent: Wednesday, November 04, 1998 10:28 AM
> To: XML Dev Mailing (E-mail)
> Subject: Text file to XML??
>
> Help....???
>
> I am working on a way to take a delimited text file that has data in it and
> break that up so that I can make files that hold xml data within it. The
> text file holds metadata already that points to files and holds certain key
> information. What I want to do is take that data and have it put the proper
> xml tags in where the fields are and then take that data and put it also
> into the file. Then I want to add the proper HTML. Does this make sense???
>
> Please help me out. I want to know if anybody out there has done this already
> so I don't have to reinvent the wheel. And if anyone has any helpful hints
> please let me know.
>
> Thanks,
>
> Scott Roth

From david at megginson.com Wed Nov 4 22:27:45 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:14 2004
Subject: Interface name quandry ...
In-Reply-To:
References:
Message-ID: <13888.54350.64691.396015@localhost.localdomain>

[cross-postings removed]

Miles Sabin writes:

> I'm working on a number of Java APIs which operate
> on documents and their DOM representations relying
> on only the intersection of the properties of XML
> and HTML, and I've been racking my brains for a
> good name that covers both HTML and XML, but isn't
> as general as SGML.
>
> Has anybody got any suggestions?

I think that the DOM calls these the "fundamental" node types, but I don't have the REC in front of me right now to check.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Wed Nov 4 22:33:58 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:14 2004
Subject: SAX, DOM, and Search Engines (was Re: xml parser)
In-Reply-To: <3.0.32.19981104081951.00b61b10@pop.intergate.bc.ca>
References: <3.0.32.19981104081951.00b61b10@pop.intergate.bc.ca>
Message-ID: <13888.54487.434062.193573@localhost.localdomain>

Tim Bray writes:

> At 10:55 AM 11/4/98 -0000, Michael Kay wrote:
> >My immediate answer to this is yes, all the information you need for a
> >search engine is available via the SAX or DOM interface offered by many
> >parsers.
>
> I disagree. Few parsers track byte offsets or other locational info in
> the file, and I think you need that to do basic things like proximity
> and phrase search.

I disagree. While byte offsets might be useful for other purposes, they would be inappropriate for proximity and phrase searches -- for those, you need to track the relative positions of words, not their absolute positions. Consider the following example:

WORD1 &x; WORD2

Is WORD1 close to WORD2? It's only five bytes away (assuming an 8-bit encoding), but might be separated by 20,000 words, depending on what &x; expands to. SAX and the DOM do give you enough information to determine the relative positions of words.

Byte offsets would be helpful for displaying context around a match, but there would be no 100% reliable way to format that context without starting from the top of the document, in which case an XPOINTER (also derivable from SAX or DOM) might be more helpful unless you want the search engine to display raw XML markup for the context.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From tbray at textuality.com Wed Nov 4 23:08:59 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:15 2004
Subject: SAX, DOM, and Search Engines (was Re: xml parser)
Message-ID: <3.0.32.19981104150834.00b4f100@pop.intergate.bc.ca>

At 05:32 PM 11/4/98 -0500, david@megginson.com wrote:

>Tim Bray writes:
> > I disagree. Few parsers track byte offsets or other locational info in
> > the file, and I think you need that to do basic things like proximity
> > and phrase search.
>
>I disagree. While byte offsets might be useful for other purposes,
>they would be inappropriate for proximity and phrase searches -- for
>those, you need to track the relative positions of words, not their
>absolute positions. Consider the following example:
>

WORD1 &x; WORD2

>Is WORD1 close to WORD2?

Clearly, the proximity tests have to work in terms of proximity in the cooked, not raw, text. Lark carefully tracks offsets in terms of the entity stack so you can do this. But that's so obvious I don't think it's your point.

Secondly, for proximity, you're worried about counting characters, not bytes, but for addressing back into the entity, you're worried about byte, not character, offsets. So it's even harder than it looks. Unless of course you're using UTF16 and staying in the BMP - which might be a REAL good idea in an IR-oriented system anyhow.

> It's only five bytes away (assuming an 8-bit
>encoding), but might be separated by 20,000 words, depending on what
>&x; expands to. SAX and the DOM do give you enough information to
>determine the relative positions of words.

[warning: simple argument with long embedded digression]

I don't think so. How about languages, such as those spoken by the majority of the world's inhabitants, that do not separate words with spaces? (Identifying word breaks in running Japanese or Chinese text is essentially a strong-AI problem. You can get decent results by running a dictionary and searching at each character break for a match, with morphological heuristics, but it turns out that in those languages there is sufficient encoding redundancy that you get pretty good results (at a cost of some space wastage) just treating most characters as words - and lurking in that fact there's a PhD in linguistics for someone - but I digress, I spent a long time in those particular mines).

But spotting "words" may not matter. In fact, I am not aware of any research that shows word proximity to be a better information retrieval heuristic than character proximity. And it's much easier to nail down what you mean by "character" than "word", and thus get deterministic cross-language behavior.
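
A small sketch of character proximity measured over the cooked text, using the WORD1 &x; WORD2 example from earlier in the thread. It assumes Python's standard xml.sax module rather than Lark or any parser named above, and the `<doc>` wrapper, the entity value and the helper names `cooked_text` and `char_distance` are invented for illustration.

```python
import xml.sax


class CookedText(xml.sax.ContentHandler):
    """Collects character data as delivered after entity expansion."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def cooked_text(xml_bytes):
    """Return the document's character data with internal entities expanded."""
    handler = CookedText()
    xml.sax.parseString(xml_bytes, handler)
    return "".join(handler.chunks)


def char_distance(text, first, second):
    """Characters between the end of `first` and the next start of `second`, or None."""
    start = text.find(first)
    if start < 0:
        return None
    end = start + len(first)
    nxt = text.find(second, end)
    return None if nxt < 0 else nxt - end


# Hypothetical document: the entity x stands in for an arbitrarily long expansion.
DOC = b'<!DOCTYPE doc [<!ENTITY x "one two three">]><doc>WORD1 &x; WORD2</doc>'

if __name__ == "__main__":
    text = cooked_text(DOC)
    print(repr(text))
    print(char_distance(text, "WORD1", "WORD2"))
```

In the raw entity the two words are five characters apart; in the cooked text the gap grows with whatever &x; expands to. What this sketch cannot do is map a match back to a byte offset in the source entity, which is the harder half of the problem described above.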
>Byte offsets would be helpful for displaying context around a match,
>but there would be no 100% reliable way to format that context without
>starting from the top of the document

unless you used the whizzy new soon-to-arrive W3C fragment packager, right? Actually, if you have an index that can understand the structure well enough to support xpointer-flavor querying, the engine is going to know all the context info, so this should actually work pretty well (but only if you know the byte/character offsets). And the right way to display results in context depends on whether you're sampling, or visiting each match.

OK, you've been warned... if you get me going on the problems of searching in tagged internationalized text, bring a windbreaker - you'll need it.

-Tim

From salur at csee.wvu.edu Wed Nov 4 23:23:38 1998
From: salur at csee.wvu.edu (Salur Prashanth)
Date: Mon Jun 7 17:06:15 2004
Subject: XML Search Engine
Message-ID:

Hi all,

Can anyone tell me where the difference lies in implementing a search engine for HTML and a search engine for XML?

Thanks
Salur.

Address: -------- Prashanth Kumar Salur, Apt# 910-1, Graduate Student, CSEE, 445 Oakland Street, West Virginia University Morgantown,WV-26505 Off. Ph: 304 293 6371 Ext 577 Res Ph: 304 598 8025

From ralph at fsc.fujitsu.com Thu Nov 5 00:02:32 1998
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:06:15 2004
Subject: HyBrick V0.8 with XLink/XPointer is now Available
Message-ID: <3.0.5.32.19981105065829.00956a80@pophost.fsc.fujitsu.com>

All,

The latest version of Fujitsu's "HyBrick" browser, V0.8, with support for XLink/XPointer, is now available from Fujitsu's Web site:

http://www.fujitsu.co.jp/hypertext/free/HyBrick/download2.html

The browser and supporting documentation can be downloaded by clicking on hb.08.exe. This is a Japanese-language site, so much of the supporting documentation won't be accessible to non-Japanese readers. A brief summary:

Features:
- HyBrick includes a DSSSL renderer and XLink/XPointer engine running on top of SP and Jade
- XLink/XPointer are supported on the local file system
- XPointer is implemented as a subset of the HyTime property set
- Link traversal can use either "New" or "Replace" to display a new page

Using HyBrick:
- HyBrick is supplied as a self-extracting file.
- Once the files are installed, start HyBrick from the bin directory.
- Use the "Browse" button to open the file sample\docs\readme.xml.
- Click on blue-highlighted areas with the left mouse button to see a list of locations linked to the highlighted location. If only one location is available, traversal to that location is immediate.
- Click on blue-highlighted areas with the right mouse button to see the location of that area expressed as an XPointer.

Contact info:
Please address questions and comments to: hb-staff@ml.flab.fujitsu.co.jp

Best regards,
Ralph E.
Ferris
Fujitsu Software Corporation
ralph@fsc.fujitsu.com

From Sung_Nguyen at datacard.com Thu Nov 5 00:16:56 1998
From: Sung_Nguyen at datacard.com (Sung Nguyen)
Date: Mon Jun 7 17:06:15 2004
Subject: C++ XML Parser
Message-ID: <00153772.3096@datacard.com>

Hi:

Please point me to the API of any C++ XML Parser - I am using IE5.0 - I cannot find any example to follow - Someone please help.

Thanks,
SeanN

From Sung_Nguyen at datacard.com Thu Nov 5 00:24:52 1998
From: Sung_Nguyen at datacard.com (Sung Nguyen)
Date: Mon Jun 7 17:06:15 2004
Subject: WHERE: mshtml.h msxml.h
Message-ID: <0015378C.3096@datacard.com>

Hi:

I installed IE5.0 and I looked for the two header files mshtml.h and msxml.h - I couldn't find them. Do I need anything else to use the C++ XML Parser in IE5.0?

Please enlighten me,
SeanN

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From v-jmurr at microsoft.com Thu Nov 5 04:20:14 1998 From: v-jmurr at microsoft.com (John Murray (Murray Info Serv. inc.)) Date: Mon Jun 7 17:06:15 2004 Subject: C++ samples, XML DOM doc Message-ID: C++ examples: http://www.microsoft.com/gallery/samples/xml/c++_samples/default.asp XML DOM reference: http://www.microsoft.com/workshop/xml/xmldom/reference/start.asp Thanks John From: Sung_Nguyen@datacard.com (Sung Nguyen) Date: Wed, 4 Nov 1998 18:21:37 -0600 Subject: WHERE: mshtml.h msxml.h Hi: Please point me to the API of any C++ XML Parser - I am using IE5.0 - I cannot find any example to follow - Someone please help. Thanks, SeanN xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Thu Nov 5 05:45:17 1998 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:06:15 2004 Subject: Unicode, xml:lang, and variant glyphs In-Reply-To: <363F657F.5D2E1B43@locke.ccil.org> Message-ID: <002101be087f$a6d02440$d9e887cb@NT.JELLIFFE.COM.AU> > From: John Cowan > Rick Jelliffe wrote: > > The primary purpose of xml:lang, as far as I am concerned, should be to > > convey the information lost by ISO 10646 unification: where the > > Japanese and Chinese glyphs > > Actually, the problem isn't that clearcut. As John Jenkins posted > to the Unicode list last year: > (..Lots of facts..) 
FACT: Many times when someone says two characters are variants and should be unified, someone else has used them not as variants. Hence the Unicode compatibility area.

> > (or Polish and Russian)
>
> How's that again?

Oops, I meant Russian and Byelorussian (or Kazakh or Ukrainian), where some of the national characters have a different form.

> It doesn't lose information about meaning. It may make characters
> harder to read, but the distinction is one of typographic tradition,
> not language, and can cross languages.

Are you saying that characters carry information, and never glyphs (or character + locale + markup)? You cannot say this without knowing the domain and purpose of the text: if it is mathematics, then the font definitely carries information that the unified character does not. If you have a multi-language dictionary or a list of names which requires exactness, the font (or markup which selects the font) again is important.

"Harder to read" is no criterion at all. If it is harder to read, it is because it has lost information.

Rick Jelliffe
Independent XML/SGML Consultant: FM+SGML a speciality
Research Assistant: Computing Center, Academia Sinica, Taipei
Author: The XML & SGML Cookbook, Recipes for Structured Information, 1998

From M.H.Kay at eng.icl.co.uk Thu Nov 5 12:31:28 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:06:15 2004
Subject: XML Search Engine
Message-ID: <002001be08b7$9dbdeda0$7008e391@bra01wmhkay.bra01.icl.co.uk>

>Hi all,
>Can anyone tell me where the difference lies in implementing a search
>engine for HTML and a search engine for XML.
The main difference is that in HTML the tagging is almost useless in localising the query, whereas in XML it is potentially very valuable. Many search engines support field-oriented query, e.g. find "Ireland" as a surname; with the right input filter for XML it becomes possible to map XML elements to the fields understood by the search engine, making such queries a feasible proposition, which is not the case for HTML.

Switching threads, I am a little surprised by Tim's remarks on word proximity versus character proximity. Confining our attention to European languages (as most search engines do), word proximity searching is a common feature of the high-end search engines, whereas character proximity is hardly found outside basic desktop tools like grep. Apart from anything else, once you've done the word normalisation (normalising different linguistic forms or spellings of the same word), character proximity is meaningless. In the older boolean engines word proximity is used rather mechanistically; in the newer engines it is used more subtly as part of a statistical or linguistic approach to relevance ranking. Either way it is an established feature of the scene, and it is not there on a whim: the search algorithms used are based on extensive research and benchmarking of relevance and recall scores.

An interesting comparison of web search engines is at http://www.netstrider.com/search/features.html ; this asserts that all the well-known web search engines other than Lycos use word proximity matching. (A good survey, in spite of the fact that it fails to distinguish the effectiveness of the query matcher from the effectiveness of the web crawler.)

Mike Kay

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From kurt at simberg.com Thu Nov 5 13:03:18 1998
From: kurt at simberg.com (Kurt Helenelund)
Date: Mon Jun 7 17:06:15 2004
Subject: Creation of XML documents
Message-ID: <3641A21B.E5E4513D@simberg.com>

I am working on a project where we will use XML to exchange information between applications in different government agencies. We want to implement both on-line access between applications and asynchronous store & forward mechanisms.

I understand that there are 'lots' of good XML parsers (we have tried some) out there and that SAX and DOM are the preferred ways for applications to 'read' XML structures. I would like to ask if there's anyone who has the opposite problem, i.e. for applications to create XML documents on-the-fly. Of course the developer could 'hand code' the XML structures, which is error prone and boring. I am looking for something (API, lib) so that we could avoid this.

I would like to have a 'library' to which the application developer could say 'using this DTD please instantiate an XML document and help me to fill it in'.

Any solutions?

--
_______________________________________________________________________
Kurt Helenelund              Mobile: +358 50 555 0192
Simberg & Partners           Home:   +358 9 294 0313
Mielikintie 7B               Fax:    +358 9 294 0314
FIN-04230 KERAVA, Finland    Email:  kurt@simberg.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Thu Nov 5 13:40:02 1998 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:06:15 2004 Subject: Character and byte offsets In-Reply-To: Tim Bray's message of Wed, 04 Nov 1998 15:08:39 -0800 Message-ID: <199811051339.NAA00077@cogsci.ed.ac.uk> > Secondly, for proximity, you're worried about counting characters, not > bytes, but for addressing back into the entity, you're worried about byte, > not character, offsets. So it's even harder than it looks. This reminds me - are there good techniques for maintaining a byte offset in conjunction with character-set translations? Ideally you want the translation done in big blocks at a low level, but then how do you access the byte offsets? In RXP/LTXML I keep the offset of the start of the block (which is actually a line), and then (in the case of UTF-8) effectively reverse-translate to calculate how much to add (this relies on UTF-8 being invertible). Surely there must be a better way... -- Richard xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From north at Synopsys.COM Thu Nov 5 13:52:50 1998 From: north at Synopsys.COM (Simon North) Date: Mon Jun 7 17:06:15 2004 Subject: XML and IE5 beta PR2 Message-ID: <199811051350.OAA03527@goofy.gr05.synopsys.com> For anyone who hasn't noticed, the preview 2 release of IE5 was put on the public servers yesterday (it had been placed there last Friday but was pulled shortly afterwards). Today, Microsoft appear to have put updated documentation on the SBN web pages. The new release doesn't seem to support the style part of XSL, only the transformation part (but it does seem to be nearly 100% compliant, or at least as far as I've had time to check). It will, for example, choke on process-children (unless someone else has got it to work). I have managed to get it to work with simple files such as these: The XML file: Pierre: The Ambiguities Herman Melville 9.99 Heart of Darkness Joseph Conrad 12.99 Arrowsmith Sinclair Lewis 8.99 Oedipus Rex Sophocles 8.99 The Secret Sharer and Other Stories Joseph Conrad 13.99 The Republic Plato 12.99 The Republic Plato 15.99 Pragmatism William James 15.99 and the XSL file:
TITLEAUTHORPRICE
(I quickly hacked these files from the sources available on the SBN site). Without a style sheet, it shows a 'raw' XML tree that you can expand and contract. The DSO and data island mechanisms appear to be intact (bar a few minor changes). It also appears to correctly parse and validate XML code against a DTD; the DTD display is suppressed. I'm now going to tackle XLink and XPointer, although I suspect I know what the results will be. Simon. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ramesh.kasetty at trane.com Thu Nov 5 14:30:31 1998 From: ramesh.kasetty at trane.com (Kasetty, Ramesh) Date: Mon Jun 7 17:06:15 2004 Subject: html, xml Message-ID: <199811051429.IAA05529@nacg.trane.com> Hi, I have knowledge of HTML and trying to learn XML. Can anyone tell me the difference between HTML and XML and where XML can used. Thanks in advance, Ramesh ramesh.kasetty@trane.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Thu Nov 5 14:58:46 1998 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:06:15 2004 Subject: FW: Creation of XML documents Message-ID: <003d01be08cc$9e9e07e0$d3228018@jabr.ne.mediaone.net> There are several answers to this problem, including XML generation classes. 
A standard way to do this is to persist your data into a DOM object and then ask it to save itself. You might look at Jade (http://www.jclark.com) or IBM's xml4j as a start. Jonathan Borden JABR Technolgy > > > I am working on a project where we will use XML to exchange information > between > applications in different government agenices. We want to implement both > on-line access > between applications and asynchronous store & forward type of > mechanisms. > > I understand that there are 'lots' of good XML parsers (we have tried > some) out there and that SAX and DOM are > the prefered ways for applications to 'read' XML structures. I would > like to ask if there's anyone > that have the opposite problem i.e. for applications to create XML > documents on-the-fly. Of course > the developer could 'hand code' the XML structures which is error prone > and booring . I am looking > for something (API, lib) so that we could avoid this. > > I would like to have a 'library' to which the application developer > could say 'using this DTD please > instantiate a XML document and help me to fill it in'. > > Any solutions? > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jhb at software-ag.de Thu Nov 5 15:10:32 1998 From: jhb at software-ag.de (Juliane Harbarth) Date: Mon Jun 7 17:06:15 2004 Subject: html, xml Message-ID: <008201be08d6$b368dc90$4ba2bd9d@pcjhb.software-ag.de> -----Original Message----- From: Kasetty, Ramesh To: 'xml-dev@ic.ac.uk' Date: Thursday, November 05, 1998 2:43 PM Subject: html, xml Kasetty, Ramesh >I have knowledge of HTML and trying to learn XML. 
Can anyone tell me the Kasetty, Ramesh >difference between HTML and XML and where XML can used. HTML uses a fixed set of tags, to specify display properties for those things enclosed in the tags. XML allows the definition of tags that enable the specification of semantic properties. In my opinion XML offers the great benefit of being more processable by machines than HTML. That especially holds for retrieval. Who wants to know whether a certain document contains '1234' within

-Tags ? But a question like 'which document contains 1234 as an Employee-Number' makes sense. Juliane Harbarth Technical Consultant Software AG Germany mailto:jhb@software-ag.de Tel +49 (0)6151 92 1147 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Nov 5 15:55:28 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:15 2004 Subject: Unicode, xml:lang, and variant glyphs Message-ID: <3.0.32.19981105075248.00b18610@pop.intergate.bc.ca> At 04:46 PM 11/5/98 +1100, Rick Jelliffe wrote: >> > The primary purpose of xml:lang, as far as I am concerned, should be to >> > convey the information lost by ISO 10646 unification: where the >> > Japanese and Chinese glyphs The *only* purpose of xml:lang is to say what language it's in. You need to know this for a *lot* more than picking glyphs. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Thu Nov 5 17:16:29 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:15 2004 Subject: Unicode, xml:lang, and variant glyphs References: <002101be087f$a6d02440$d9e887cb@NT.JELLIFFE.COM.AU> Message-ID: <3641DD79.2DD20660@locke.ccil.org> Rick Jelliffe wrote: > FACT: Many times that someone says two characters are variants and should be > unified, someone else has used them not as variants. 
Hence the Unicode > compatability area. Unicode had to be round-trip compatible with many character sets formed on different principles. The KSC character sets, e.g. encode some hanja (Chinese character) more than once if they have more than one meaning, for the sake of making hanja-hangeul conversions easy. Nobody denies that these are the same *characters*; even their glyphs are bit for bit the same. > Oops I meant Russian and Bylorussian (or Khazak or Ukrainian) where some of > the national characters have a different form. I don't know about this. Are there really glyphic differences? I know about the character-level differences, like Ukrainian using GHE WITH STROKE except for a period from Stalin till a few years ago, when they were forced to use GHE indiscriminately for GHE and GHE WITH STROKE. I also know about Polish accents, which are properly placed lower over the character than similar-looking Western accents. That certainly is a glyph difference that fine Polish typography should take into account, but getting it wrong does not interfere with *meaning*: it is not a plaintext distinction. (See below.) A borderline case is 8859-2's use of S WITH CEDILLA and T WITH CEDILLA to represent Romanian's S and T WITH COMMA BELOW. This is finally being undone, so that Turkish can keep S WITH CEDILLA and Romanian will get a proper S WITH COMMA BELOW. (Nobody actually needs T WITH CEDILLA.) My *National Geographic* world map uses S WITH CEDILLA in Romanian place names, but you have to look closely and compare with Turkish place names to be sure. > Are you are saying that characters carry information, and never glyphs (or > character + locale + markup)? No, I am talking about the CJK case specifically. A unified font may look ugly, and certainly shouldn't be used for fine typography, but a language indicator is neither necessary nor sufficient to solve this problem. 
This is not to say that in documents to be finely rendered, an attribute called "cjkv-typographic-tradition" might not be useful. > if it is mathematics, then the font definitely > carries information that the unified character does not. Which is why there are a whole bunch of "letterlike symbols" for math purposes. > If you have a > multi-language dictionary or a list of names which requires exactness, the > font (or markup which selects the font) again is important. Sure, font is important when it's important. My claim is confined to this: that for plain-text purposes, Han unification does not obscure anything essential. > "Harder to read" is no criterion at all. If it is harder to read, it is > because it has lost information. Au contraire. The Unicode definition of a "plain text distinction" is one which is necessary for mere legibility. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Thu Nov 5 17:40:39 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:16 2004 Subject: FW: Creation of XML documents References: <003d01be08cc$9e9e07e0$d3228018@jabr.ne.mediaone.net> Message-ID: <3641E353.4256F5FF@locke.ccil.org> Borden, Jonathan wrote: > A standard way to do this is to persist your data into a DOM object and then > ask it to save itself. You might look at Jade (http://www.jclark.com) or > IBM's xml4j as a start. 
Alas, the DOM does not provide a standardized way for objects to "save themselves". -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Nov 5 17:52:41 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:16 2004 Subject: XML Search Engine Message-ID: <3.0.32.19981105094725.009209b0@pop.intergate.bc.ca> At 12:27 PM 11/5/98 -0000, Michael Kay wrote: >Switching thrreads, I am a little surprised by Tim's remarks on word >proximity versus character proximity. Confining our attention to European >languages (as most search engines do), word proximity searching is a common >feature of the high-end search engines, whereas character proximity is >hardly found outside basic desktop tools like grep. What I said was: 1. I have not seen any research which demonstrates that word proximity achieves better results than character proximity based on any well-known IR metric. 2. Doing word proximity at all is a *very* hard problem in the languages used by a large majority of the world's population. >Apart from anything >else, once you've done the word normalisation (normalising different >linguistic forms or spellings of the same word), character proximity is >meaningless. 
In the older boolean engines word proximity is used rather >mechanistically, in the newer engines it is used more subtly as part of a >statistical or linguistic approach to relevance ranking If you go poking around either in the SIGIR world (that would be the Association for Computing Machinery's Special Interest Group on Information Retrieval) or in the actual commercial retrieval engine world, you find a distressing lack of technology progress. Yes, with modern engines, precision & recall are measurably better than they were in 1978. But 10 times as good? Hah! Twice as good? Maybe, for certain restricted application domains. Given all this, I'm less than impressed about the subtle techniques of modern engines. On top of which, most of the techniques used in the "advanced" engines are basically Anglocentric and fall apart once you get outside the English-speaking world. > but either way it >is an established feature of the scene, and it is not there on whim: the >search algorithms used are based on extensive research and benchmarking of >relevance and recall scores. Yeah, well, it's *not* an established feature of the scene in Asia. Maybe it's just an irrational prejudice, but I'm not all that interested in computing techniques that are not usable by a large majority of the world's population. And once again, I challenge the assertion that, for all these clever heuristics, real-world retrieval software is really much better than it was 20 years ago. >An interesting comparison of web search engines is at >http://www.netstrider.com/search/features.html ; this asserts that all the >well-known web search engines other than Lycos use word proximity matching. And we know what wonderful results they produce (that's in English; for real joy, go try a tricky in German - even European languages sometimes leave out the spaces between the words - and see what happens). 
-Tim PS: Given my grouchy tone, I should say that I'm dazzled at the inventiveness, deep thought, and creativity that have been invested in the IR field in recent decades. The fact the results are so underwhelming is evidence of how hard the problems are... the real lesson is that we should marvel at the language-processing apparatus we carry around between our ears. -T xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From avirr at LanMinds.Com Thu Nov 5 18:04:03 1998 From: avirr at LanMinds.Com (Avi Rappoport) Date: Mon Jun 7 17:06:16 2004 Subject: html, xml In-Reply-To: <008201be08d6$b368dc90$4ba2bd9d@pcjhb.software-ag.de> Message-ID: > Kasetty, Ramesh >I have knowledge of HTML and trying to learn XML. Can > anyone tell me the > Kasetty, Ramesh >difference between HTML and XML and where XML can used. > > > HTML uses a fixed set of tags, to specify display properties for those > things > enclosed in the tags. XML allows the definition of tags that enable the > specification of semantic properties. > In my opinion XML offers the great benefit of being more processable by > machines than HTML. That especially holds for retrieval. Who wants to know > whether a certain document contains '1234' within

-Tags ? But a
> question like 'which document contains 1234 as an Employee-Number'
> makes sense.

While I'm obsessed with search and XML, I think that's not going to be the short-term gain with XML.

XML is not a set of tags like HTML; it's a set of simple rules for defining tags for your own content. This lets you use XML files for data interchange. Eventually, you'll be able to post them to the Web with associated style sheets and people will view them (but not until 5.0 browsers come out).

There are major advantages to using XML files for data storage and especially for data interchange. XML formats are basically self-documenting and are meant to be both human and machine-readable. That means that you will be able to read the file in 10 or 20 years, unlike most other data structures. You can read and write a valid XML file from any XML-generating application, so you aren't locked into a single program with a proprietary file format. It's based on Unicode, so it's not limited to Western languages. While XML is not very efficient for database access, database programs can read and write XML files very easily.

For more information, see , the FAQs at , and the news page at .

Hope that helps!

Avi
________________________________________________________________
Avi Rappoport, Web Site Search Tools Maven: Guide to Site Indexing and Local Search Engines:

xml-dev: A list for W3C XML Developers.

From jborden at mediaone.net Thu Nov 5 18:21:03 1998
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:06:16 2004
Subject: XML Search Engine
In-Reply-To: <3.0.32.19981105094725.009209b0@pop.intergate.bc.ca>
Message-ID: <005901be08e8$e1c63210$d3228018@jabr.ne.mediaone.net>

As you say, word/character proximity searching is not that interesting, and if this is desired, XML doesn't have much to add to the current equation. On the other hand, grove-based proximity search techniques have also been used since the 1970s, when this was called a "semantic network". The advantage is that it is language independent. To date, this hasn't been terribly useful with HTML, as not many people care about indexing tags, for example. This is where XML has lots to offer and where efforts ought to be, and are being, directed (IMHO).

Jonathan Borden
JABR Technology
http://jabr.ne.mediaone.net

Tim Bray wrote:
>
> At 12:27 PM 11/5/98 -0000, Michael Kay wrote:
> >Switching threads, I am a little surprised by Tim's remarks on word
> >proximity versus character proximity. Confining our attention to European
> >languages (as most search engines do), word proximity searching is a common
> >feature of the high-end search engines, whereas character proximity is
> >hardly found outside basic desktop tools like grep.
>
> What I said was:
> 1. I have not seen any research which demonstrates that word proximity
>    achieves better results than character proximity based on any
>    well-known IR metric.
> 2. Doing word proximity at all is a *very* hard problem in the languages
>    used by a large majority of the world's population.
>
> >Apart from anything
> >else, once you've done the word normalisation (normalising different
> >linguistic forms or spellings of the same word), character proximity is
> >meaningless. In the older boolean engines word proximity is used rather
> >mechanistically, in the newer engines it is used more subtly as part of a
> >statistical or linguistic approach to relevance ranking
>
> If you go poking around either in the SIGIR world (that would be the
> Association for Computing Machinery's Special Interest Group on
> Information Retrieval) or in the actual commercial retrieval engine
> world, you find a distressing lack of technology progress. Yes, with
> modern engines, precision & recall are measurably better than they
> were in 1978. But 10 times as good? Hah! Twice as good? Maybe,
> for certain restricted application domains. Given all this, I'm
> less than impressed about the subtle techniques of modern engines.
> On top of which, most of the techniques used in the "advanced" engines
> are basically Anglocentric and fall apart once you get outside the
> English-speaking world.
>
> > but either way it
> >is an established feature of the scene, and it is not there on whim: the
> >search algorithms used are based on extensive research and benchmarking of
> >relevance and recall scores.
>
> Yeah, well, it's *not* an established feature of the scene in Asia. Maybe
> it's just an irrational prejudice, but I'm not all that interested in
> computing techniques that are not usable by a large majority of the
> world's population. And once again, I challenge the assertion that,
> for all these clever heuristics, real-world retrieval software is
> really much better than it was 20 years ago.
>
> >An interesting comparison of web search engines is at
> >http://www.netstrider.com/search/features.html ; this asserts that all the
> >well-known web search engines other than Lycos use word proximity matching.
>
> And we know what wonderful results they produce (that's in English; for
> real joy, go try a tricky in German - even European languages sometimes
> leave out the spaces between the words - and see what happens). -Tim
>
> PS: Given my grouchy tone, I should say that I'm dazzled at the
> inventiveness, deep thought, and creativity that have been invested
> in the IR field in recent decades. The fact the results are so
> underwhelming is evidence of how hard the problems are... the real
> lesson is that we should marvel at the language-processing apparatus
> we carry around between our ears. -T
>

From david at megginson.com Thu Nov 5 18:42:53 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:16 2004
Subject: Creation of XML documents
In-Reply-To: <3641A21B.E5E4513D@simberg.com>
References: <3641A21B.E5E4513D@simberg.com>
Message-ID: <13889.61694.173227.630556@localhost.localdomain>

Kurt Helenelund writes:

> I would like to ask if there's anyone that has the opposite
> problem, i.e. for applications to create XML documents
> on-the-fly. Of course the developer could 'hand code' the XML
> structures, which is error prone and boring. I am looking for
> something (API, lib) so that we could avoid this.

As far as this goes, try using SAX backwards. There are SAX applications that produce normalised XML (James Clark's example at http://www.jclark.com/xml/ might be helpful), and you can just feed canned SAX events to one of them.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
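
The "SAX backwards" idea above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard `xml.sax.saxutils.XMLGenerator` as the event consumer (not a tool named in this thread), and the `report`/`entry` element names and the `build_report` helper are made up for the example.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def build_report(rows):
    """Feed hand-made SAX events to a writer and return the XML text.

    `rows` is a list of (agency, note) pairs; both names are hypothetical.
    """
    out = io.StringIO()
    writer = XMLGenerator(out, encoding="utf-8")
    writer.startDocument()
    writer.startElement("report", AttributesImpl({}))
    for agency, note in rows:
        writer.startElement("entry", AttributesImpl({"agency": agency}))
        # The writer escapes '&' and '<' in character data for us.
        writer.characters(note)
        writer.endElement("entry")
    writer.endElement("report")
    writer.endDocument()
    return out.getvalue()


xml_text = build_report([("tax", "income < 10 & rising")])
print(xml_text)
```

The application never concatenates markup by hand; it only issues start/end/characters events, so escaping and tag balancing are left to the writer.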

From jborden at mediaone.net Thu Nov 5 18:49:58 1998
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:06:16 2004
Subject: Creation of XML documents
In-Reply-To: <3641E353.4256F5FF@locke.ccil.org>
Message-ID: <005a01be08ec$c15cc8a0$d3228018@jabr.ne.mediaone.net>

John Cowan wrote:
>
> Alas, the DOM does not provide a standardized way for objects to
> "save themselves".
>

This is an unfortunate oversight. DOM implementors thus each create a non-standard mechanism to build a grove.

Jonathan Borden
JABR

From fernando at pix.com.br Thu Nov 5 18:52:39 1998
From: fernando at pix.com.br (Fernando Cabral)
Date: Mon Jun 7 17:06:16 2004
Subject: XML Search Engine
References: <005901be08e8$e1c63210$d3228018@jabr.ne.mediaone.net>
Message-ID: <3641ACC1.42FE15C8@pix.com.br>

Borden, Jonathan wrote:

> As you say Word/Character proximity searching is not that interesting, and
> if this is desired, XML doesn't have much to add to the current equation

I beg to disagree twice. a) Proximity search is very important for anyone searching any reasonably-sized database with a variety of texts; b) XML can help a lot, even though most non-XML-capable search engines can already offer proximity searching.

We have been able to solve quite a number of problems using proximity. If we did not have it we would still be able to solve those problems, albeit spending much more effort, time, intelligence and CPU cycles.

- fernando

--
Fernando Cabral
Padrao iX Sistemas Abertos
mailto:fernando@pix.com.br http://www.pix.com.br
mailto:Pix@Pix.com.br
Fone: +55 61 321-2433 Fax: +55 61 225-3082
15° 45' 04.9" S 47° 49' 58.6" W
19° 37' 57.0" S 45° 17' 13.6" W

From david at megginson.com Thu Nov 5 18:55:07 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:16 2004
Subject: XML Search Engine
In-Reply-To: <3.0.32.19981105094725.009209b0@pop.intergate.bc.ca>
References: <3.0.32.19981105094725.009209b0@pop.intergate.bc.ca>
Message-ID: <13889.62092.676344.644070@localhost.localdomain>

Tim Bray writes:

> What I said was:
> 1. I have not seen any research which demonstrates that word proximity
>    achieves better results than character proximity based on any
>    well-known IR metric.
> 2. Doing word proximity at all is a *very* hard problem in the languages
>    used by a large majority of the world's population.

I think that there might be a disconnect here. What we're talking about is minimal-semantic-unit proximity -- for some languages/contexts, the minimal semantic unit will always be a single grapheme, and for others, it will be a cluster of one or more graphemes.

This type of clustering is critical for search engines, which often (usually?) provide inverse indexes only for minimal semantic units, not for all graphemes. The argument, then, is that proximity testing should be done by counting the units that were indexed, which may or may not be single graphemes.
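
A rough sketch of "count the units that were indexed" follows. The names `tokenize`, `positions` and `near` are hypothetical, and the `"char"` mode is only a crude stand-in for single-grapheme units: real grapheme clustering and real word segmentation are much harder than this.

```python
import re


def tokenize(text, unit="word"):
    """Split text into the units an inverted index would store.

    unit="word": runs of word characters, lowercased.
    unit="char": every non-space character, one unit each.
    """
    if unit == "word":
        return re.findall(r"\w+", text.lower())
    return [ch for ch in text.lower() if not ch.isspace()]


def positions(units):
    """Build a tiny inverted index: unit -> list of positions."""
    index = {}
    for pos, unit in enumerate(units):
        index.setdefault(unit, []).append(pos)
    return index


def near(index, a, b, max_gap):
    """True if some occurrence of a lies within max_gap indexed units of b."""
    return any(abs(i - j) <= max_gap
               for i in index.get(a, [])
               for j in index.get(b, []))


words = positions(tokenize("This is a little green apple. Big deal."))
print(near(words, "big", "apple", 1))

chars = positions(tokenize("東京都に住む", unit="char"))
print(near(chars, "東", "都", 2))
```

The proximity test itself never changes; only the tokenizer decides whether a "unit" is a word or a single character, which is the point of the argument above.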

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From bobp at lightlink.com Thu Nov 5 19:21:43 1998
From: bobp at lightlink.com (Bob Parks)
Date: Mon Jun 7 17:06:16 2004
Subject: html, xml
In-Reply-To:
References: <008201be08d6$b368dc90$4ba2bd9d@pcjhb.software-ag.de>
Message-ID:

Avi,

I have a large reference work - The Wordsmyth English Dictionary-Thesaurus (WEDT) - on the web at http://www.wordsmyth.net in conjunction with the University of Chicago's ARTFL Project. I am wondering if the selection and definition of XML tags can have a significant impact on the search possibilities.

It is clear that XML gives us a chance to mark semantic information. (And a dictionary will give us a chance to create powerful text parsers that can mark much more information than part of speech.) But it isn't so clear to me how to define tags that will maximize the flexibility of search routines.

Any thoughts?

Regards,

Bob Parks
Associate Professor
Elmira College

From jborden at mediaone.net Thu Nov 5 19:25:36 1998
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:06:16 2004
Subject: XML Search Engine
Message-ID: <005c01be08f1$c8379920$d3228018@jabr.ne.mediaone.net>

Let me rephrase that: word/character proximity searching has been done for decades and its utility is well known. The last time I addressed this in detail was when I spent some time on the Hearsay project, an early speech recognition system, in the early 1980s. The problem of German or oriental words/phonemes/sentences etc. is fairly similar (perhaps identical) to the problem of English-language speakers who slur their words together. Speech processing programs have made great recent strides, yet this has been a difficult nut to crack.

There are many people who believe that further refinements of these well-known techniques are unlikely to yield dramatic improvements. Instead there are avenues of attack which operate at higher levels on the information food chain, namely at the word phrase, syntactic and semantic levels. These levels are well represented as grove structures, and XML/SGML search techniques will likely yield significant results. Natural language processing algorithms naturally express their output in groves, and intelligent search is at this crossroad.

For example, suppose I am searching for big apples:

"This is a little green apple. Big deal."

will "Big near apple" match?
how about "Big applied to apple"

Jonathan Borden
JABR
http://jabr.ne.mediaone.net

> Borden, Jonathan wrote:
>
> > As you say Word/Character proximity searching is not that interesting, and
> > if this is desired, XML doesn't have much to add to the current equation
>
> I beg to disagree twice. a) Proximity search is very important for anyone
> searching any reasonably-sized database with a variety of texts; b) XML can
> help a lot, even though most non-XML-capable search engines can already
> offer proximity searching.
>
> We have been able to solve quite a number of problems using proximity. If we
> did not have it we would still be able to solve those problems, albeit
> spending much more effort, time, intelligence and CPU cycles.
>
> - fernando

From tyler at infinet.com Thu Nov 5 19:27:03 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:06:16 2004
Subject: Creation of XML documents
References: <3641A21B.E5E4513D@simberg.com>
Message-ID: <3641FBC2.E8D91C6F@infinet.com>

Kurt Helenelund wrote:

> I am working on a project where we will use XML to exchange information
> between applications in different government agencies. We want to implement
> both on-line access between applications and asynchronous store & forward
> type of mechanisms.
>
> I understand that there are 'lots' of good XML parsers (we have tried
> some) out there and that SAX and DOM are the preferred ways for applications
> to 'read' XML structures. I would like to ask if there's anyone that has
> the opposite problem, i.e. for applications to create XML documents on-the-fly.
> Of course the developer could 'hand code' the XML structures, which is
> error prone and boring. I am looking for something (API, lib) so that we
> could avoid this.
>
> I would like to have a 'library' to which the application developer
> could say 'using this DTD please instantiate an XML document and help me
> to fill it in'.
>
> Any solutions?

Build a DOM tree programmatically and write out the contents. Most DOM packages support this feature or some form of DOM Writer.

Tyler

From cowan at locke.ccil.org Thu Nov 5 19:31:24 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:16 2004
Subject: Full Disclosure: What The XML Processor Must Tell The XML Application
Message-ID: <3641FCE4.61D46D15@locke.ccil.org>

(Draft 0.1)

This is a report gleaned from the XML recommendation, showing what an XML processor (validating or non-validating) must report to the application that invokes it in order to claim conformance to the XML recommendation. Numbers in parentheses refer to clauses.

1. An XML processor must always provide all characters in a document that are not part of markup to the application. (2.10)

2. A validating XML processor must inform the application which non-markup characters are whitespace appearing within element content. (2.10)

3. An XML processor must pass the single character &#xA; in place of &#xD; or &#xD;&#xA; appearing in its input. (2.11)

4. An XML processor must normalize the value of attributes according to the rules in clause 3.3 before passing them to the application. This implies that the values of attributes after normalization are passed to the application. (3.3)

5. An XML processor must pass the identifiers of declared unparsed entities and their associated identifiers to the application. (4, 4.7)

6. When the name of an unparsed entity appears as the explicit or default value of an ENTITY or ENTITIES attribute, an XML processor must provide the names, system identifiers, and (if present) public identifiers of both the entity and its notation to the application. (4.6, 4.7)

7. An XML processor must report well-formedness errors in the document entity and in any other entities that it reads. (5.1)

8. A validating XML processor must report violations of the constraints expressed in the DTD, and failures to fulfill validity constraints. All entities included directly or indirectly by the document entity must be examined. (5.1) [The recommendation is self-contradictory on whether this behavior is required always or only at user option: see 5.1 vs. the definition of "validity constraint".]

9. An XML processor must pass processing instructions to the application. (2.6)

10. An XML processor (necessarily a non-validating one) that does not include the replacement text of an external parsed entity in place of an entity reference must notify the application that it recognized but did not read the entity. (4.4.3) [SAX does not provide for this]

11. A validating XML processor must include the replacement text of an entity in place of an entity reference. (5.2)

12. A validating XML processor must supply the default value of attributes declared in the DTD for a given element type but not appearing in the element's start tag. (5.2)

Have I overlooked anything?

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
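
A few of the items above (line-end handling, attribute-value normalization, and processing instructions) can be observed with a small SAX handler. This is a sketch only, assuming Python's standard `xml.sax` module backed by expat, which is a non-validating processor, so the validating-only items are not exercised; the `Recorder` class and the sample document are made up.

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Collect what the processor reports to the application."""

    def __init__(self):
        super().__init__()
        self.text = []    # character data, possibly delivered in chunks
        self.attrs = {}   # element name -> attribute dict
        self.pis = []     # (target, data) pairs

    def startElement(self, name, attrs):
        self.attrs[name] = dict(attrs.items())

    def characters(self, content):
        self.text.append(content)

    def processingInstruction(self, target, data):
        self.pis.append((target, data))


# The attribute value contains a literal tab; the text contains CR LF.
doc = b'<?xml version="1.0"?>\n<doc note="a\tb"><?show me?>one\r\ntwo</doc>'

rec = Recorder()
xml.sax.parseString(doc, rec)

print(repr("".join(rec.text)))   # CR LF arrives as a single line feed
print(rec.attrs["doc"])          # the tab in the attribute arrives as a space
print(rec.pis)                   # the processing instruction is passed through
```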

From ricko at allette.com.au Thu Nov 5 19:43:22 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:06:16 2004
Subject: Unicode, xml:lang, and variant glyphs
In-Reply-To: <3641DD79.2DD20660@locke.ccil.org>
Message-ID: <002501be08f4$bbedf850$51ea87cb@NT.JELLIFFE.COM.AU>

Rick:
> > Are you saying that characters carry information, and never
> > glyphs (or character + locale + markup)?

> From: John Cowan
> No, I am talking about the CJK case specifically. A unified font
> may look ugly, and certainly shouldn't be used for fine typography,
> but a language indicator is neither necessary nor sufficient to
> solve this problem.

But I am not thinking "What is sufficient?", I am thinking "Is something being lost here?" and "how nice is the thing being lost?"

If an XML document arrives with an encoding in the XML header of SJIS it will have been created on a Japanese editor: in the absence of any information to the contrary, shouldn't it be displayed using Japanese fonts? And if a document arrives in Big Five, shouldn't it be displayed, in the absence of anything else, using a (presumably traditional but this is not clear cut now) Chinese font?

In XML terms, if there is no xml:lang in effect, and the sender wrote using a different script variant to the receiver, mightn't heuristic defaulting of xml:lang based on originating character set (and, for example, originating country in the URL) be the desired behaviour for some? And if it is desired in that circumstance, wouldn't it be useful to preserve that information when cutting-and-pasting documents or transcluding portions? (The XML encoding PI presumably will not survive in the grove of every document, so I am not sure it could reliably be available in the case of transcluded data.)

I am very loath to say "everything that you need to know arrives marked-up explicitly" in this particular case. For example, if a Japanese document arrives in XML, and it was originally encoded in shift-JIS, then we should have a suspicion that when there is a backslash character, a Yen glyph might be intended. I know that it would be better to encode the document properly first, but it seems that a policy of choosing a variant font based on the sending encoding (or for that matter, the country in the URL) is just as legitimate a default policy as just using the current-locale's variant font at the receiver.

> My claim is confined
> to this: that for plain-text purposes, Han unification does not
> obscure anything essential.

As far as the plain-text distinction, were laypeople actually tested for this, or is it the conjecture of scholars who already know all the variants and their connections (no disrespect intended)? As is the case with fraktur for English readers, if you have not been taught the characters you cannot read them, and if you have been taught them you cannot be tested for whether you can read them.

The plain-text criterion may be good for character-set people. But there is no reason to assume that preserving minimal readability is a criterion good enough for documents. I guess this is the PDF versus SGML debate writ small; should fidelity to the originating publication be the policy or should rendering be determined by the setup of the receiver. And maybe it is a content-related thing too: the closer text is to literature or names, the greater the chance that the sender intends a particular glyph variant for the character they chose.

Rick Jelliffe

From M.H.Kay at eng.icl.co.uk Thu Nov 5 20:08:47 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:06:16 2004
Subject: Full Disclosure: What The XML Processor Must Tell The XML Application
Message-ID: <007501be08f7$7b90e3f0$7008e391@bra01wmhkay.bra01.icl.co.uk>

>Have I overlooked anything?

The interesting thing is that if this list is complete, there are lots of things the spec doesn't say. Most notably it doesn't say that the processor must tell the application anything at all about element tags! It also doesn't mention anything that the processor mustn't tell the application, though some of the statements seem to hint that there are things the application oughtn't to know.

There are some things it doesn't say because they're obvious to everyone, e.g. that it must retain the order of the characters in the document and not add any extra ones.

Another thing it doesn't say is when the processor must report an error: in particular, it would be nice if it insisted that all data passed to the application before reporting an error must be data that could come from a valid document. (I've hit this problem recently: to avoid my application crashing on bad XML, I have to carry out checks that the parser will carry out anyway, but later.)

Mike Kay

From db at Eng.Sun.COM Thu Nov 5 20:16:13 1998
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:06:16 2004
Subject: Creation of XML documents
References: <3641A21B.E5E4513D@simberg.com>
Message-ID: <3642064F.855A5AB5@eng.sun.com>

Kurt Helenelund wrote:
>
> I would like to ask if there's anyone
> that has the opposite problem, i.e. for applications to create XML
> documents on-the-fly.

Like some other packages, Sun's supports this. (In fact, I've been a bit surprised how many people call this out as a favorite feature. Even more than like its speed!)

http://java.sun.com/jdc/earlyAccess/xml

Our model has thus far been that a complete XML package must support a basic "round trip" of data -- though excluding the DTD info. So it's easy to just use XmlDocument.write (writer) to emit XML text. Yes, this transparently handles stuff like "&" and "<" in text.

- Dave

From fernando at pix.com.br Thu Nov 5 20:16:39 1998
From: fernando at pix.com.br (Fernando Cabral)
Date: Mon Jun 7 17:06:17 2004
Subject: XML Search Engine
References: <005c01be08f1$c8379920$d3228018@jabr.ne.mediaone.net>
Message-ID: <3641C09A.746EE929@pix.com.br>

Borden, Jonathan wrote:

> For example, suppose I am searching for big apples:
>
> "This is a little green apple. Big deal."
>
> will "Big near apple" match?
> how about "Big applied to apple"

This will not be a problem with any "decent" text retrieval engine because:

a) Proximity search can be performed either "ordered" or "non-ordered". This is quite powerful because it allows you to search for "big near potato" in the sentence "This is a small potato, big brother" either to find both "potato, big" as well as "big, potato" or only one of the two. Some search engines, like STAIRS (the grandfather of all text-retrieval engines) and BRS, have two operators like "near" (or "prox") and "ADJacent", the first one being unordered, the second one being ordered.

b) Usually search engines know what phrases and paragraphs are. I don't think proximity should go beyond a period or any other punctuation that ends a sentence. If you want to search in larger units, like a paragraph, then you could always define something like "apple SAME PARAGRAPH big" or "apple SAME SENTENCE big", both of which extend the idea of "nearness", providing a more logical view of the terms.

c) Finally, growing from the very close vicinity (near/adjacent) to a little further (same sentence/same paragraph) you can go to the whole "universe" with AND, OR, XOR, etc. What this means is that you can have very good control not only over which words you want, but also where they are, how far apart they can be, which one comes first...

d) XML allows you to use all the above operators, adding a very useful feature: tag-qualification.

- fernando

--
Fernando Cabral
Padrao iX Sistemas Abertos
mailto:fernando@pix.com.br http://www.pix.com.br
mailto:Pix@Pix.com.br
Fone: +55 61 321-2433 Fax: +55 61 225-3082
15° 45' 04.9" S 47° 49' 58.6" W
19° 37' 57.0" S 45° 17' 13.6" W
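
The operators in points a) and b) can be sketched in a few lines. This is a toy illustration, not how STAIRS or BRS work internally: the function names `near`, `adj` and `same_sentence` are hypothetical, and sentence splitting on `.`, `!` and `?` is a deliberate simplification.

```python
import re


def sentences(text):
    """Split on sentence-ending punctuation; lowercase words per sentence."""
    parts = re.split(r"[.!?]+", text)
    return [re.findall(r"\w+", p.lower()) for p in parts if p.strip()]


def _pairs(words, a, b):
    """All (position of a, position of b) combinations in a word list."""
    return [(i, j)
            for i, w in enumerate(words) if w == a
            for j, v in enumerate(words) if v == b]


def near(words, a, b, gap=1):
    """Unordered: a and b within `gap` words of each other, in either order."""
    return any(abs(i - j) <= gap for i, j in _pairs(words, a, b))


def adj(words, a, b, gap=1):
    """Ordered: a comes first and b follows within `gap` words."""
    return any(0 < j - i <= gap for i, j in _pairs(words, a, b))


def same_sentence(text, a, b):
    """True if a and b occur together inside one sentence."""
    return any(a in s and b in s for s in sentences(text))


potato = [w for s in sentences("This is a small potato, big brother.") for w in s]
print(near(potato, "big", "potato"))   # unordered match
print(adj(potato, "big", "potato"))    # ordered: big does not precede potato
print(adj(potato, "potato", "big"))    # ordered the other way round

apple_text = "This is a little green apple. Big deal."
apple = [w for s in sentences(apple_text) for w in s]
print(near(apple, "big", "apple"))             # flat word stream ignores the period
print(same_sentence(apple_text, "big", "apple"))
```

On the "little green apple" text, plain `near` over the flat word stream matches across the period, while the sentence-bounded test does not, which is the distinction point b) draws.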

From lauren at sqwest.bc.ca Thu Nov 5 20:20:40 1998
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:06:17 2004
Subject: Creation of XML documents
In-Reply-To: <005a01be08ec$c15cc8a0$d3228018@jabr.ne.mediaone.net>
References: <3641E353.4256F5FF@locke.ccil.org>
Message-ID: <199811052011.MAA05387@sqwest.bc.ca>

At 05/11/1998 10:47 AM , Borden, Jonathan wrote:
>John Cowan wrote:
>>
>> Alas, the DOM does not provide a standardized way for objects to
>> "save themselves".
>
> This is an unfortunate oversight. DOM implementors thus each create a
>non-standard mechanism to build a grove.

I prefer to call it a "not yet" than an oversight; so far only DOM Level 1 is available. There is lots of work yet to do, including standardised ways to read documents, serialise documents, validate documents, .... Ideas on prioritising all of this are welcome, but I'd suggest you send them to the public DOM mailing list (www-dom@w3.org; to subscribe, send email to www-dom-request@w3.org with the subject "subscribe").

cheers, Lauren

From cowan at locke.ccil.org Thu Nov 5 21:12:34 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:17 2004
Subject: Unicode, xml:lang, and variant glyphs
References: <002501be08f4$bbedf850$51ea87cb@NT.JELLIFFE.COM.AU>
Message-ID: <364214F7.798593A4@locke.ccil.org>

Rick Jelliffe wrote:

> If an XML document arrives with an encoding in the XML header of SJIS it will
> have been created on a Japanese editor: in the absence of any information to
> the contrary, shouldn't it be displayed using Japanese fonts?

Perhaps as a heuristic. But I find it very hard to swallow that the charset encoding of a document is part of its semantics. Would you assume that, in the absence of other evidence, a document in ASCII was in en-US? And if so, what assumption would you make about an 8859-1 document?

> And if a
> document arrives in Big Five, shouldn't it be displayed, in the absence of
> anything else, using a (presumably traditional but this is not clear cut
> now) Chinese font?

Note: Contrary to a common assumption, Unicode does *not* unify simplified hanzi with their traditional counterparts.

> I am very loath to say "everything that you need to know arrives marked-up
> explicitly" in this particular case. For example, if a Japanese document
> arrives in XML, and it was originally encoded in shift-JIS, then we should
> have a suspicion that when there is a backslash character, a Yen glyph might
> be intended.

If it is really encoded in SJIS, then an \x5C byte represents a yen character, not a backslash, and had better be treated as such by the application. Of course, since the document character set is always 10646, a \ character reference means a backslash, not a yen symbol. Ditto for KSC with a won symbol (U+20A9).

> As far as the plain-text distinction, were laypeople actually tested for
> this, or is it the conjecture of scholars who already know all the variants
> and their connections (no disrespect intended)?

I don't know, as I am not part of the Ideographic Rapporteur Group and find their documents very hard to follow.

> The plain-text criterion may be good for character-set people. But there is
> no reason to assume that preserving minimal readability is a criterion good
> enough for documents.

No doubt it is not. The point is that anything that is not a plain text distinction should be encoded using our favorite markup mechanism: XML.

> I guess this is the PDF versus SGML debate writ small;
> should fidelity to the originating publication be the policy or should
> rendering be determined by the setup of the receiver.

In the end the receiver always controls: a variant PDF renderer could exist, although there's no reason for it to. Fidelity to the originating publication is a reasonable goal, but requires reasonable cooperation.

> And maybe it is a
> content-related thing too: the closer text is to literature or names, the
> greater the chance that the sender intends a particular glyph variant for
> the character they chose.

Very true, which is why I am interested to hear about methods for explicitly encoding variants.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tony.peters at qr.com.au Thu Nov 5 22:22:27 1998 From: tony.peters at qr.com.au (Tony Peters) Date: Mon Jun 7 17:06:17 2004 Subject: unsubscribe Message-ID: unsubscribe xml-dev xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sharris at primus.com Thu Nov 5 22:34:37 1998 From: sharris at primus.com (Steve Harris) Date: Mon Jun 7 17:06:17 2004 Subject: Namespaces - defaulting question (re:WD-xml-names19980916) Message-ID: <293509DEBE37D211BA5100805F9F92AE5ED3A2@exchange1.primus.com> Reading the Namespaces specification, I've found contradictory examples of namespace defaulting. Any input in determining the proper interpretation would be greatly appreciated. The first example is as follows: Here, the inline comment suggests that the element 'x' now lies within the 'edi' namespace. Later in the specification, we find this example (slightly reformatted for clarity): Layman, A 33B Check Status 1997-05-24T07:55:00+1 The specification authors break this instance apart into the 'Expanded Element Types and Attribute Names' meta-markup to illustrate proper interpretation. There, the element 'RESERVATION' is not associated with the namespace prefix 'HTML'. This contradicts the first example's binding. 
Section 5.2 reads, "A _default namespace_ is considered to apply to the element where it is declared (if that element has no namespace prefix), and to all elements with no prefix within the content of that element." Elsewhere, Section 2 defines the term 'default namespace' as: [Definition]: If the colon and NCName are not provided, then the associated _namespace name_ is that of a *default namespace* in the scope of the element to which the declaration is attached. By these two definitions, it sounds like the first example above is incorrect. The 'edi' prefix should not bind to element 'x' since the declaration is not a default namespace declaration. Please confirm or deny this assertion. I apologize if this topic has been covered before. Scouring the archives didn't turn up anything that directly answers the question. Steven E. Harris Software Engineer PRIMUS 1601 Fifth Avenue, Suite 1900 Seattle, Washington 98101 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Thu Nov 5 22:42:36 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:17 2004 Subject: Namespaces - defaulting question (re:WD-xml-names19980916) References: <293509DEBE37D211BA5100805F9F92AE5ED3A2@exchange1.primus.com> Message-ID: <36422A22.77C65887@locke.ccil.org> Steve Harris wrote: > Reading the Namespaces specification, I've found contradictory examples > of namespace defaulting. Any input in determining the proper > interpretation would be greatly appreciated. Your problem is that you are confusing *scope* with *defaulting*. 
The scope of a namespace declaration is the whole of the element in which it is contained, including the element type. So the comment below is true: > The first example is as follows: > > > > > Here, the inline comment suggests that the element 'x' now lies within > the 'edi' namespace. So it does, in the sense that the scope of the "edi:' prefix is the element "x". But the name "x" is not defaulted to the edi namespace; it would have to be "edi:x" for that, or the edi namespace would have to also be the default namespace through the presence of an "xmlns='http://ecommerce.org/schema'" attribute. Instead the namespace of "x" is the current default namespace if any, or no namespace if there is no current default. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Nov 5 22:45:47 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:17 2004 Subject: Namespaces - defaulting question (re:WD-xml-names19980916) Message-ID: <3.0.32.19981105144302.00b62510@pop.intergate.bc.ca> At 02:32 PM 11/5/98 -0800, Steve Harris wrote: >Reading the Namespaces specification, I've found contradictory examples >of namespace defaulting. Any input in determining the proper >interpretation would be greatly appreciated. > The first example is as follows: > > > > Good catch. This comment is erroneous - the namespace only applies to things that are prefixed "edi", which the "x" element clearly isn't. 
The comment should say that the binding of the prefix "edi" to the URI "http://ecommerce.org/schema" applies to the "x" element and contents. To be fixed. -Tim

From cbullard at hiwaay.net Fri Nov 6 01:55:21 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:06:17 2004
Subject: XML Search Engine
References: <005c01be08f1$c8379920$d3228018@jabr.ne.mediaone.net> <3641C09A.746EE929@pix.com.br>
Message-ID: <36425662.70BB@hiwaay.net>

Fernando Cabral wrote:
> This will not be a problem with any "decent" text retrieval engine

This is an interesting thread and relevant to my current work which is, unfortunately, not in XML, but in relational implementations. However, can someone direct me to any sites which provide source examples of full text retrieval engines?

Please reply to clbullar@ingr.com

Thanks in advance.

Len Bullard
IPS

From tbray at textuality.com Fri Nov 6 02:08:47 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:17 2004
Subject: XML Search Engine
Message-ID: <3.0.32.19981105180610.00b862d0@pop.intergate.bc.ca>

At 07:52 PM 11/5/98 -0600, len bullard wrote:
>This is an interesting thread and relevant to my current work which
>is, unfortunately, not in XML, but in relational implementations.
>However, can someone direct me to any sites which provide source >examples of full text retrieval engines? Such sites will be rather small, due to a little problem in the retrieval business, namely nobody has ever made serious money at it. Five years ago, I would have said the leading vendors were Fulcrum, Verity, PLS, Open Text, and IDI/Basis. Fulcrum barely dodged bankruptcy and ended up being swallowed by PC Docs, a low-end document management company. Verity is the only one still soldiering on, having burned through $30M in venture cap and a large part of their IPO bucks, are actually showing some signs of small amounts of black ink. PLS staggered (mostly) into the arms of AOL. Open Text retreated from search into document management, and bought IDI. Lesson: there's not much juice in that business. XML might cheer things up a bit, you never know. There are any number of decent free search engines you can run with either Apache or NT servers... If you're doing relational search, most relational vendors (Oracle, Informix, etc) have some sort of full-text add-on that usually works OK. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Fri Nov 6 04:16:55 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:06:17 2004 Subject: XML and IE5 beta PR2 Message-ID: <199811060413.UAA23811@boethius.eng.sun.com> [Simon North:] | For anyone who hasn't noticed, the preview 2 release of IE5 was put | on the public servers yesterday (it had been placed there last Friday | but was pulled shortly afterwards). Today, Microsoft appear to have | put updated documentation on the SBN web pages. 
| | The new release doesn't seem to support the style part of XSL, only | the transformation part [...] It's my impression that Microsoft sees XSL simply as a way to do tag transformation and that their strategy for XML display is to use XSL to transform XML tags to HTML tags. This means, of course, that you will not be able to use Microsoft tools that support XSL to do any formatting more complex than what can be expressed using HTML+CSS. I would be delighted to learn that I have gotten the wrong impression about this. If anyone finds out something that contradicts this assessment, such as a public statement of support for formatting objects or Microsoft software that supports formatting objects, please let me know. Jon xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clovett at microsoft.com Fri Nov 6 04:20:39 1998 From: clovett at microsoft.com (Chris Lovett) Date: Mon Jun 7 17:06:17 2004 Subject: IE5 - Retrieving attributes from an internal entity Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C08743F7A@RED-MSG-56> The &om; node is an ENTITYREF node which has a read-only child node called "variable". The "xml" property on the ENTITYREF node gives back "&om;". If you want to get the markup "OM" you have to navigate to the "variable" child, then use the "xml" property from there. -----Original Message----- From: Don Kackman [mailto:DKACKMAN@agchem.com] Sent: Monday, November 02, 1998 1:11 PM To: 'XML Dev' Subject: Retrieving attributes from an internal entity Hello, I'm using Microsoft's XML parser that comes as part of IE 5 beta 1 as a component of an application that will use XML as its document format. 
Since IE5 is still a beta I'm having some trouble determining if certain behaviors are bugs in the current version of their parser or correctly reflect the W3C specification. Namely I'm using an internal entity declaration as follows: OM"> as part of the internal subset of the DTD. I can load the document into MSXML (their parser) and traverse the node tree. When I get to the node where I am referring to the &om; entity I get OM back as the value of that node but I cannot retrieve the targetset attribute. It is my understanding that internal entities should be parsed in place when they are referred to, which should mean that I can treat that node as I would any other. This does not seem to be the case with the MS parser. Is this a limitation of the MS beta parser or am I misunderstanding how entities are used in XML?

Thank you,
Don Kackman
dkackman@agchem.com

xml-dev: A list for W3C XML Developers.
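Don's expectation that an internal entity is parsed in place can be checked against a parser other than the IE5 beta. The sketch below is only an illustration and says nothing about the MSXML object model: it assumes Python's standard xml.etree.ElementTree (built on expat). The names om, variable, OM and targetset come from the two messages; the doc root element and the attribute value "hypothetical" are invented, because the original declaration did not survive in the archive.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction: the <doc> root and the targetset value are
# invented for illustration; om, variable, OM and targetset are from the thread.
DOCUMENT = """<!DOCTYPE doc [
  <!ENTITY om "<variable targetset='hypothetical'>OM</variable>">
]>
<doc>&om;</doc>"""


def expanded_entity_element(xml_text):
    """Parse the document and return the element produced by expanding &om;."""
    root = ET.fromstring(xml_text)
    # expat replaces the internal entity reference with its parsed
    # replacement text, so <variable> appears as an ordinary child of <doc>.
    return root.find("variable")


if __name__ == "__main__":
    element = expanded_entity_element(DOCUMENT)
    print(element.tag, element.text, element.get("targetset"))
```

With this parser the attribute is reachable directly on the expanded element, which is the behaviour Don expected; the ENTITYREF node Chris describes is an extra layer specific to the MSXML tree.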
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jswleung at hkbu.edu.hk Fri Nov 6 04:33:09 1998 From: jswleung at hkbu.edu.hk (Josef Siu-wai Leung) Date: Mon Jun 7 17:06:17 2004 Subject: XML and IE5 beta PR2 In-Reply-To: <199811060413.UAA23811@boethius.eng.sun.com> Message-ID: On Thu, 5 Nov 1998, Jon Bosak wrote: > It's my impression that Microsoft sees XSL simply as a way to do tag > transformation and that their strategy for XML display is to use XSL > to transform XML tags to HTML tags. This means, of course, that you > will not be able to use Microsoft tools that support XSL to do any > formatting more complex than what can be expressed using HTML+CSS. I find the same in this beta version. And I would like to know which is the most XSL compliant browser at the moment. Any advice? Josef xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From avirr at LanMinds.Com Fri Nov 6 05:52:33 1998 From: avirr at LanMinds.Com (Avi Rappoport) Date: Mon Jun 7 17:06:17 2004 Subject: XML Search Engine In-Reply-To: <36425662.70BB@hiwaay.net> References: <005c01be08f1$c8379920$d3228018@jabr.ne.mediaone.net> <3641C09A.746EE929@pix.com.br> Message-ID: At 7:52 PM -0600 11/5/98, len bullard wrote: > This is an interesting thread and relevant to my current work which > is unfortunatly, not in XML, but in relational implementations. 
> However, can someone direct me to any sites which provide source > examples of full text retrieval engines? > > Please reply to clbullar@ingr.com I'm replying publicly because Tim did, and I happen to know of some available source code: ht:Dig SWISH-E SWISH++ WebGlimpse ICE (Perl) BBDBot (Java) More info on these topics on my Web Site Search Tools site: . Avi ________________________________________________________________ Avi Rappoport, Web Site Search Tools Maven: Guide to Site Indexing and Local Search Engines: xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Nov 6 05:57:03 1998 From: david at megginson.com (david@megginson.com) Date: Mon Jun 7 17:06:17 2004 Subject: Full Disclosure: What The XML Processor Must Tell The XML Application In-Reply-To: <3641FCE4.61D46D15@locke.ccil.org> References: <3641FCE4.61D46D15@locke.ccil.org> Message-ID: <13890.36667.687028.56131@localhost.localdomain> John Cowan writes: > 10. An XML processor (necessarily a non-validating one) that does > not include the replacement text of an external parsed entity > in place of an entity reference must notify the application that > it recognized but did not read the entity (4.4.3) [SAX does not > provide for this] SAX offloads this responsibility to the application using the EntityResolver interface. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From north at Synopsys.COM Fri Nov 6 09:03:45 1998 From: north at Synopsys.COM (Simon North) Date: Mon Jun 7 17:06:17 2004 Subject: MS and XSL (was XML and IE5 beta PR2) In-Reply-To: <199811060413.UAA23811@boethius.eng.sun.com> Message-ID: <199811060901.KAA23380@goofy.gr05.synopsys.com> [Jon Bosak:] > It's my impression that Microsoft sees XSL simply as a way to do tag > transformation and that their strategy for XML display is to use XSL > to transform XML tags to HTML tags. This means, of course, that you > will not be able to use Microsoft tools that support XSL to do any > formatting more complex than what can be expressed using HTML+CSS. That is more or less the impression that I took away from my visit to Redmond for the XML Summit in July, with a few qualifications. As I understood it, Microsoft are committed to using XML as an intermediate format ("islands of data"). DHTML is the display tag language of choice. XSL is only of interest in so far as it can be used to transform XML into DHTML, but XSL is not necessarily the language of choice because (and this is borne out by the fact the IE5 appears to be fully DOM compliant) the DOM gives as much, if not more accessibility. It was my understanding that MS were heavily committed to CSS, but they intend to extend it to support 'CSS behaviors', which allows executable code to be attached to elements via style sheets. This would promote re-usable code, would ease the 'coder bottleneck' that currently seems to threaten web page production efforts, and would introduce a form of 'no-install' software (a contender for Java applets?). Simon North. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Fri Nov 6 11:44:39 1998 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:06:18 2004 Subject: CDATA by any other name... (was The raw and the cooked) References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com> <363DCFB4.36A59E21@locke.ccil.org> <363DE9F3.2EDF9979@eng.sun.com> <363F060D.47FB0E2C@technologist.com> Message-ID: <3642E010.DA093709@eng.sun.com> Paul Prescod wrote: > David Brownell wrote: > > To put it differently: is there really room for another API > > to represent XML structure? > > > > I tend to think that DOM, warts and all, is "good enough" for > > most purposes. And for those other purposes, I suspect that > > no standard API could suit. > > I find it odd that we can have "standard APIs" for the full complexity of > relational data, and probably eventually for object database data, but it > is perceived to be impossible to do the same for the parse tree of XML > data. I mean it is just annotated tree structures: it shouldn't be rocket > science (but neither is it trivial). If it's just annotated tree structures, I'd say that's what DOM is for! Or should be, warts and XML Data Model conformance aside. Why would I say no "standard API" would exist for the rest? There are thousands (conservatively!) of data/object models specialized to each application. While a tree (or grove, or graph) would seem isomorphic with any such model, it's not necessarily optimal for any one of them. Similarly, "pure data" is a model many of us have been moving folk away from over the last decade. It's critical for interoperability between system components (e.g. 
over the web, with XML!), but raw data must be joined with methods (or other code) before it's used. Ergo, "objects" instead of APIs to data; classes not structs.

- Dave

From fernando at pix.com.br Fri Nov 6 12:15:56 1998
From: fernando at pix.com.br (Fernando Cabral)
Date: Mon Jun 7 17:06:18 2004
Subject: XML Search Engine
References: <3.0.32.19981105180610.00b862d0@pop.intergate.bc.ca>
Message-ID: <3642A141.5B2D3EE0@pix.com.br>

Tim Bray wrote:
> Such sites will be rather small, due to a little problem in
> the retrieval business, namely nobody has ever made serious
> money at it. Five years ago, I would have said the leading
> vendors were Fulcrum, Verity, PLS, Open Text, and IDI/Basis.
>

Fulcrum has had its moment of glory. The same cannot be said about the others. Nevertheless, you've forgotten a very important name: Dataware Technologies (http://www.dataware.com).

Dataware grew from 0 to several million dollars in a few years selling text-retrieval systems for CDs (about $40MB/year). Then it bought BRS, with more than 2,000 data centers. BRS is still the leading product in text retrieval on a variety of platforms. Just to mention libraries alone, there are more than 200 big, big libraries using BRS.

About two years ago Dataware launched EPMS, now renamed Dataware II Publisher. This is a version of BRS entirely based on SGML (it reads from about 300 different formats, converts and stores as an SGML file, and allows you to do text retrieval both in the traditional way as well as in a more SGML-like way). Of course, it can read and index directly SGML, XML and HTML.

> Lesson: there's not much juice in that business. XML might cheer
> things up a bit, you never know. There are any number of decent
> free search engines you can run with either Apache or NT servers...
>

Talking about money, it is quite clear that IBM made a lot of money selling STAIRS. Now it is musty, but for more than 20 years it reigned undisputed in the mainframe kingdom. So, I think the right conclusion is that in the low-end line of products, where quality/functionality is disputable and price is very low (PC DOCs, Verity...), there is no real money. On the other hand, vendors aiming at the high-end market should not complain.

> If you're doing relational search, most relational vendors (Oracle,
> Informix, etc) have some sort of full-text add-on that usually
> works OK.

My own experience is that relational vendors are completely incapable of providing a good solution for text retrieval. The products are usually very poor on the functionality side and miserable on the performance side. In fact, I'd like to hear from any of you who know of any SIGNIFICANT application using a relational database for text retrieval. By significant I mean:

a) several gigabytes or even terabytes of text;
b) several million documents;
c) at least a few dozen concurrent users;
d) need for complex searches (say 20 or 30 words/parts of words combined with 4 or 5 different operators);
e) response time below one second on a common UNIX or mainframe platform.

If any of you have ever heard about such an application, I am eager to hear about it.

- fernando

--
Fernando Cabral
Padrao iX Sistemas Abertos
mailto:fernando@pix.com.br
http://www.pix.com.br
mailto:Pix@Pix.com.br
Fone: +55 61 321-2433
Fax: +55 61 225-3082
15° 45' 04.9" S 47° 49' 58.6" W
19° 37' 57.0" S 45° 17' 13.6" W

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Nov 6 14:27:33 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:18 2004 Subject: Namespaces - defaulting question (re:WD-xml-names19980916) References: <3.0.32.19981105144302.00b62510@pop.intergate.bc.ca> Message-ID: <3643078E.F8C1B652@locke.ccil.org> Tim Bray wrote: > At 02:32 PM 11/5/98 -0800, Steve Harris wrote: > >Reading the Namespaces specification, I've found contradictory examples > >of namespace defaulting. Any input in determining the proper > >interpretation would be greatly appreciated. > > The first example is as follows: > > > > > > > > > > Good catch. This comment is erroneous - the namespace only applies > to things that are prefixed "edi", which the "x" element clearly > isn't. I think the comment is correct as it stands, though I agree it could be clearer. Element isn't element tag/GI. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
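John's distinction between the scope of a prefix binding and namespace defaulting can be tried out with any namespace-aware parser. A minimal sketch, assuming Python's standard xml.etree.ElementTree, which follows the final Namespaces recommendation rather than the 1998 working draft under discussion. The element x, the prefix edi and the URI come from the thread; the child elements price and note are made up for the example.

```python
import xml.etree.ElementTree as ET

EDI = "http://ecommerce.org/schema"  # URI quoted in the thread

# The edi prefix is in scope for all of <x>, but only prefixed names use it.
# <price> and <note> are hypothetical children added for illustration.
PREFIX_ONLY = f"<x xmlns:edi='{EDI}'><edi:price/><note/></x>"

# Declaring the same URI as the default namespace changes unprefixed names too.
WITH_DEFAULT = f"<x xmlns='{EDI}' xmlns:edi='{EDI}'><edi:price/><note/></x>"


def expanded_names(xml_text):
    """Return the expanded name of every element, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


if __name__ == "__main__":
    print(expanded_names(PREFIX_ONLY))
    print(expanded_names(WITH_DEFAULT))
```

In the first document x and note come back with no namespace even though the edi binding is in scope for them; in the second, where the URI is also declared as the default namespace, all three names carry it.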
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Nov 6 14:55:01 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:18 2004 Subject: Full Disclosure: What The XML Processor Must Tell The XML Application References: <3641FCE4.61D46D15@locke.ccil.org> <13890.36667.687028.56131@localhost.localdomain> Message-ID: <36430DFD.2770E438@locke.ccil.org> david@megginson.com wrote: > > 10. An XML processor (necessarily a non-validating one) that does > > not include the replacement text of an external parsed entity > > in place of an entity reference must notify the application that > > it recognized but did not read the entity (4.4.3) [SAX does not > > provide for this] > > SAX offloads this responsibility to the application using the > EntityResolver interface. But there's no way for the application to tell the parser "Don't read this entity." Returning an InputSource on the null string is similar, but not identical, in effect. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From anette.engel at crpht.lu Fri Nov 6 15:37:26 1998
From: anette.engel at crpht.lu (anette.engel@crpht.lu)
Date: Mon Jun 7 17:06:18 2004
Subject: Storing XML documents in an OODB
Message-ID:

I want to write a Java application which stores and retrieves XML documents in/from an object-oriented database (ObjectStore). I have already had a closer look at Sun's package and at the documentation provided by ObjectStore. Nevertheless I am not yet sure how to store/retrieve XML documents in the OODB. Do I have to write my own document class or can I take advantage of existing implementations? Does anyone know of some interesting web sites which cover similar problems?

Regards
Anette Engel

From pbutkiew at banta-im.com Fri Nov 6 16:10:34 1998
From: pbutkiew at banta-im.com (Paul Butkiewicz)
Date: Mon Jun 7 17:06:18 2004
Subject: Storing XML documents in an OODB
In-Reply-To:
Message-ID: <000701be099f$cafe0e10$6e34ccd0@scriabin.dcm.com>

I don't have any direct experience with Object Store, but for what it's worth I do have a fair amount of background using Versant's OODB with Java. With Versant, non-serialized persistent objects must extend other persistent objects, so you might have some difficulty trying to use third party objects.
Versant provides a post-processor that takes .class files and modifies them to be persistent and you might be able to use that on a third party class library, but some functionality may require vendor- or OMG-specific collection classes or an intimacy with your XML implementation classes themselves (for example, do you want to search on these objects?). And then there are the pesky details: if an internal function in a closed implementation dereferences a now-persistent object expecting it to be garbage collected by Java, now the dereferenced object will hang around the database taking up space.

Anyway, I would recommend finding an open-source implementation that you can tweak to work with Object Store, waiting for Object Store or a third party to release a DOM implementation or whatever, or finding someone otherwise willing to share their source for you to tweak. I probably fall into that last category, but my code isn't quite ready for prime time yet. :)

Paul Butkiewicz
arabbit@earthlink.net

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of anette.engel@crpht.lu
Sent: Friday, November 06, 1998 10:33 AM
To: xml-dev@ic.ac.uk
Subject: Storing XML documents in an OODB

I want to write a Java application which stores and retrieves XML documents in/from an object-oriented database (ObjectStore). I have already had a closer look at Sun's package and at the documentation provided by ObjectStore. Nevertheless I am not yet sure how to store/retrieve XML documents in the OODB. Do I have to write my own document class or can I take advantage of existing implementations? Does anyone know of some interesting web sites which cover similar problems?

Regards
Anette Engel

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tbray at textuality.com Fri Nov 6 16:24:49 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:18 2004
Subject: XML Search Engine
Message-ID: <3.0.32.19981106075715.00b71540@pop.intergate.bc.ca>

At 10:12 AM 11/6/98 +0200, Fernando Cabral wrote:
>Nevertheless, you've forgotten a very
>important name: Dataware Technologies (http://www.dataware.com).
>
>Dataware grew from 0 to several million dollars in a few years
>selling text-retrieval systems for CDs (about $40MB/year). Then
>it bought BRS, with more than 2,000 data centers.

Well, I just went and checked their financials, and while they actually showed a bit of profit I observe that their revenue, for this quarter against the same quarter last year, is declining (down from $5.4m to $4.8m) and the 3/4 results are down from $14.6m to $14.2m. This is the high-growth Internet/Software field? I repeat my claim that this is not a good business to be in.

>About two years ago Dataware launched EPMS, now renamed
>Dataware II Publisher. This is a version of BRS entirely based
>on SGML (it reads from about 300 different formats, converts
>and stores as an SGML file, and allows you to do text retrieval
>both in the traditional way as well as in a more SGML-like way).
> >Of course, it can read and index directly SGML, XML and HTML. Can anyone else substantiate this? Last time I looked at BRS/search, it was a very traditional atomic-document thing; it had some fielded search, but it could only *find* documents. Obviously for XML you need to find elements. It would be great if BRS was really SGML/XML-savvy. >Talking about money, it is quite clear that IBM made a lotof money selling >STAIRS. Now it is musty but for more >than 20 years it reigned undisputed undisputed in the mainframe >kingdom. Here we agree. IBM made some serious money with STAIRS (I suspect mostly by selling mainframes to run it on). >So, I think the right conclusion is that in the low-end line of products >where quality/functionality is disputable and price is very low >(PC DOCs, Verity...) there is no real money. On the other hand, >vendors aiming the high-end market should not complain. I don't the evidence supports this view. >Own experience is that relational vendors are complete uncapableof providing a >good solution for text retrieval. The products >are usually very poor on the funcionality side and miserable on >the performance side. That's really interesting information. This is about the 5th time I've heard this; always anecdotal evidence, but it adds up. -Tim xml-dev: A list for W3C XML Developers. 
From tyler at infinet.com Fri Nov 6 19:21:49 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:06:18 2004
Subject: XML Search Engine
References: <3.0.32.19981105180610.00b862d0@pop.intergate.bc.ca> <3642A141.5B2D3EE0@pix.com.br>
Message-ID: <36434AF0.FA7400B3@infinet.com>

Fernando Cabral wrote:

> Tim Bray wrote:
>
> > Such sites will be rather small, due to a little problem in
> > the retrieval business, namely nobody has ever made serious
> > money at it. Five years ago, I would have said the leading
> > vendors were Fulcrum, Verity, PLS, Open Text, and IDI/Basis.
> >
>
> Fulcrum has had its moment of glory. The same can not be said about the others.
> Nevertheless, you've forgotten a very
> important name: Dataware Technologies (http://www.dataware.com).
>
> Dataware grew from 0 to several million dollars in a few years
> selling text-retrieval systems for CDs (about $40MB/year). Then
> it bought BRS, with more than 2,000 data centers.
>
> BRS is still the leading product in text retrieval on a variety
> of platforms. Just to mention libraries alone, there are more
> than 200 big, big libraries using BRS.
>
> About two years ago Dataware launched EPMS, now renamed
> Dataware II Publisher. This is a version of BRS entirely based
> on SGML (it reads from about 300 different formats, converts
> and stores as an SGML file, and allows you to do text retrieval
> both in the traditional way as well as in a more SGML-like way.
>
> Of course, it can read and index directly SGML, XML and HTML.
>
> > Lesson: there's not much juice in that business. XML might cheer
> > things up a bit, you never know.
> > There are any number of decent
> > free search engines you can run with either Apache or NT servers...
> >
>
> Talking about money, it is quite clear that IBM made a lot of money selling
> STAIRS. Now it is musty but for more
> than 20 years it reigned undisputed in the mainframe
> kingdom.
>
> So, I think the right conclusion is that in the low-end line of products
> where quality/functionality is disputable and price is very low
> (PC DOCs, Verity...) there is no real money. On the other hand,
> vendors aiming at the high-end market should not complain.
>
> > If you're doing relational search, most relational vendors (Oracle,
> > Informix, etc) have some sort of full-text add-on that usually
> > works OK.
>
> My own experience is that relational vendors are completely incapable of providing a
> good solution for text retrieval. The products
> are usually very poor on the functionality side and miserable on
> the performance side.
>
> In fact, I'd like to hear from any of you that know any SIGNIFICANT
> application using any relational database for text-retrieval. By significant
> I mean: a) several giga or even terabytes of text; b) several millions of
> documents; c) at least a few dozens of concurrent users; d) need of
> complex searches (say 20 or 30 words/parts of words combined
> with 4 or 5 different operators); e) response time below one second
> in a common UNIX or mainframe platform.
>
> If any of you have ever heard about such an application, I am eager
> to hear about it.

I have heard that CONText from Oracle was pretty good. I am not sure about its performance though, but Oracle used to make a huge deal about this "cartridge" a while back.

Tyler
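The distinction this thread keeps returning to, finding whole documents versus finding elements inside them, can be sketched in a few lines of Python. This is an illustration only, not how BRS, Dataware II Publisher or any other product mentioned here works; the two sample documents and the function names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-document collection, invented for illustration.
DOCS = {
    "hamlet.xml": (
        "<PLAY><TITLE>Hamlet</TITLE>"
        "<SPEECH><SPEAKER>HAMLET</SPEAKER><LINE>To be, or not to be</LINE></SPEECH>"
        "</PLAY>"
    ),
    "notes.xml": (
        "<NOTES><TITLE>Reading list</TITLE>"
        "<ITEM>Hamlet, to be read by Friday</ITEM></NOTES>"
    ),
}


def find_documents(term):
    """Atomic-document search: which documents mention the term anywhere?"""
    term = term.lower()
    return sorted(
        name
        for name, xml in DOCS.items()
        if term in "".join(ET.fromstring(xml).itertext()).lower()
    )


def find_elements(term, tag):
    """Element-level search: which <tag> elements mention the term?"""
    term = term.lower()
    hits = []
    for name, xml in sorted(DOCS.items()):
        for elem in ET.fromstring(xml).iter(tag):
            text = "".join(elem.itertext())
            if term in text.lower():
                hits.append((name, tag, text))
    return hits


print(find_documents("hamlet"))          # every document containing the word
print(find_elements("hamlet", "TITLE"))  # only TITLE elements containing it
```

The first query cannot tell a play called Hamlet from a note that merely mentions it; the second can, because the match is qualified by element name. A real engine would build an index rather than reparse every document per query.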
From 106551.310 at compuserve.com Fri Nov 6 20:32:19 1998
From: 106551.310 at compuserve.com (Ted Carroll & Claire Samson)
Date: Mon Jun 7 17:06:18 2004
Subject: XML for resource scheduling / calender management
Message-ID: <199811061530_MC2-5F5F-A23B@compuserve.com>

I've been interested in software for scheduling groups of resources, such as staff members within a workgroup, organising meeting rooms within a company or booking equipment, as a way of producing a 'virtual' office wall planner.

I've developed a multi-user program for this type of activity and have also developed a free-standing viewer for displaying a virtual 'wall planner' file. This uses XML; XML provides a good way of formatting fairly complex data of this type. XML is also used to allow remote updates to the schedule, i.e. a remote request to book a room could be made using a small applet that sends a message in the appropriate XML format.

I would be interested to know if anyone on the list is aware of any other developments in the use of XML for diary information such as maintaining shared calendars/scheduling information. I am sure there is a huge potential for an open format based on XML.

The free software (both the multi-user program and the XML-based 'wall planner' viewer) can be downloaded from www.screenplan.com (it's Windows 95/98/NT at the moment; a Java viewer may follow!)

J.C.Samson
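To make the remote-booking idea concrete, here is a rough sketch of what such a message exchange might look like. The element names (`booking-request`, `resource`, `requested-by`, `start`, `end`) and both functions are assumptions made up for illustration; the actual format used by the software described above is not given in the message.

```python
import xml.etree.ElementTree as ET


def build_booking(resource, who, start, end):
    """Client side: serialize a booking request (hypothetical element names)."""
    req = ET.Element("booking-request")
    ET.SubElement(req, "resource").text = resource
    ET.SubElement(req, "requested-by").text = who
    ET.SubElement(req, "start").text = start
    ET.SubElement(req, "end").text = end
    return ET.tostring(req, encoding="unicode")


def apply_booking(schedule, message):
    """Server side: parse a request and add it unless it overlaps a booking."""
    req = ET.fromstring(message)
    resource = req.findtext("resource")
    start, end = req.findtext("start"), req.findtext("end")
    # Timestamps in one fixed ISO 8601 layout compare correctly as strings.
    for booked_start, booked_end, _ in schedule.get(resource, []):
        if start < booked_end and booked_start < end:
            return False
    schedule.setdefault(resource, []).append((start, end, req.findtext("requested-by")))
    return True


schedule = {}
msg = build_booking("Room 2", "J. Smith", "1998-11-09T10:00", "1998-11-09T11:00")
first = apply_booking(schedule, msg)   # accepted: the room is free
second = apply_booking(schedule, msg)  # rejected: same slot is already taken
```

Keeping the request as a small self-describing document means a viewer, an applet and a server can each be written separately, which is the attraction of an open format that the message points to.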
From boblyons at unidex.com Fri Nov 6 20:38:23 1998
From: boblyons at unidex.com (Robert C. Lyons)
Date: Mon Jun 7 17:06:18 2004
Subject: Full Disclosure: What The XML Processor Must Tell The XML Application
Message-ID: <01BE099B.0F2A64D0@cc398234-a.etntwn1.nj.home.com>

John,

This list is very helpful. Thanks. I noticed some minor typos. List item #3 should be changed to the following (the changes are in bold):

3. An XML processor must pass the single character in place of or appearing in its input. (2.11)

Bob
------
Bob Lyons
EC Consultant
Unidex Inc.
1-732-975-9877 Fax: 1-732-975-9866
boblyons(at)unidex.com

From fernando at pix.com.br Fri Nov 6 20:46:23 1998
From: fernando at pix.com.br (Fernando Cabral)
Date: Mon Jun 7 17:06:18 2004
Subject: XML Search Engine
References: <3.0.32.19981105180610.00b862d0@pop.intergate.bc.ca> <3642A141.5B2D3EE0@pix.com.br> <36434AF0.FA7400B3@infinet.com>
Message-ID: <36431886.99A2F1F8@pix.com.br>

Tyler Baker wrote:

> I have heard that CONText from Oracle was pretty good. I am not sure about its
> performance though, but Oracle used to make a huge deal about this "cartridge" a
> while back.

Oracle has had several different names for this product (SQL-TEXT, CONText, TextServer). No matter what the name is, it still does not work well.
At least I've never been able to see a single good application at work. In fact, even some very simple applications have lots of flaws.

- fernando
--
Fernando Cabral Padrao iX Sistemas Abertos
mailto:fernando@pix.com.br http://www.pix.com.br
mailto:Pix@Pix.com.br
Fone: +55 61 321-2433 Fax: +55 61 225-3082
15° 45' 04.9" S 47° 49' 58.6" W
19° 37' 57.0" S 45° 17' 13.6" W

From estephen at appliedtheory.com Fri Nov 6 22:10:10 1998
From: estephen at appliedtheory.com (Eric A. Stephens)
Date: Mon Jun 7 17:06:18 2004
Subject: Announcing the XML-HR Initiative - www.xml-hr.org
Message-ID:

In partnership with AppliedTheory Communications (www.appliedtheory.com), the America's Job Bank Service Center (www.ajb.dni.us) announces the XML-HR Initiative. The purpose of the initiative is to collaborate on the creation and standardization of human resource/electronic recruiting XML definitions. These definitions (DTDs) can be used in Internet/Intranet applications and the exchange of such data between disparate systems. Upcoming releases of AJB will rely heavily on XML definitions for both internal and external data exchange between processes and applications.

Those interested in this initiative are encouraged to subscribe to one of the e-mail lists described on the web page (www.xml-hr.org).

Eric A. Stephens estephen@AppliedTheory.com
Software Engineering Group http://www.AppliedTheory.com
AppliedTheory Communications, Inc.
From fernando at pix.com.br Fri Nov 6 22:18:41 1998
From: fernando at pix.com.br (Fernando Cabral)
Date: Mon Jun 7 17:06:18 2004
Subject: XML Search Engine
References: <3.0.32.19981106075715.00b71540@pop.intergate.bc.ca>
Message-ID: <3642FED8.6183FFD8@pix.com.br>

Tim Bray wrote:

> Well, I just went and checked their financials, and while they
> actually showed a bit of profit I observe that their revenue,
> for this quarter against the same quarter last year, is declining
> (down from $5.4m to $4.8m) and the 3/4 results are down from $14.6m
> to $14.2m. This is the high-growth Internet/Software field? I repeat
> my claim that this is not a good business to be in.
>

Perhaps because they have been investing a lot in a new product named KMS (Knowledge Management Suite) and kind of forgetting BRS itself.

> >About two years ago Dataware launched EPMS, now renamed
> >Dataware II Publisher. This is a version of BRS entirely based
> >on SGML (it reads from about 300 different formats, converts
> >and stores as an SGML file, and allows you to do text retrieval
> >both in the traditional way as well as in a more SGML-like way.
> >
> >Of course, it can read and index directly SGML, XML and HTML.
>
> Can anyone else substantiate this? Last time I looked at BRS/search,
> it was a very traditional atomic-document thing; it had some fielded
> search, but it could only *find* documents. Obviously for XML you
> need to find elements. It would be great if BRS was really
> SGML/XML-savvy.

Perhaps you should read again what I said. The product is not BRS/Search, it is Dataware II Publisher (formerly EPMS).
It is a "version" of BRS because it uses pretty much the same basic technology. On the other hand, it is entirely different because the internal format is SGML, so it allows for searches in the SGML style (as opposed to "paragraph" qualification). As the manual says (page 136): "You can search your publication by SGML element tags. For example, you can search your publications by the TITLE element". You can also design style sheets based on the SGML tags for printing and displaying purposes.

BRS itself does not know anything about SGML, but Dataware II Publisher certainly does. I have been using the product for several months now. I have a number of publications in it, including some of the Shakespeare works (markup by Jon Bosak - Moby Lexical Tools).

I am not as knowledgeable as you on SGML/XML so I can not guarantee 100% compatibility with SGML. What I can say is that I like what I have. To me it looks like what any XML user would like to see in a search engine.

> >Talking about money, it is quite clear that IBM made a lot of money selling
> >STAIRS. Now it is musty but for more
> >than 20 years it reigned undisputed in the mainframe
> >kingdom.
>
> Here we agree. IBM made some serious money with STAIRS (I suspect
> mostly by selling mainframes to run it on).
>
> >So, I think the right conclusion is that in the low-end line of products
> >where quality/functionality is disputable and price is very low
> >(PC DOCs, Verity...) there is no real money. On the other hand,
> >vendors aiming at the high-end market should not complain.
>
> I don't think the evidence supports this view.
>
> >My own experience is that relational vendors are completely incapable of providing a
> >good solution for text retrieval. The products
> >are usually very poor on the functionality side and miserable on
> >the performance side.
>
> That's really interesting information. This is about the 5th time
> I've heard this; always anecdotal evidence, but it adds up.
> -Tim

--
Fernando Cabral Padrao iX Sistemas Abertos
mailto:fernando@pix.com.br http://www.pix.com.br
mailto:Pix@Pix.com.br
Fone: +55 61 321-2433 Fax: +55 61 225-3082
15° 45' 04.9" S 47° 49' 58.6" W
19° 37' 57.0" S 45° 17' 13.6" W

From ralph at fsc.fujitsu.com Fri Nov 6 22:21:26 1998
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:06:18 2004
Subject: HyBrick with English-language Menu
Message-ID: <3.0.5.32.19981107051526.009cb870@pophost.fsc.fujitsu.com>

All,

The latest version of HyBrick, V.08, with XLink/XPointer support, that I announced a few days ago, is now available from FSC's Web site:

http://collie.fujitsu.com/hybrick/

This version has an English-language menu, but is otherwise identical to the version on Fujitsu's Web site in Japan. I have also added some files to the distribution to show more of HyBrick's formatting capabilities.

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation (FSC)
From cbullard at hiwaay.net Sat Nov 7 00:08:44 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:06:19 2004
Subject: XML Search Engine
References: <3.0.32.19981105180610.00b862d0@pop.intergate.bc.ca>
Message-ID: <36438EC7.7124@hiwaay.net>

Tim Bray wrote:
>
> Lesson: there's not much juice in that business. XML might cheer
> things up a bit, you never know. There are any number of decent
> free search engines you can run with either Apache or NT servers...
>
> If you're doing relational search, most relational vendors (Oracle,
> Informix, etc) have some sort of full-text add-on that usually
> works OK.

Thanks Tim. Yes, it is hard to make any money off of any core technology in the computer business. The hardware are toasters and the software are bread. OTOH, folks still pay a stable price for well-made sandwiches, so in the content business, (eg, the stuff in the cells), we have to find ways to let them "have it their way". Analysis is still a pretty good mustard.

Thanks to everyone who sent me URLs today. I very much appreciate it. The reading has been most interesting.

len
From cbullard at hiwaay.net Sat Nov 7 00:31:16 1998
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:06:19 2004
Subject: XML Search Engine
References: <3.0.32.19981106075715.00b71540@pop.intergate.bc.ca>
Message-ID: <36439428.555E@hiwaay.net>

Tim Bray wrote:
>
> >My own experience is that relational vendors are completely incapable of providing a
> >good solution for text retrieval. The products
> >are usually very poor on the functionality side and miserable on
> >the performance side.
>
> That's really interesting information. This is about the 5th time
> I've heard this; always anecdotal evidence, but it adds up. -Tim

It would be interesting to know why. OTOH, there may be a market for add-in functionality for applications that are of smaller scale and tend to run locally. It may be that as we learned from HTML, trying to solve the ultimate problems before going to market is a way of staying poor.

In the web business, and on every list I've participated in, it is always difficult to get folks to consider alternative markets for tools and content. For example, in VRML, there is an almost maniacal emphasis on the web while ignoring the CD market where the problems of "heavy" datatypes are evident. For example, any animation profits by *good sound* as can be provided in a wav file, the human voice, etc. It is only recently that some are waking up to the potential of marrying rock or pop albums to innovative 3D.
Because of the need to rehost the art into new media every so often, there are lifecycle problems which open content standards help to solve just as markup technologies helped to solve these problems for long lifecycle document collections.

Thar is life in them thar niches. :-)

len

From mda at discerning.com Sat Nov 7 01:38:16 1998
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun 7 17:06:19 2004
Subject: Storing XML documents in an OODB
Message-ID: <007001be09ee$46ceffb0$0200a8c0@mdaxke.mediacity.com>

Poet has *something* related to an OODB and XML; see http://www.poet.com/CMSoverview/

It seems to want to be both a smart file system for intranet publishing, and be an application platform. I'd be very curious to hear any first-hand accounts of it.

-mda
From Sung_Nguyen at datacard.com Sat Nov 7 02:12:18 1998
From: Sung_Nguyen at datacard.com (Sung Nguyen)
Date: Mon Jun 7 17:06:19 2004
Subject: IE50 - XML Examples
Message-ID: <00155EDE.3096@datacard.com>

Hi all:

I have trouble linking the following example (downloaded from the Microsoft site). This is the link error I got:

"G:\mssdk\lib\ole32.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x35cb1cf1"

The two files "mshtml.h" and "msxml.h" I got from the Internet SDK with IE4.0. I don't know if they are still compatible with IE5.0, or where I can get new "mshtml.h" and "msxml.h" for IE5.0.

Please point me in the right direction.

Thanks a lot,
SeanN

--------------------------------------
////////////////////////////////////////////////////////////////////////////
// Sample1.cxx: XML Object Model Sample 1
//--------------------------------------------------------------------------
// Copyright (c) 1998 Microsoft Corporation. All Rights Reserved.
//
// THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF
// ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO
// THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
// PARTICULAR PURPOSE.
//--------------------------------------------------------------------------
////////////////////////////////////////////////////////////////////////////
#include
#include
#include
#include
#include
#include
#include
#include "mshtml.h"
#include "msxml.h"
#include

#define CHECKHR(x) hr = x; if (FAILED(hr)) goto Cleanup;
#define SAFERELEASE(p) if (p) {(p)->Release(); p = NULL;} else ;

////////////////////////////////////////////////////////////////////////////
// Synopsis: Create an IXMLElement of type t
////////////////////////////////////////////////////////////////////////////
IXMLElement* CreateXMLElement(IXMLDocument* pDoc, XMLELEM_TYPE t)
{
    IXMLElement* e;
    VARIANT type;
    type.vt = VT_I4;
    V_I4(&type) = t;
    VARIANT name;
    name.vt = VT_BSTR;
    V_BSTR(&name) = ::SysAllocString(L"ElementNode");
    HRESULT hr = pDoc->createElement(type, name, &e);
    ::SysFreeString(V_BSTR(&name));
    return e;
}

////////////////////////////////////////////////////////////////////////////
// Synopsis: Create an XML Document from Scratch in memory
////////////////////////////////////////////////////////////////////////////
HRESULT MemDocument()
{
    IXMLDocument *pDoc = NULL;
    IStream *pStm = NULL;
    IPersistStreamInit *pPSI = NULL;
    IXMLElement *enode = NULL, *el = NULL;
    IXMLElement *root = NULL;
    LARGE_INTEGER li = {0, 0};
    HRESULT hr = S_OK;
    int i, j;

    // Create an empty XML document
    CHECKHR(CoCreateInstance(CLSID_XMLDocument, NULL, CLSCTX_INPROC_SERVER,
                             IID_IXMLDocument, (void**)&pDoc));

    // Query the IPersistStreamInit interface
    CHECKHR(pDoc->QueryInterface(IID_IPersistStreamInit, (void **)&pPSI));

    // Create an IStream
    CHECKHR(CreateStreamOnHGlobal(NULL, TRUE, &pStm));
    pStm->AddRef();

    //
    // Create an xml document with a root element
    //
    ULONG ulWritten;
    CHECKHR(pStm->Write("", strlen(""), &ulWritten));

    // load the xml document
    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Load(pStm));

    // get root element
    CHECKHR(pDoc->get_root(&root));

    //
    // Create an xml document in memory
    //
    for (i = 10; i > 0; i--)
    {
        enode = CreateXMLElement(pDoc, XMLELEMTYPE_ELEMENT);
        CHECKHR(root->addChild(enode, -1, -1));
        for (j = 10; j > 0; j--)
        {
            el = CreateXMLElement(pDoc, XMLELEMTYPE_ELEMENT);
            CHECKHR(enode->addChild(el, -1, -1));
            SAFERELEASE(el);
        }
        SAFERELEASE(enode);
    }

    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Save(pStm, TRUE));

    //
    // Load the document from the in-memory stream
    //
    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Load(pStm));

Cleanup:
    SAFERELEASE(root);
    SAFERELEASE(el);
    SAFERELEASE(enode);
    SAFERELEASE(pPSI);
    SAFERELEASE(pStm);
    SAFERELEASE(pDoc);
    return hr;
}

int _cdecl main(int argc, char *argv[])
{
    HRESULT hr = S_OK;
    CoInitialize(NULL);
    hr = MemDocument();
    CoUninitialize();
    return hr == 0 ? 0 : 1;
}

From amitr at abinfosys.com Sat Nov 7 05:16:07 1998
From: amitr at abinfosys.com (Amit Rekhi)
Date: Mon Jun 7 17:06:19 2004
Subject: How do XML NameSpace aware processors react to NS definations?
Message-ID: <003b01be0a0d$1af72650$0c01a8c0@abiwebserver.abinfosys.com>

I have gone through the latest XML NameSpaces Draft, but I have not been able to figure out the complete behaviour of XML NameSpace software, hence the questions below.

Hello,

I am confused as to how XML NameSpace-aware processors would process:
- Namespace definitions (e.g. xmlns:edi = "http://www.my.org/directory")
- Name prefixes present in XML files (e.g. )

Looking at the "xmlns" attribute value (the NameSpace Name) it is difficult to say what it signifies, e.g.
* xmlns:validate #FIXED "http://www.org/directory" - points to a directory of, say, validating programs, let's say a set of DLLs.
* xmlns:xsl #FIXED "www.w3c.org/TR/WD-xsl" - points to the new XSL specification.

How would an XML NS processor know what each NameSpace Name points to? Or is there some kind of hardcoding done in the processor which tells it what each NameSpace Name means? E.g. if "www.w3c.org/TR/WD-xsl" is encountered, it means the XSL spec.

What happens when an element name prefixed with a NS prefix is encountered? How does the XML NS processor process such names? E.g. let's say I have an XML file:

Test Data

Let's also assume that http://www.my.org/checkdirectory points to a directory of validating DLLs, one of which is Alpha.dll which is supposed to refer to.

1) How would an XML NS processor know that is actually supposed to refer to Alpha.dll present in http://www.my.org/checkdirectory ?
2) I, being the author of my XML file, know that http://www.my.org/checkdirectory refers to a directory of validating programs, but how would the XML NS processor know this?

Thanks in advance for any answers,
AMIT

From papresco at technologist.com Sun Nov 8 01:39:57 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:06:19 2004
Subject: APIs ... Re: CDATA by any other name... (was The raw and the cooked)
References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com> <363DCFB4.36A59E21@locke.ccil.org> <363DE9F3.2EDF9979@eng.sun.com> <363F060D.47FB0E2C@technologist.com> <3642E010.DA093709@eng.sun.com>
Message-ID: <3644F4E4.300B6A55@technologist.com>

David Brownell wrote:
>
> > I find it odd that we can have "standard APIs" for the full complexity of
> > relational data, and probably eventually for object database data, but it
> > is perceived to be impossible to do the same for the parse tree of XML
> > data.
> > I mean it is just annotated tree structures: it shouldn't be rocket
> > science (but neither is it trivial).
>
> If it's just annotated tree structures, I'd say that's what DOM is for!
> Or should be, warts and XML Data Model conformance aside.

That's right. If the DOM were done right, it would be what I would call a "standard API" for XML data. It wouldn't be appropriate for the abstract, arbitrarily complex data models built on top of XML. It would be for the XML data itself. But the DOM is quite far from that. It is geared towards scripters and is thus not general enough for the problems I would like to solve.

> Why would I say no "standard API" would exist for the rest? There are
> thousands (conservatively!) of data/object models specialized to each
> application. While a tree (or grove, or graph) would seem isomorphic
> with any such model, it's not necessarily optimal for any one of them.

That's true. Similarly, a relational data model is not the only API in every application that *uses* a relational model, but you could say that a single relational database API (ODBC) is usually sufficient. That's all I would hope to replicate.

> Similarly, "pure data" is a model many of us have been moving folk away
> from over the last decade. It's critical for interoperability between
> system components (e.g. over the web, with XML!), but raw data must be
> joined with methods (or other code) before it's used. Ergo, "objects"
> instead of APIs to data; classes not structs.

Yes, there is an impedance mismatch that must be overcome between the XML data model and your application's data model, just as there is a mismatch between OO programs and relational data. A standard XML-level API (or perhaps two, one event driven and one tree driven) would help us to overcome that, not perpetuate it.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"The new revolutionaries believe the time has come for an aggressive move against our oppressors.
We have established a solid beachhead on Friday. We now intend to fight vigorously for 'casual Thursdays.' -- who says America's revolutionary spirit is dead?

From dave at userland.com Sun Nov 8 04:23:40 1998
From: dave at userland.com (Dave Winer)
Date: Mon Jun 7 17:06:19 2004
Subject: Updated: Scripting News in XML
In-Reply-To: <36407FC0.D8919AA7@technologist.com>
References: <01BE07DB.D4272500@grappa.ito.tu-darmstadt.de> <364077B9.AB1AA3A6@locke.ccil.org>
Message-ID: <3.0.5.32.19981107202115.00f24960@scripting.com>

A funny thing happened when MSIE5/beta 2 was shipped, people are starting to read our website in XML. They're also finding problems, which we corrected today. Here are the change notes:

http://nirvana.userland.com/tickets/changenotes/ticketReader$3

It's looking like we'll have some new examples of real XML-based content flow to show later this week.

Dave

--------------------------------------
http://www.userland.com/directory.html
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clovett at microsoft.com Sun Nov 8 04:35:52 1998 From: clovett at microsoft.com (Chris Lovett) Date: Mon Jun 7 17:06:19 2004 Subject: IE50 - XML Examples Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C08743F9D@RED-MSG-56> You need the IE5 SDK which you can download from: http://www.microsoft.com/gallery/samples/download/first.htm -----Original Message----- From: Sung_Nguyen@datacard.com [mailto:Sung_Nguyen@datacard.com] Sent: Friday, November 06, 1998 6:10 PM To: Tyler Baker; Fernando Cabral Cc: Tim Bray; len bullard; xml-dev@ic.ac.uk; Barclay Rockwood Subject: IE50 - XML Examples Hi all: I have trouble with linking the following example (downloaded from MicroSoft site) This is the link error I got: "G:\mssdk\lib\ole32.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x35cb1cf1" The two files: "mshtml.h" msxml.h" I got from Internet SDK with IE4.0 I don't know if they are still compatible with IE5.0 and where I can get new "mshtml.h" msxml.h" for IE5.0. Please point me the direction, Thanks a lot, SeanN --------------------------------------

////////////////////////////////////////////////////////////////////////////
// Sample1.cxx: XML Object Model Sample 1
//--------------------------------------------------------------------------
// Copyright (c) 1998 Microsoft Corporation. All Rights Reserved.
//
// THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF
// ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO
// THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
// PARTICULAR PURPOSE.
//--------------------------------------------------------------------------
////////////////////////////////////////////////////////////////////////////
#include
#include
#include
#include
#include
#include
#include
#include "mshtml.h"
#include "msxml.h"
#include

#define CHECKHR(x) hr = x; if (FAILED(hr)) goto Cleanup;
#define SAFERELEASE(p) if (p) {(p)->Release(); p = NULL;} else ;

////////////////////////////////////////////////////////////////////////////
// Synopsis: Create an IXMLElement of type t
////////////////////////////////////////////////////////////////////////////
IXMLElement* CreateXMLElement(IXMLDocument* pDoc, XMLELEM_TYPE t)
{
    IXMLElement* e;
    VARIANT type;
    type.vt = VT_I4;
    V_I4(&type) = t;
    VARIANT name;
    name.vt = VT_BSTR;
    V_BSTR(&name) = ::SysAllocString(L"ElementNode");
    HRESULT hr = pDoc->createElement(type, name, &e);
    ::SysFreeString(V_BSTR(&name));
    return e;
}

////////////////////////////////////////////////////////////////////////////
// Synopsis: Create an XML Document from Scratch in memory
////////////////////////////////////////////////////////////////////////////
HRESULT MemDocument()
{
    IXMLDocument *pDoc = NULL;
    IStream *pStm = NULL;
    IPersistStreamInit *pPSI = NULL;
    IXMLElement *enode = NULL, *el = NULL;
    IXMLElement *root = NULL;
    LARGE_INTEGER li = {0, 0};
    HRESULT hr = S_OK;
    int i, j;

    // Create an empty XML document
    CHECKHR(CoCreateInstance(CLSID_XMLDocument, NULL, CLSCTX_INPROC_SERVER,
                             IID_IXMLDocument, (void**)&pDoc));

    // Query the IPersistStreamInit interface
    CHECKHR(pDoc->QueryInterface(IID_IPersistStreamInit, (void **)&pPSI));

    // Create an IStream
    CHECKHR(CreateStreamOnHGlobal(NULL, TRUE, &pStm));
    pStm->AddRef();

    //
    // Create an xml document with a root element
    //
    ULONG ulWritten;
    CHECKHR(pStm->Write("", strlen(""), &ulWritten));

    // load the xml document
    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Load(pStm));

    // get root element
    CHECKHR(pDoc->get_root(&root));

    //
    // Create an xml document in memory
    //
    for (i = 10; i > 0; i--)
    {
        enode = CreateXMLElement(pDoc, XMLELEMTYPE_ELEMENT);
        CHECKHR(root->addChild(enode, -1, -1));
        for (j = 10; j > 0; j--)
        {
            el = CreateXMLElement(pDoc, XMLELEMTYPE_ELEMENT);
            CHECKHR(enode->addChild(el, -1, -1));
            SAFERELEASE(el);
        }
        SAFERELEASE(enode);
    }

    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Save(pStm, TRUE));

    //
    // Load the document from the in-memory stream
    //
    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Load(pStm));

Cleanup:
    SAFERELEASE(root);
    SAFERELEASE(el);
    SAFERELEASE(enode);
    SAFERELEASE(pPSI);
    SAFERELEASE(pStm);
    SAFERELEASE(pDoc);
    return hr;
}

int _cdecl main(int argc, char *argv[])
{
    HRESULT hr = S_OK;
    CoInitialize(NULL);
    hr = MemDocument();
    CoUninitialize();
    return hr == 0 ? 0 : 1;
}

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From info at flosim.com Sun Nov 8 11:31:14 1998 From: info at flosim.com (Flow Simulation) Date: Mon Jun 7 17:06:20 2004 Subject: XML for resource scheduling / calender management Message-ID: <01BE0B0A.FCCCA0C0@dial-247.wiredworkplace.net> J.C.Samson wrote: > I would be interested to know if anyone on the list is aware of > any other developments in the use of XML for diary information > such as maintaining shared calendars/scheduling information. Me too.
In January there was talk of a DTD for vCard and vCalendar. I haven't managed to turn up anything more about this. Is anyone aware of such a DTD? There must be a large number of applications for contact and calendaring data in XML but all I could find so far was vCard/ vCalendar and X.500. Bill Ayers (BillA@flosim.com) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Sun Nov 8 15:53:10 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:06:20 2004 Subject: No subject Message-ID: <199811081548.HAA05264@emerald.oz.net> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Sun Nov 8 17:13:23 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:06:20 2004 Subject: vcard DTD Message-ID: <3645D4AF.920EA24C@finetuning.com> for some reason my last post got screwed up. I was just trying to offer the vcard DTD that had been asked for. http://www.ietf.org/internet-drafts/draft-dawson-vcard-xml-dtd-01.txt It was originally posted in August - but now it looks like it was updated October 15th. (this and other goodies, by the way, are under the DTD resource guide at xml.com :-) thanks, lisa rein http://www.finetuning.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mda at discerning.com Sun Nov 8 17:38:27 1998 From: mda at discerning.com (Mark D. Anderson) Date: Mon Jun 7 17:06:20 2004 Subject: XML for resource scheduling / calender management Message-ID: <02d001be0b3d$03848680$0200a8c0@mdaxke.mediacity.com> The calendar WG page can be seen at http://www.imc.org/ietf-calendar/ or the ietf.org site. They currently have a vCard-like syntax. Interconversion with an XML syntax would not be hard, but nor does it seem particularly critical. They seem to be mostly working now on CAP, the protocol; the model/syntax (iCalendar) seems stable. I think they are still debating whether CAP will be an HTTP extension, or use existing HTTP POST/PUT, or be a different protocol (like SMTP/NNTP/etc.). -mda xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dave at userland.com Sun Nov 8 18:19:24 1998 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:06:20 2004 Subject: XMLizing Folders? In-Reply-To: <02d001be0b3d$03848680$0200a8c0@mdaxke.mediacity.com> Message-ID: <3.0.5.32.19981108101732.00f18b50@scripting.com> As an experiment this morning I put together a little server component that catches references to directories that end with .xml and returns a list of objects contained in that directory, in XML of course. 
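A component of the kind described here could be sketched roughly as follows. This is not the UserLand implementation, which the message does not show; the element and attribute names (`folder`, `file`, `name`, `size`) and the helper names are assumptions made up for illustration. The sketch lists the direct contents of a directory as a small XML document.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def folder_to_xml(path):
    """Return an XML string listing the entries directly inside `path`.

    The vocabulary (folder/file/name/size) is hypothetical, not a published DTD.
    """
    root = ET.Element("folder", {"name": os.path.basename(os.path.abspath(path))})
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        kind = "folder" if entry.is_dir() else "file"
        item = ET.SubElement(root, kind, {"name": entry.name})
        if kind == "file":
            # Size in bytes, so a client script can spot changed files.
            item.set("size", str(entry.stat().st_size))
    return ET.tostring(root, encoding="unicode")


def demo():
    """Build a throwaway directory and return its XML listing."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "notes.txt"), "w") as f:
            f.write("hello")
        os.mkdir(os.path.join(tmp, "sub"))
        return folder_to_xml(tmp)


if __name__ == "__main__":
    print(demo())
```

A nightly client script would fetch such a listing, compare the names against what it saw the previous night, and download only the new entries.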
I've been wading thru all the spec confusion around this, and decided to approach the problem from the other side, to find out actually how much development work was involved. It took less than an hour to put this together. We have a pragmatic application for this, a customer who needs a script that checks a folder on a web server every night, to see if there are any new files added, and to get only those files. It's a pretty basic operation, and unless you want to go WebDAV, it's not very much work on either the client or server. Here's a directory you can look at thru this little server component: http://nirvana.userland.com/misc/rel200.xml Now I'll hear a crashing loud noise about namespaces and where's my DTD and all that stuff. The point is, there should be a DTD for this type of object. They appear on all of our computers, they're called folders. Where's the XML spec for folders? Dave -------------------------------------- http://www.userland.com/directory.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mda at discerning.com Sun Nov 8 18:28:24 1998 From: mda at discerning.com (Mark D. Anderson) Date: Mon Jun 7 17:06:20 2004 Subject: vcard DTD Message-ID: <02f901be0b44$7ba8e320$0200a8c0@mdaxke.mediacity.com> >for some reason my last post got screwed up. I was just trying to offer >the vcard DTD that had been asked for. >http://www.ietf.org/internet-drafts/draft-dawson-vcard-xml-dtd-01.txt I think the poster was looking for vCalendar (now iCalendar), and to my knowledge that hasn't been absorbed into the xml borg yet. 
The representational power and syntax of vCard is very similar however, so I'm sure the approach could be the same: parameters are mapped to XML attributes, and structured values with positional components (like a postal address) are given subelement names. I'm not yet to the point where i can speed-read DTDs, so I'm not sure how they are handling the vCard "type-grouping" capability, nor what happens when the value is actually kept in a vCard "VALUE" parameter, nor what they do with "<" or "&" in a vCard value or parameter. There's also a proposed vCard 3 extension for better handling of language; that should probably be mapped to some general xml facility (xml:lang?). I suspect there are some white space issues in there too, but I always get confused by white space discussions in XML, and this is a weekend :). -mda xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dave at userland.com Sun Nov 8 19:45:59 1998 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:06:20 2004 Subject: XMLizing Folders? In-Reply-To: <3.0.5.32.19981108101732.00f18b50@scripting.com> References: <02d001be0b3d$03848680$0200a8c0@mdaxke.mediacity.com> Message-ID: <3.0.5.32.19981108114504.00f1bdc0@scripting.com> Here's a listing of a client script that builds on this XML interface: http://nirvana.userland.com/tickets/samples/ticketLister$14 And an explanation: http://nirvana.userland.com/tickets/samples/ticketReader$14 Dave -------------------------------------- http://www.userland.com/directory.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sun Nov 8 19:49:38 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:20 2004 Subject: XMLizing Folders? Message-ID: <3.0.32.19981108114838.00b2d100@pop.intergate.bc.ca> At 10:17 AM 11/8/98 -0800, Dave Winer wrote: >http://nirvana.userland.com/misc/rel200.xml > >Now I'll hear a crashing loud noise about namespaces and where's my DTD and >all that stuff. The point is, there should be a DTD for this type of >object. It's not obvious at all to me how a DTD would be useful in this app. On the other hand, a stylesheet would be real nice. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Sun Nov 8 23:19:24 1998 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:06:20 2004 Subject: vcard DTD References: <02f901be0b44$7ba8e320$0200a8c0@mdaxke.mediacity.com> Message-ID: <364625D9.47914EB8@eng.sun.com> "Mark D. Anderson" wrote: > > nor what happens when the value is actually kept in a vCard "VALUE" parameter, > nor what they do with "<" or "&" in a vCard value or parameter. So what's the issue here? Dave Winer mentioned a similar issue that forced him to encode some text in "base64" ... Are XML-emitting tools just being incorrect? "<" in text should always be encoded as "&lt;" (it'll be un-encoded during parsing), and "&" as "&amp;" (ditto).
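The rule stated here can be checked with a short round trip. The sketch below uses Python's standard library as a stand-in for whatever emitter and parser are actually in play; the `note` element and the `wrap` helper are hypothetical names chosen only for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap(text):
    """Escape markup delimiters in `text` and wrap it in a hypothetical <note> element."""
    # escape() rewrites "&", "<" and ">" as entity references.
    return "<note>" + escape(text) + "</note>"


raw = 'if (a < b && c > 0) print("ok")'
document = wrap(raw)

# Parsing reverses the escaping, so the application sees the original characters.
recovered = ET.fromstring(document).text

if __name__ == "__main__":
    print(document)
    print(recovered == raw)
```

If the emitter escapes exactly once and the parser un-escapes exactly once, no second encoding layer such as base64 should be needed for plain text.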
- Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Sun Nov 8 23:27:00 1998 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:06:20 2004 Subject: XMLizing Folders? References: <3.0.32.19981108114838.00b2d100@pop.intergate.bc.ca> Message-ID: <36462789.95AE9086@eng.sun.com> Tim Bray wrote: > > At 10:17 AM 11/8/98 -0800, Dave Winer wrote: > >http://nirvana.userland.com/misc/rel200.xml > > > >Now I'll hear a crashing loud noise about namespaces and where's my DTD and > >all that stuff. The point is, there should be a DTD for this type of > >object. > > It's not obvious at all to me how a DTD would be useful in this app. On > the other hand, a stylesheet would be real nice. -Tim Actually, I think a DTD could be very useful. The issue is that most of what goes into a "directory" is domain specific metadata; even for files, different operating systems give very different metadata. Is there an Access Control List? If so, is it POSIX? What to do with the entries? What, no "hard link count"? Is case significant? Etc. Conclusion being that there will probably be a core DTD and bunches of domain-specific addons, each with their own namespaces. Re crashing sounds, I was thinking about the security implications. Browsing any directory -- filesystem, name service, phone book, employee database, database catalog, etc -- is security-sensitive. Yep, it'd hardly take any time at all to put together a servlet that works this way ... ;-) - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dave at userland.com Sun Nov 8 23:40:13 1998 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:06:20 2004 Subject: XMLizing Folders? In-Reply-To: <36462789.95AE9086@eng.sun.com> References: <3.0.32.19981108114838.00b2d100@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981108153910.00f34440@scripting.com> >>Re crashing sounds, I was thinking about the security implications. Browsing any directory -- filesystem, name service, phone book, employee database, database catalog, etc -- is security-sensitive. That's a server issue. The server sysop decides which resources should be accessible this way and by whom. HTTP and specific servers already have provisions for this, and it has zero effect on the XML used to transmit the contents of the folder, the two are completely independent issues. Re having a style sheet, I could see where that would be useful if you wanted to look at these structures in a web browser, and I'm sure someone will do one, or five people will do five, and they'll all look like an outliner. Aside from that, I don't want to browse these things, I want scripts to be able to walk the structures. How it displays is almost irrelevant in that context. I also got a thoughtful email from Alex Hopmann at Microsoft who's been working on WebDAV, and now I have a comparison between this and what DAV is doing, so I may see how to fit that into our framework, which was one of my goals in doing this work, to see if it really had to be so complicated and to see where it fits in. He says he's going to write a WebDAV tutorial, and *that* would really be something. 
We need more HOWTOs in this world, fewer specs, more working code, more things to be compatible with. I know that's my usual story, but I still believe it's true! (We're making progress, IE5/Beta 2 changes things quite a bit.) Dave -------------------------------------- http://www.userland.com/directory.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mda at discerning.com Sun Nov 8 23:44:12 1998 From: mda at discerning.com (Mark D. Anderson) Date: Mon Jun 7 17:06:20 2004 Subject: vcard DTD Message-ID: <03a001be0b70$b1366f90$0200a8c0@mdaxke.mediacity.com> >> nor what they do with "<" or "&" in a vCard value or parameter. > >So what's the issue here? Dave Winer mentioned a similar issue >that forced him to encode some text in "base64" ... > >Are XML-emitting tools just being incorrect? "<" in text should >always be encoded as "&lt;" (it'll be un-encoded during parsing), >and "&" as "&amp;" (ditto). I don't think there are any deep or hard issues here, and I have no idea what tools do; my observation was about the spec -- and a spec that talks about exchange between two syntaxes (syntaces?) should be very explicit about encoding issues. This means consideration of the (different) sets of special characters in the two syntaxes, and the (different) mechanisms each has for specifying binary encoding, character escaping, and so on. For example, there are similar issues in reverse, if an xml-encoded vCard had a postal address element cdata with a semi-colon in it. These are easy to surmount, but the issues should be explicitly identified.
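The reverse direction mentioned above can be made concrete with a small sketch. It assumes a backslash-escaping convention for semicolons inside structured values, as used in later vCard versions; that convention, and the helper names below, are assumptions for illustration and are not taken from the draft DTD under discussion.

```python
def escape_component(text):
    """Escape one component of a structured value (assumed convention: backslash)."""
    # Escape the escape character first, then the component separator.
    return text.replace("\\", "\\\\").replace(";", "\\;")


def join_structured(components):
    """Join components into one semicolon-separated structured value."""
    return ";".join(escape_component(c) for c in components)


def split_structured(value):
    """Split a structured value on unescaped semicolons, undoing the escapes."""
    parts, current, i = [], [], 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            current.append(value[i + 1])  # keep the escaped character literally
            i += 2
        elif ch == ";":
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


# Text as it might arrive from XML subelements of a postal address.
fields = ["", "Suite 5; Rear", "12 Main St", "Springfield"]
encoded = join_structured(fields)

if __name__ == "__main__":
    print(encoded)
    print(split_structured(encoded) == fields)
```

Without the escaping step, the semicolon inside "Suite 5; Rear" would be read as a component boundary and shift every later field by one position.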
BTW, while substitution with &lt; is certainly the simplest approach, I could imagine scenarios where a structured xml value for a vCard NOTE value might be a nice thing. Admittedly, I'm stretching. -mda p.s.: now that i think about it, transcoding a vCard also introduces the question of what should be done with vCard PRODID. Is anyone from vCard/iCalendar on this list? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dave at userland.com Mon Nov 9 00:32:49 1998 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:06:20 2004 Subject: vcard DTD In-Reply-To: <364625D9.47914EB8@eng.sun.com> References: <02f901be0b44$7ba8e320$0200a8c0@mdaxke.mediacity.com> Message-ID: <3.0.5.32.19981108163054.00f16e30@scripting.com> >>Dave Winer mentioned a similar issue that forced him to encode some text in "base64" If an XML value contains a < or an & you should encode them as &lt; and &amp;. We researched this and it was there in one of the specs. (Doug Baron will correct me if I got this wrong). The situation I was writing about was truly bizarre. There were two levels of decoding going on. First I was changing the "Mark D. Anderson" wrote: >> >> nor what happens when the value is actually kept in a vCard "VALUE" parameter, >> nor what they do with "<" or "&" in a vCard value or parameter. > >So what's the issue here? ... > >Are XML-emitting tools just being incorrect? "<" in text should >always be encoded as "&lt;" (it'll be un-encoded during parsing), >and "&" as "&amp;" (ditto). > >- Dave > >xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > -------------------------------------- http://www.userland.com/directory.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Mon Nov 9 09:30:53 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:06:20 2004 Subject: Storing XML documents in an OODB Message-ID: <001f01be0bc3$09e7d870$7008e391@bra01wmhkay.bra01.icl.co.uk> >I want to write a Java- application which stores and retrieves XML >documents in/from an object-oriented Database (Object-Store) This is quite feasible to do, we have done similar things with the Jasmine database. But a word of warning: if you do it mechanistically (each DOM object becomes one database object) you will probably get poor performance. You need to ask whether you really need to store data at this level of granularity. If you generally end up retrieving most of the original document, you would probably be better off to store the original XML as a string and re-parse it. At any rate, do some experiments before you commit yourself too far. Mike Kay xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Mon Nov 9 09:56:17 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:06:20 2004 Subject: XML Search Engine Message-ID: <002701be0bc6$9f347d40$7008e391@bra01wmhkay.bra01.icl.co.uk> Tim Bray: >Can anyone else substantiate this? Last time I looked at BRS/search, >it was a very traditional atomic-document thing; it had some fielded >search, but it could only *find* documents. Obviously for XML you >need to find elements. Not obvious at all. Search engines have always primarily been in the business of finding documents, I can't see why XML changes this. Of course it is necessary to present the search engine with "documents" at an appropriate level of granularity, which is not necessarily the original XML source document; a filter can do this. >>Own experience is that relational vendors are completely uncapable of providing a >>good solution for text retrieval. My view is that the relational products are quite useful in hybrid environments, where the data is mostly structured but includes some lengthy text fields, e.g. product descriptions in a product database. Their main limitation is that they take a strictly boolean view of the world: as with SQL, each record is either a match or it isn't. That's been discredited in information retrieval research for 25 years, though it is still found in many products, especially at the low end of the market. Mike Kay xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ssahuc at netaway.com Mon Nov 9 10:59:45 1998 From: ssahuc at netaway.com (Sebastien Sahuc) Date: Mon Jun 7 17:06:20 2004 Subject: XML and Internationalization... Message-ID: <008501be0bcf$d11e43f0$10c809c0@corba.netaway> Hi there, I wonder if there is any specifications available in making XML documents multilingual. For instance, I have a xml document that specifies some operations available to the user (the selection of the operation to call is done through a GUI). As a small description is associated to each operation, it should be useful to give the user this description in the custom language. My first thought will be the following: ... The factorial method La méthode factorielle La operación factorial The number to pass in Le nombre à passer El número a calcular ... Does anybody has a suggestion auround the syntax ? Any comment would be greatly appreciate. Thanks, Sebastien SAHUC E-mail : ssahuc@netaway.com _______________________________________________________ NETAWAY Document Oriented Computing http://www.netaway.com 6, Bd du Général Leclerc Tel : +33 01 55 46 95 20 92115 CLICHY CEDEX Fax : +33 01 55 46 95 29 FRANCE _______________________________________________________ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sun Nov 8 01:39:57 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:06:20 2004 Subject: APIs ... Re: CDATA by any other name... (was The raw and the cooked) References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com> <363DCFB4.36A59E21@locke.ccil.org> <363DE9F3.2EDF9979@eng.sun.com> <363F060D.47FB0E2C@technologist.com> <3642E010.DA093709@eng.sun.com> Message-ID: <3644F4E4.300B6A55@technologist.com> David Brownell wrote: > > > I find it odd that we can have "standard APIs" for the full complexity of > > relational data, and probably eventually for object database data, but it > > is perceived to be impossible to do the same for the parse tree of XML > > data. I mean it is just annotated tree structures: it shouldn't be rocket > > science (but neither is it trivial). > > If it's just annotated tree structures, I'd say that's what DOM is for! > Or should be, warts and XML Data Model conformance aside. That's right. If the DOM were done right, it would be what I would call a "standard API" for XML data. It wouldn't be appropriate for the the abstract, arbitrarily complex data models built on top of XML. It would be for the XML data itself. But the DOM is quite far from that. It is geared towards scripters and is thus not general enough for the problems I would like to solve. > Why would I say no "standard API" would exist for the rest? There are > thousands (conservatively!) of data/object models specialized to each > application. 
While a tree (or grove, or graph) would seem isomorphic > with any such model, it's not necessarily optimal for any one of them. That's true. Similarly, a relational data model is not the only API in every application that *uses* a relational model, but you could say that a single relational database API (ODBC) is usually sufficient. That's all I would hope to replicate. > Similarly, "pure data" is a model many of us have been moving folk away > from over the last decade. It's critical for interoperability between > system components (e.g. over the web, with XML!), but raw data must be > joined with methods (or other code) before it's used. Ergo, "objects" > instead of APIs to data; classes not structs. Yes, there is an impedence mismatch that must be overcome between the XML data model and your application's data model, just as there is a mismatch between OO programs and relational data. A standard XML-level API (or perhaps two, one event driven and one tree driven) would help us to overcome that, not perpetuate it. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "The new revolutionaries believe the time has come for an aggressive move against our oppressors. We have established a solid beachhead on Friday. We now intend to fight vigorously for 'casual Thursdays.' -- who says America's revolutionary spirit is dead? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Mon Nov 9 11:35:25 1998 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:06:20 2004 Subject: How do XML NameSpace aware processors react to NS definations? Message-ID: <01BE0BDC.BD3B8DE0@grappa.ito.tu-darmstadt.de> Amit Rekhi wrote: > I am confused as to how XML NameSpace aware processors would process:- > > - Namespace definations (eg. xmlns:edi = "http://www.my.org/directory") > - Nameprefixes present in XML files (eg. ) The namespace URL does not signify anything at all. It is not a directory, schema file, or anything else. *All* it is is a unique identifier that identifies the namespace. The prefix is a convenient shorthand for the full URL, nothing more. When reading the namespace spec, it is important to understand that it tries to solve a single problem: ensuring that element and attribute names are unique. This is significant when XML documents that use different DTDs are combined. However, the namespace spec does no more than this. In particular, it does not attempt to answer the problems of combining DTDs, combining instance files, or anything else. Unique names. That's it. What your application does with those names is your business. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From papresco at technologist.com Sun Nov 8 01:39:57 1998
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 17:06:21 2004
Subject: APIs ... Re: CDATA by any other name... (was The raw and the cooked)
References: <199811011654.IAA02758@sqwest.bc.ca> <363CF738.C389AB26@technologist.com> <363DCFB4.36A59E21@locke.ccil.org> <363DE9F3.2EDF9979@eng.sun.com> <363F060D.47FB0E2C@technologist.com> <3642E010.DA093709@eng.sun.com>
Message-ID: <3644F4E4.300B6A55@technologist.com>

David Brownell wrote:
>
> > I find it odd that we can have "standard APIs" for the full complexity of
> > relational data, and probably eventually for object database data, but it
> > is perceived to be impossible to do the same for the parse tree of XML
> > data. I mean it is just annotated tree structures: it shouldn't be rocket
> > science (but neither is it trivial).
>
> If it's just annotated tree structures, I'd say that's what DOM is for!
> Or should be, warts and XML Data Model conformance aside.

That's right. If the DOM were done right, it would be what I would call a "standard API" for XML data. It wouldn't be appropriate for the abstract, arbitrarily complex data models built on top of XML. It would be for the XML data itself. But the DOM is quite far from that. It is geared towards scripters and is thus not general enough for the problems I would like to solve.

> Why would I say no "standard API" would exist for the rest? There are
> thousands (conservatively!) of data/object models specialized to each
> application. While a tree (or grove, or graph) would seem isomorphic
> with any such model, it's not necessarily optimal for any one of them.

That's true. Similarly, a relational data model is not the only API in every application that *uses* a relational model, but you could say that a single relational database API (ODBC) is usually sufficient. That's all I would hope to replicate.

> Similarly, "pure data" is a model many of us have been moving folk away
> from over the last decade. It's critical for interoperability between
> system components (e.g. over the web, with XML!), but raw data must be
> joined with methods (or other code) before it's used. Ergo, "objects"
> instead of APIs to data; classes not structs.

Yes, there is an impedance mismatch that must be overcome between the XML data model and your application's data model, just as there is a mismatch between OO programs and relational data. A standard XML-level API (or perhaps two, one event driven and one tree driven) would help us to overcome that, not perpetuate it.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

"The new revolutionaries believe the time has come for an aggressive move against our oppressors. We have established a solid beachhead on Friday. We now intend to fight vigorously for 'casual Thursdays.' -- who says America's revolutionary spirit is dead?

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tms at ansa.co.uk Mon Nov 9 12:14:08 1998
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun 7 17:06:21 2004
Subject: XML and Internationalization...
In-Reply-To: "Sebastien Sahuc"'s message of "Mon, 9 Nov 1998 11:57:25 +0100"
References: <008501be0bcf$d11e43f0$10c809c0@corba.netaway>
Message-ID:

Sebastien> Sebastien Sahuc

0> In article <008501be0bcf$d11e43f0$10c809c0@corba.netaway>,
0> Sebastien wrote:

Sebastien> My first thought will be the following:
Sebastien> ...
Sebastien>
Sebastien>
Sebastien> The factorial method
Sebastien> La méthode factorielle
Sebastien> La operación factorial
Sebastien>
Sebastien>
Sebastien> The number to pass in
Sebastien> Le nombre à passer
Sebastien> El número a calcular
Sebastien>
Sebastien>
Sebastien>
Sebastien> ...

That's one way of doing it, though it seems to be more common to have a separate document for each language. The latter approach makes it easier to add new translations with minimum disturbance to existing work (in particular, it reduces the likelihood of check-out conflicts in a source control system), and it means that only the required language mappings need be copied during software installation, reducing disk space requirements for end-users.
You might want to use the XML-defined xml:lang attribute for your language labels, BTW, with the usual Internet language codes: xml:lang="en-GB" or xml:lang="fr-CA". -- xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From majayg at iitk.ac.in Mon Nov 9 12:37:28 1998 From: majayg at iitk.ac.in (Ajay Gangwar) Date: Mon Jun 7 17:06:21 2004 Subject: Converting HTML to well formed XML Message-ID: <001f01be0bdd$0e857da0$52a21090@cse.iitk.ernet.in> I need to convert HTML document to well formed XML document. Can someone please guide me how to go about doing this? Are any utilities available for this? - Ajay Gangwar majayg@iitk.ac.in xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Mon Nov 9 12:56:44 1998 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:06:21 2004 Subject: RXP-on-the-web now validates Message-ID: <5273.199811091256@doyle.cogsci.ed.ac.uk> The RXP-based XML checker at http://www.cogsci.ed.ac.uk/~richard/xml-check.html now has an option to validate as well as check for well-formedness. A few minor aspects of validity are not checked, such as PE nesting. -- Richard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From north at Synopsys.COM Mon Nov 9 13:19:47 1998 From: north at Synopsys.COM (Simon North) Date: Mon Jun 7 17:06:21 2004 Subject: XML and FrameMaker 5.5.6 Message-ID: <199811091316.OAA13053@goofy.gr05.synopsys.com> Not a development issue, I know, but I thought it would make a pleasant change to be able to report something positive. I've been playing with FrameMaker 5.5.6 this morning, which includes a "Save as XML" function. The XML code it produces is well-formed and appears to be 100% compliant (no parse errors in most tools). FM outputs a CSS stylesheet (I suppose XSL would have been too much to ask), but the code display in IE5 Beta PR2 is very good. Of course, while FM creates perfect XML code for linked images: IE5 hasn't got a clue what to do with this. 
Even cross-references are handled nicely, FrameMaker outputing code of this form: but again, IE5 doesn't have much of a clue what to do with this. Tables are also, as you could have guessed, a mess. Despite IE's deficiencies, FrameMaker 5.5.6 is definitely a giant step forward in the right direction. Simon North. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From majayg at iitk.ac.in Mon Nov 9 13:50:46 1998 From: majayg at iitk.ac.in (Ajay Gangwar) Date: Mon Jun 7 17:06:21 2004 Subject: Converting HTML to well formed XML Message-ID: <006501be0be7$78770710$52a21090@cse.iitk.ernet.in> (I apologise if you see this twice. The first mail that I sent was bounced back to me midway.) I need to convert HTML document to well formed XML document. Can someone please guide me how to go about doing this? Are any utilities available for this? - Ajay Gangwar majayg@iitk.ac.in xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Mon Nov 9 14:05:37 1998 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:06:21 2004 Subject: Converting HTML to well formed XML In-Reply-To: <006501be0be7$78770710$52a21090@cse.iitk.ernet.in> Message-ID: <199811091405.JAA12602@hesketh.com> At 07:17 PM 11/9/98 +0530, Ajay Gangwar wrote: >I need to convert HTML document to well formed XML >document. 
Can someone please guide me how to go
>about doing this? Are any utilities available for this?
>
>- Ajay Gangwar
> majayg@iitk.ac.in

I'm in the process of converting my site (http://www.simonstl.com) to well-formed syntax, though still using an HTML vocabulary. I'm keeping sort of a diary at http://www.simonstl.com/projects/html2xml/ - it lists some helpful utilities for cleaning up the code, like Dave Raggett's TIDY, and XML.com's RUWF well-formedness checker.

Unfortunately, no one yet (so far as I know) has created a friendly one-step legacy HTML->well-formed XML syntax HTML converter. That's a nice opportunity there for some publicity, if not necessarily $$$...

Of course, if you want to use non-HTML vocabulary, you're talking about something much larger than syntax, and I'd recommend investing in a book - my _XML: A Primer_, John Simpson's _Just XML_, or Elliotte Rusty Harold's _XML: Extensible Markup Language_ to get started.

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer
Cookies / Sharing Bandwidth (November)
Building XML Applications (December)
http://www.simonstl.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From deke at tallent.com Mon Nov 9 14:38:11 1998
From: deke at tallent.com (Deke Smith)
Date: Mon Jun 7 17:06:21 2004
Subject: XML and Internationalization...
Message-ID: <1301522280-41559049@tallent.com>

Sebastien Sahuc, ssahuc@netaway.com said on 11/9/98 4:57 AM:

>Hi there,
>
>I wonder if there is any specifications available in making XML documents
>multilingual.
>
>For instance, I have a xml document that specifies some operations available
>to the user (the selection of the operation to call is done through a GUI).
>As a small description is associated to each operation, it should be useful
>to give the user this description in the custom language.
>
>My first thought will be the following:
>...
>
>
> The factorial method
> La méthode factorielle
> La operación factorial
>
>
> The number to pass in
> Le nombre à passer
> El número a calcular
>
>
>
>...
>
>Does anybody have a suggestion around the syntax? Any comment would be
>greatly appreciated.
>

I would suggest one change to your syntax: there is a standard tag for language specification in XML. That tag is "xml:lang" and it uses ISO 639 language codes. You can find the language codes at < http://www.isoc.org:8080/langues/iso639.fr.htm>. Your sample would look like:

...
The factorial method
La méthode factorielle
La operación factorial
The number to pass in
Le nombre à passer
El número a calcular
...

In the XML spec it talks a little bit more about language specification:

A MUCH more complicated translation format can be found from LISA at: . This may or may not be overkill for what you are doing.

Deke

-----------------------------------------------------------------
Deke Smith
Tallent Communications Group, Brentwood TN
deke@tallent.com, 615-661-9878
-----------------------------------------------------------------
" The best way to predict the future is to invent it. " - Alan Kay

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Mon Nov 9 14:55:22 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:21 2004
Subject: XML and Internationalization...
In-Reply-To: <1301522280-41559049@tallent.com>
References: <1301522280-41559049@tallent.com>
Message-ID: <13894.65527.147246.745448@localhost.localdomain>

Deke Smith writes:

> I would suggest one change to your syntax: there is a standard tag
> for language specification in XML. That tag is "xml:lang" and it
> uses ISO 639 language codes. You can find the language codes at <
> http://www.isoc.org:8080/langues/iso639.fr.htm>.

One important point to note is that by itself, the 'xml:lang' attribute simply indicates the language of the content and attribute values -- it does not suggest that sibling elements with different xml:lang values either are or are not equivalents in other languages. For example, I could have

Montréal London Roma

where the three elements are clearly not alternate versions of the same thing.

>
>
> The factorial method
> La méthode factorielle
> La operación factorial
>
>
> The number to pass in
> Le nombre à passer
> El número a calcular
>
>
>
> ....

This example is perfectly reasonable, as long as you remember that the idea that the description elements represent the same thing in different languages is derived from the document type (or vocabulary), not from the 'xml:lang' attribute itself. As far as 'xml:lang' is concerned, you could just as easily have

The méthode factorial

where all three make up the same description.
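
The distinction can be sketched in a few lines of Python using the standard-library xml.etree.ElementTree parser. The parser only reports the xml:lang value on each element; the rule that sibling description elements are translations of one another lives in the application. The element names `operation` and `description`, the `pick_description` helper, and the fall-back-to-English policy are illustrative assumptions, not the original poster's markup.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical vocabulary: one <description> per language under <operation>.
DOC = """\
<operation name="factorial">
  <description xml:lang="en">The factorial method</description>
  <description xml:lang="fr">La méthode factorielle</description>
  <description xml:lang="es">La operación factorial</description>
</operation>
"""


def pick_description(operation, lang, default="en"):
    """Return the description text for `lang`, else the `default` language.

    Treating the siblings as equivalent translations is this application's
    convention; xml:lang itself only labels the language of each element.
    """
    by_lang = {d.get(XML_LANG): d.text for d in operation.findall("description")}
    return by_lang.get(lang, by_lang.get(default))


operation = ET.fromstring(DOC)
print(pick_description(operation, "fr"))
print(pick_description(operation, "de"))  # no German entry, falls back to "en"
```

A different vocabulary could just as well concatenate the siblings into one mixed-language description; nothing in the parser output would change, only the application rule.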
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tgraham at mulberrytech.com Mon Nov 9 16:10:25 1998 From: tgraham at mulberrytech.com (Tony Graham) Date: Mon Jun 7 17:06:21 2004 Subject: XML and Internationalization... In-Reply-To: <1301522280-41559049@tallent.com> References: <1301522280-41559049@tallent.com> Message-ID: At 9 Nov 1998 08:30 -0600, Deke Smith wrote: > A MUCH more complicated translation format can be found from LISA at: > . This may or may not be overkill for what you are doing. While you may find concepts from the TMX work that are useful to you, TMX stands for Translation Memory eXchange, and is concerned with importing and exporting portions of translation memory -- phrases that have been translated once and saved so they don't need to be translated again -- between translation tools. TMX has structures for parallel portions of text in multiple languages, but there is no concept that these chunks of text can, should, or will string together to make a coherent "document", in anybody's sense of the word. The only markup in a TMX document, which is in XML, is concerned with delimiting and identifying the parallel chunks of text for the purposes of the translation tool: other markup from the source document may be saved in the TMX document (with significant XML characters escaped with entities) but only as a translation aid for those tools that can use it. Regards, Tony Graham ====================================================================== Tony Graham mailto:tgraham@mulberrytech.com Mulberry Technologies, Inc. 
http://www.mulberrytech.com 17 West Jefferson Street Direct Phone: 301/315-9632 Suite 207 Phone: 301/315-9631 Rockville, MD 20850 Fax: 301/315-8285 ---------------------------------------------------------------------- Mulberry Technologies: A Consultancy Specializing in SGML and XML ====================================================================== xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mda at discerning.com Mon Nov 9 17:15:34 1998 From: mda at discerning.com (Mark D. Anderson) Date: Mon Jun 7 17:06:21 2004 Subject: XML and Internationalization... Message-ID: <05b301be0c03$536389a0$0200a8c0@mdaxke.mediacity.com> >One important point to note is that by itself, the 'xml:lang' >attribute simply indicates the language of the content and attribute >values -- it does not suggest that sibling elements with different >xml:lang values either are or are not equivalents in other languages. > [...] > > > > The number to pass in > > Le nombre ? passer > > El n?mero a calcular > > Good point. Is there some way this could be done with XLink/XPointer, whereby string resources could be kept externally, so that one file has this: and another file (or possibly the same file) has the definitions of the string for the different languages. Obviously, an xml app could define whatever semantics it likes, but i'm wondering if there is enough mechanism in the standards to do it at parse time. Ideally, xml:link and show could be defaulted, and the href would only have to have the id ("#Nbr"). a binding from "above" would set xml:lang and set the base URL for the strings file. 
XPointer would know to seek out the lang=fr variant of the pointed-to resource if xml:lang was set to "fr". I'm sure all my specifics are screwed up, but hopefully you get the idea. Next, I want to inline a CDATA english string to act as default if i don't find a language-specific one externally.... -mda xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul.spencer at boynings.demon.co.uk Mon Nov 9 18:08:11 1998 From: paul.spencer at boynings.demon.co.uk (Paul Spencer) Date: Mon Jun 7 17:06:21 2004 Subject: XML and IE5 beta PR2 In-Reply-To: Message-ID: <000001be0c0b$e89672a0$020110ac@office> >From: Jon.Bosak@eng.Sun.COM (Jon Bosak) >Date: Thu, 5 Nov 1998 20:13:06 -0800 > >It's my impression that Microsoft sees XSL simply as a way to do tag >transformation and that their strategy for XML display is to use XSL >to transform XML tags to HTML tags. This means, of course, that you >will not be able to use Microsoft tools that support XSL to do any >formatting more complex than what can be expressed using HTML+CSS. I agree that the current IE5 implementation of XSL only does tag transformation. However, unless I have misunderstood CSS, I don't think your conclusion follows. I have used IE5 XSL to do reasonably complex formatting of XML documents, including taking elements out of order, and displaying images (URL in the XML) with hyperlinks to documents (also with their URL in the XML). With a bit of judicious pre-processing of the XML using the DOM, I also display another image a number of times according to a number in the XML document. I don't think I could do any of this with CSS. 
My understanding is that CSS only lets you apply styles to what is there, in the order it is in the XML document, XSL lets you move elements around and style them, but you need the DOM to manipulate the PCDATA. This gives three layers of functionality, each increasing in complexity of implementation. Correct? Mind you, I don't think MS is correctly implementing the August 18 draft, but that is another matter, and I could be wrong. Paul Spencer Boynings Consulting Tel: +44 (0)1628 687010 Fax: +44 (0)1628 687011 http://www.boynings.demon.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From deke at tallent.com Mon Nov 9 18:45:43 1998 From: deke at tallent.com (Deke Smith) Date: Mon Jun 7 17:06:21 2004 Subject: XML and Internationalization... Message-ID: <1301506998-42478233@tallent.com> david@megginson.com, david@megginson.com said on 11/9/98 8:54 AM: >One important point to note is that by itself, the 'xml:lang' >attribute simply indicates the language of the content and attribute >values -- it does not suggest that sibling elements with different >xml:lang values either are or are not equivalents in other languages. >For example, I could have > > > Montréal > London > Roma > > Tony Graham, tgraham@mulberrytech.com said on 11/9/98 10:09 AM: >While you may find concepts from the TMX work that are useful to you, >TMX stands for Translation Memory eXchange, and is concerned with >importing and exporting portions of translation memory -- phrases that >have been translated once and saved so they don't need to be >translated again -- between translation tools. 
TMX has structures for >parallel portions of text in multiple languages, but there is no >concept that these chunks of text can, should, or will string together >to make a coherent "document", in anybody's sense of the word. The >only markup in a TMX document, which is in XML, is concerned with >delimiting and identifying the parallel chunks of text for the >purposes of the translation tool: other markup from the source >document may be saved in the TMX document (with significant XML >characters escaped with entities) but only as a translation aid for >those tools that can use it. I have created phrase "substitution" scripts in Frontier and XML and ran into the same problem. I wanted to be able to "translate" phrases or words for use in multi-lingual Websites. It translates in the roughest sense: "Hello World!"=="?Hola Mundo!"=="?Bonjour Monde!". I created my own translation DTD (I don't know of simple ones that may exist) -- and I think it shows, as David pointed out, that XML only provides a framework and the processing program has to provide an additional amount of structure not found in the DTD. Under my dirty little DTD (built by necessity), the "Hello World!" example would be: ?Bonjour Monde! ?Hola Mundo! Hallo Welt! This is a private DTD, so in my little world I know that the ID attribute of the PHRASE element equals the text nodes of the TRANSLATION elements. It would be asking too much of XML to enforce this structure. TMX does provide this sort of function and structure, doesn't it? Here is how I would translate the previous example in TMX:

Hello world! ?Bonjour Monde! ?Hola Mundo! Hallo Welt! Here's my question: As I understand it, TMX is a format for translation "dictionaries" -- or lists of equivalent words, phrases, sentences or paragraphs in different languages. TMX also allows the preservation of formating within phrases, such as boldface, italic, etc. I always judge tools by what *I* need from them and that is what I need from TMX. Is it meant to do more than what I have asked it to do? Is this "dictionary" concept something TMX is *meant* for? I am under the impression that TMX can also have embedded "macros" within phrases. By "macro", I mean processing commands that may be understood only by a specific scripting language. Am I right? Deke ----------------------------------------------------------------- Deke Smith Tallent Communications Group, Brentwood TN deke@tallent.com, 615-661-9878 ----------------------------------------------------------------- " The best way to predict the future is to invent it. " - Alan Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Nov 9 18:47:15 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:22 2004 Subject: Scheme (was: Walking the DOM) References: <3.0.32.19981103102617.00afddb0@pop.intergate.bc.ca> <363F55AB.AEA4EB19@locke.ccil.org> Message-ID: <364738C3.5652BA70@locke.ccil.org> Russ McManus wrote: > My understanding is that R5RS includes 'do' in the language, which > is a primitive, iterative construct. "do" is iterative but not primitive: it is defined in R5RS using a high-level macro (now the only kind of macro), using tail iteration. 
The *primitive* constructs are application, "if", "set!", "lambda", and variable reference.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From Sung_Nguyen at datacard.com Mon Nov 9 18:48:10 1998
From: Sung_Nguyen at datacard.com (Sung Nguyen)
Date: Mon Jun 7 17:06:22 2004
Subject: IE50 - XML Examples
Message-ID: <0015722D.3096@datacard.com>

Hi Chris and all:

The following steps have been done on my NT4.0 workstation:

0. Install IE5.0 Beta
1. Install SP3 for Visual C++ 5.0
2. Install latest Microsoft Platform SDK (Sept. 98)
3. Download from http://www.microsoft.com/gallery/samples/download/first.htm

STILL, I cannot link the example. Errors as follows:

G:\mssdk\lib\kernel32.lib fatal error LNK1106: invalid file or disk full: cannot seek to 0x35cd5580
G:\mssdk\lib\ole32.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x35cb1cf1
Error executing link.exe.

Can you see something that I missed, or do I need to do something else to get the samples running? Or do I have to use the NT5.0 beta?
Please help,
SeanN

______________________________ Reply Separator _________________________________
Subject: RE: IE50 - XML Examples
Author: Chris Lovett at Internet
Date: 11/7/98 8:34 PM

You need the IE5 SDK which you can download from:
http://www.microsoft.com/gallery/samples/download/first.htm

-----Original Message-----
From: Sung_Nguyen@datacard.com [mailto:Sung_Nguyen@datacard.com]
Sent: Friday, November 06, 1998 6:10 PM
To: Tyler Baker; Fernando Cabral
Cc: Tim Bray; len bullard; xml-dev@ic.ac.uk; Barclay Rockwood
Subject: IE50 - XML Examples

Hi all:

I have trouble linking the following example (downloaded from the Microsoft site). This is the link error I got:

"G:\mssdk\lib\ole32.lib : fatal error LNK1106: invalid file or disk full: cannot seek to 0x35cb1cf1"

I got the two files "mshtml.h" and "msxml.h" from the Internet SDK with IE4.0. I don't know if they are still compatible with IE5.0, or where I can get new "mshtml.h" and "msxml.h" for IE5.0.

Please point me in the right direction,
Thanks a lot,
SeanN

--------------------------------------
//////////////////////////////////////////////////////////////////////////////
// Sample1.cxx: XML Object Model Sample 1
//--------------------------------------------------------------------------
// Copyright (c) 1998 Microsoft Corporation. All Rights Reserved.
//
// THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF
// ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO
// THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
// PARTICULAR PURPOSE.
//--------------------------------------------------------------------------
//////////////////////////////////////////////////////////////////////////////
#include
#include
#include
#include
#include
#include
#include
#include "mshtml.h"
#include "msxml.h"
#include

#define CHECKHR(x) hr = x; if (FAILED(hr)) goto Cleanup;
#define SAFERELEASE(p) if (p) {(p)->Release(); p = NULL;} else ;

//////////////////////////////////////////////////////////////////////////////
// Synopsis: Create an IXMLElement of type t
//////////////////////////////////////////////////////////////////////////////
IXMLElement* CreateXMLElement(IXMLDocument* pDoc, XMLELEM_TYPE t)
{
    IXMLElement* e;
    VARIANT type;
    type.vt = VT_I4;
    V_I4(&type) = t;
    VARIANT name;
    name.vt = VT_BSTR;
    V_BSTR(&name) = ::SysAllocString(L"ElementNode");
    HRESULT hr = pDoc->createElement(type, name, &e);
    ::SysFreeString(V_BSTR(&name));
    return e;
}

//////////////////////////////////////////////////////////////////////////////
// Synopsis: Create an XML Document from Scratch in memory
//////////////////////////////////////////////////////////////////////////////
HRESULT MemDocument()
{
    IXMLDocument *pDoc = NULL;
    IStream *pStm = NULL;
    IPersistStreamInit *pPSI = NULL;
    IXMLElement *enode = NULL, *el = NULL;
    IXMLElement *root = NULL;
    LARGE_INTEGER li = {0, 0};
    HRESULT hr = S_OK;
    int i, j;

    // Create an empty XML document
    CHECKHR(CoCreateInstance(CLSID_XMLDocument, NULL, CLSCTX_INPROC_SERVER,
                             IID_IXMLDocument, (void**)&pDoc));

    // Query the IPersistStreamInit interface
    CHECKHR(pDoc->QueryInterface(IID_IPersistStreamInit, (void **)&pPSI));

    // Create an IStream
    CHECKHR(CreateStreamOnHGlobal(NULL, TRUE, &pStm));
    pStm->AddRef();

    //
    // Create an xml document with a root element
    //
    ULONG ulWritten;
    CHECKHR(pStm->Write("", strlen(""), &ulWritten));

    // load the xml document
    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Load(pStm));

    // get root element
    CHECKHR(pDoc->get_root(&root));

    //
    // Create an xml document in memory
    //
    for (i = 10; i > 0; i--)
    {
        enode = CreateXMLElement(pDoc, XMLELEMTYPE_ELEMENT);
        CHECKHR(root->addChild(enode, -1, -1));
        for (j = 10; j > 0; j--)
        {
            el = CreateXMLElement(pDoc, XMLELEMTYPE_ELEMENT);
            CHECKHR(enode->addChild(el, -1, -1));
            SAFERELEASE(el);
        }
        SAFERELEASE(enode);
    }
    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Save(pStm, TRUE));

    //
    // Load the document from the in-memory stream
    //
    CHECKHR(pStm->Seek(li, STREAM_SEEK_SET, NULL));
    CHECKHR(pPSI->Load(pStm));

Cleanup:
    SAFERELEASE(root);
    SAFERELEASE(el);
    SAFERELEASE(enode);
    SAFERELEASE(pPSI);
    SAFERELEASE(pStm);
    SAFERELEASE(pDoc);
    return hr;
}

int _cdecl main(int argc, char *argv[])
{
    HRESULT hr = S_OK;
    CoInitialize(NULL);
    hr = MemDocument();
    CoUninitialize();
    return hr == 0 ? 0 : 1;
}

From db at Eng.Sun.COM Mon Nov 9 19:08:52 1998
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:06:22 2004
Subject: XML and IE5 beta PR2
References: <000001be0c0b$e89672a0$020110ac@office>
Message-ID: <36473C9A.2E09D648@eng.sun.com>

Paul Spencer wrote:
>
> Mind you, I don't think MS is correctly implementing the August 18
> [XSL] draft, but that is another matter, and I could be wrong.
With "xsl:eval", "xsl:script" and many other proprietary extensions, yet without even "xsl:process-children", I think you're clearly right. W3C's XSL pattern syntax is there, but not a lot else.

This would have been a good time to use the namespace mechanism, for example "msxsl:" for those extensions, and "xsl:" for W3C features.

- Dave

From Sung_Nguyen at datacard.com Mon Nov 9 20:03:33 1998
From: Sung_Nguyen at datacard.com (Sung Nguyen)
Date: Mon Jun 7 17:06:22 2004
Subject: Commercial XML Parser in C++
Message-ID: <001573DA.3096@datacard.com>

Hi:

Does anyone use a commercial XML parser in C++? I need one.

Thanks a lot,
SeanN

From smangat at Adobe.COM Tue Nov 10 01:48:28 1998
From: smangat at Adobe.COM (Satwinder Mangat)
Date: Mon Jun 7 17:06:22 2004
Subject: WebDAV
In-Reply-To: <0015722D.3096@datacard.com>
Message-ID: <000f01be0c4b$62c87f20$8f9e2099@dsds3.corp.adobe.com>

Hi,

What is the website for WebDAV?

Thanks
Satwinder Mangat

From jtauber at jtauber.com Tue Nov 10 02:06:12 1998
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:06:22 2004
Subject: WebDAV
Message-ID: <002001be0c4e$2c0833b0$4a850786@ecn08.curtin.edu.au>

-----Original Message-----
From: Satwinder Mangat

>What is the website for WebDAV?

http://www.ics.uci.edu/~ejw/authoring/

James

From liamquin at interlog.com Tue Nov 10 04:42:33 1998
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun 7 17:06:22 2004
Subject: Last Call issued on initial stylesheet linking draft
In-Reply-To: <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID:

With respect to http://www.w3.org/TR/1998/WD-xml-stylesheet-19981001 ...

This mail has (1) a general plea, followed by some numbered suggestions/corrections to the draft.

It seems to me that the mechanism for linking between an XML document and a style sheet is exactly the same as linking between an XML document and its DTD or Schema, and exactly the same as linking to a namespace, or to a related image, or almost any other link.

I think the processing instruction is unfortunate: XLink should be used. If XLink is not powerful enough to do this, fix it. If XLink is too complex to do something this simple, fix it.
If XLink should be replaced by or (as I suspect) merged with RDF, fine, use RDF.

Style and behaviour/Action sheets are in many ways very similar to the schema definitions one might expect to find at the other end of a Namespace URI, too.

Tim (or the group for whom Tim posted) clearly realised the inadequacy of the processing instruction syntax, as the message was mostly an apology for using it. Please reconsider.

If you go ahead with the draft as it stands,

[1] please can you add an example of right-to-left text in a style sheet title, and show how to indicate the language? That isn't clear to me.

[2] there is an implication in the production PseudoAttValue [3] that general entity references to the five predefined internal general entities are to be expanded inside the text of processing instructions. This is not part of XML 1.0 behaviour (see Section 2.6 in XML 1.0, where the content of a processing instruction is simply
    [16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
) with no allowance for entity references to be understood (although they are not forbidden).

The "Associating Stylesheets with XML documents" draft needs to say whether this new treatment of processing instructions is
2a) not the case -- &lt; is simply the 4 characters &, l, t, ; inside a processing instruction, as now;
2b) undefined, as with the present draft (I think through oversight)
2c) optional depending on the processor
2d) required in PIs but forbidden, optional or undefined elsewhere
2e) required retroactively in all conforming XML software in all PIs
2f) some other interpretation
I'm listing the obvious interpretations to show why it needs to be clarified.

[3] please can we avoid the normative reference to html 4.0 in the first paragraph of section 1? It might make it harder for a future version of html to be based on xml, as you'd end up with a circular dependency :) I don't think this is a big deal, though, and I don't think the other references to html 4 matter as much, if at all.
I'd prefer to see a note that pointed out the similarity, and then a duplication of the appropriate text, I think.

[4] in production [2], the space should be S? I assume, as should the trailing one in production [1] -- none of the examples match the grammar as it stands.

[5] can the style sheet have an associated language? Does that make sense??

[6] Finally, give an example of a Link: header to make the mapping clear.

I hope these comments help.

Lee

--
Liam Quin, GroveWare Inc., Toronto; The barefoot programmer
l i a m q u i n at i n t e r l o g dot c o m
irc.technonet.net::ankle5 irc.dal.net::ankle{MD}

From rbourret at ito.tu-darmstadt.de Tue Nov 10 07:53:06 1998
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:06:22 2004
Subject: Full Disclosure: What The XML Processor Must Tell The XML Application
Message-ID: <01BE0C86.DBD2AA40@grappa.ito.tu-darmstadt.de>

Robert C. Lyons wrote:

> List item #3 should be changed to the following (the changes are in bold):
>
> 3. An XML processor must pass the single character in place
> of or appearing in its input. (2.11)

Section 2.11 states, "To simplify tasks of applications, wherever an external parsed entity or the literal entity value of an internal parsed entity contains either the literal two-character sequence "#xD#xA" or a standalone literal #xD, an XML processor must pass to the application the single character #xA. (This behavior can conveniently be produced by normalizing all line breaks to #xA on input, before parsing.)" These two sentences seem to contradict each other.
The first talks about internal and external parsed entities. The second talks about all line breaks, which seems to imply the document entity as well. As far as I can tell, the document entity is a parsed entity (this is never actually stated, but strongly implied and the only reasonable conclusion), but is neither internal nor external. I assume that the second sentence is the correct one, as this is the only reasonable behavior from the application's point of view.

My point in bringing this up is not to throw a wrench into the works, but just to add this to the list of minor errata.

-- Ron Bourret

From ht at cogsci.ed.ac.uk Tue Nov 10 09:32:29 1998
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 17:06:22 2004
Subject: Commercial XML Parser in C++
In-Reply-To: Sung_Nguyen@datacard.com's message of "Mon, 9 Nov 1998 13:57:46 -0600"
References: <001573DA.3096@datacard.com>
Message-ID:

See http://www.ltg.ed.ac.uk/software/xml/ for a description and licence terms for our C (not C++ as such) XML parser, API and toolkit.

ht
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/

From anderst at toolsmiths.se Tue Nov 10 14:48:23 1998
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:06:22 2004
Subject: Endtag attributes
Message-ID: <36474321.DE4EA5A3@toolsmiths.se>

Hi,

As a part of a small R&D project I'm trying to look into the issues of elements vs. attributes and having attributes in endtags (I know it's outside the current XML standard).

But I'm having trouble finding any work being presented around pros and cons with endtag attributes.

Is there anyone that could point me to research/papers/discussion on the net? This would help me enormously.

I'm especially interested in arguments for having endtag arguments.

Thanks,
Anders

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From majayg at iitk.ac.in Tue Nov 10 15:53:46 1998
From: majayg at iitk.ac.in (Ajay Gangwar)
Date: Mon Jun 7 17:06:22 2004
Subject: Converting HTML to well formed XML
Message-ID: <002e01be0cc1$b93bad70$52a21090@cse.iitk.ernet.in>

Self:
>I need to convert HTML document to well formed XML
>document. Can someone please guide me how to go
>about doing this? Are any utilities available for this?
Simon:
>Unfortunately, no one yet (so far as I know) has created
>a friendly one-step legacy HTML->well-formed XML
>syntax HTML converter. That's a nice opportunity there for
>some publicity, if not necessarily $$$...
>Of course, if you want to use non-HTML vocabulary,
>you're talking about something much larger than
>syntax, [...snip...]

Actually, what I was expecting when I posted that question was some pointers from the 'gurus' on how to go about implementing such a converter (i.e. a discussion of the issues involved).

Help ;-(

- Ajay Gangwar
majayg@iitk.ac.in

From larsga at ifi.uio.no Tue Nov 10 15:59:05 1998
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:06:22 2004
Subject: Endtag attributes
In-Reply-To: <36474321.DE4EA5A3@toolsmiths.se>
References: <36474321.DE4EA5A3@toolsmiths.se>
Message-ID:

* Anders W. Tell
|
| As a part of a small R&D project I'm trying to look into the issues
| of elements vs. attributes and having attributes in endtags
| (I know it's outside the current XML standard).
|
| Is there anyone that could point me to research/papers/discussion on
| the net?
I doubt that there is anything, but if there is you can find it at

Some cons I thought of:

- additional complexity in APIs and syntaxes, or, if these are kept as they are, in parsers

- endtag attributes are not available until the entire element has been parsed (and it may be _very_ large), so that if the values influence the treatment of the element contents in any way, this may cause performance problems

The only pro I can think of is pretty far-fetched:

- might be easier to generate markup from some kinds of applications, since attribute values would not have to be determined until after the element contents had been written out

--Lars M.

From tbray at textuality.com Tue Nov 10 16:24:55 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:22 2004
Subject: Converting HTML to well formed XML
Message-ID: <3.0.32.19981110082051.00b3a5d0@pop.intergate.bc.ca>

At 09:19 PM 11/10/98 +0530, Ajay Gangwar wrote:
>Self:
>>I need to convert HTML document to well formed XML
>>document. Can someone please guide me how to go
>>about doing this? Are any utilities available for this?

There are a variety of HTML parsers in perl available on CPAN. Once you've parsed some HTML, emitting a WF XML version ought to be trivial. -Tim

From Steven_DeRose at Brown.edu Tue Nov 10 16:30:45 1998
From: Steven_DeRose at Brown.edu (Steve DeRose)
Date: Mon Jun 7 17:06:22 2004
Subject: Endtag attributes
In-Reply-To: <36474321.DE4EA5A3@toolsmiths.se>
Message-ID:

At 3:31 PM -0400 11/9/98, Anders W. Tell wrote:
>
>As a part of a small R&D project I'm trying to look into the issues
>of elements vs. attributes and having attributes in endtags
>(I know it's outside the current XML standard).
>
>But I'm having trouble finding any work being presented around
>pros and cons with endtag attributes.
>
>Is there anyone that could point me to research/papers/discussion on
>the net?
>This would help me enormously.
>
>I'm especially interested in arguments for having endtag arguments.

There is a detailed discussion of the semantics and design of attributes vs. elements in my book "The SGML FAQ Book", that you may find helpful (along with discussions of nearly every other controversial question about SGML and XML).

Common practice is that attributes represent information relevant to an element as a whole -- not to a tag. And in XML, tags aren't commonly considered first-class objects: they're a syntactic way of labelling the actual objects, namely elements; so giving them their own properties seems to me a bit out of keeping with the model.

Given that, it would not fit to have different attributes on the end-tag than on the start-tag. On the other hand, having the same attributes on both ends would be redundant.

One argument I have heard for them was that if you had them you might be able to parse backwards.
But I have never found a good occasion to do that, and if in fact you can parse XML backwards otherwise, it hardly matters whether the parser finds the attributes first or last anyway.

If you allow them, you have additional semantics to define (the syntax is obviously trivial), like what it means if you have the same name (with the same or different values) on both ends of an element, etc. etc.

I don't know of an analogous precedent in other languages either (though perhaps there's one out there somewhere). Thus, I think you'll find little discussion in the literature.

I've seen them show up in HTML as typos. Perhaps you could describe specifically what it is you think they would be useful for.....

Steven_DeRose@Brown.edu; http://www.stg.brown.edu/~sjd
Visiting Chief Scientist, Scholarly Technology Group, and Adjunct Associate Professor, Brown University; Chief Scientist, Inso Corporation

From ricko at allette.com.au Tue Nov 10 16:59:25 1998
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:06:22 2004
Subject: Endtag attributes
In-Reply-To: <36474321.DE4EA5A3@toolsmiths.se>
Message-ID: <000201be0ccb$abd35200$73e987cb@NT.JELLIFFE.COM.AU>

> From: Anders W. Tell

> I'm especially interested in arguments for having endtag arguments.

One good reason pro is because when you are doing text processing using a streaming tool (i.e. one that does not load the document in memory) sometimes you wish to write out attributes with values collected from processing sub elements.

Consider an application which reads a table and adds an attribute with the column and row count.
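A minimal sketch of that table case may help. It assumes a hypothetical event stream: the `Event` type, the `countTable` function, and the element names `table`, `row`, and `cell` are invented for illustration and belong to no particular parser or tool.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical streaming events; a real event-based tool would deliver
// something similar through callbacks.
enum class Kind { Start, End };

struct Event {
    Kind kind;
    std::string name;  // element name
};

struct TableCounts {
    int rows = 0;
    int cols = 0;  // widest row seen
};

// Walks the events of one table in a single pass. The counts are complete
// only once the last row has ended, which is after the table's start-tag
// would already have been written to the output.
TableCounts countTable(const std::vector<Event>& events) {
    TableCounts counts;
    int cellsInRow = 0;
    for (const Event& ev : events) {
        if (ev.kind == Kind::Start && ev.name == "row") {
            cellsInRow = 0;  // a new row begins
        } else if (ev.kind == Kind::Start && ev.name == "cell") {
            ++cellsInRow;
        } else if (ev.kind == Kind::End && ev.name == "row") {
            ++counts.rows;
            counts.cols = std::max(counts.cols, cellsInRow);
        }
    }
    return counts;
}
```

In a single pass the values exist only after the table's content has gone by, so they cannot be placed on a start-tag that has already been emitted.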
This is a kind of forward-reference problem. In streaming text processing tools you:

* have two passes, or
* write out the values of all the attributes in an external entity file (so that the XML parser resolves the references on next document load), or
* have a built-in reference resolving phase such as OmniMark uses.

Until recently (and for all I know, maybe still) most text processing of non-HTML SGML/XML marked-up files was done using streaming tools (i.e. with event programming).

The trouble with having end-tag attributes for this kind of use is that even though it would be easier to generate such documents, an end-tag attribute solution does not buy you anything, because if you are using streaming tools you will still need the "collected value" attributes when you come to the start-tag, in order to set up processing of the rest of the element. You still need some step to move the end-tag attribute to the start-tag.

And, as has been mentioned, maybe people who are used to tag-based text processing rather than element (i.e. range) based processing will not be encouraged to alter their thinking, if end-tag attributes were allowed.

Perhaps the one thing that end-tag attributes might be useful for is checksums, though.

Rick Jelliffe

From anderst at toolsmiths.se Tue Nov 10 17:04:51 1998
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:06:22 2004
Subject: Endtag attributes
References:
Message-ID: <364762DE.1AC6B1C7@toolsmiths.se>

Steve DeRose wrote:

> At 3:31 PM -0400 11/9/98, Anders W. Tell wrote:
> >
>
> I've seen them show up in HTML as typos.
> Perhaps you could describe
> specifically what it is you think they would be useful for.....

The use case that interests me is the class of attributes which cannot be created without accessing all children, such as summations/aggregates.

This may be important when accessing XML data through stream interfaces such as SAX. It may prove inefficient to process a large section of the stream before actually writing the attribute to the next stream in a chain of streams. So adding attributes at the end allows developers to keep a pure stream programming model without sacrificing performance.

One of the most interesting attributes is "Signature", which cannot be calculated (written) before the sub-fragment has been processed. It is also impossible to validate (read) a Signature before the endtag arrives.

/anders

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From anderst at toolsmiths.se Tue Nov 10 17:16:13 1998
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:06:23 2004
Subject: Endtag attributes
References: <000201be0ccb$abd35200$73e987cb@NT.JELLIFFE.COM.AU>
Message-ID: <364765C0.37BEC7F3@toolsmiths.se>

Rick Jelliffe wrote:

> > From: Anders W. Tell
>
> > I'm especially interested in arguments for having endtag arguments.
>
> One good reason pro is because when you are doing text processing using a
> streaming tool (i.e. one that does not load the document in memory)
> sometimes you wish to write out attributes with values collected from
> processing sub elements.
>
> Consider an application which reads a table and adds an attribute with the
> column and row count.
>
> This is a kind of forward-reference problem. In streaming text processing
> tools you:
>
> * have two passes, or
> * write out the values of all the attributes in an external entity file (so
> that the XML parser resolves the references on next document load), or
> * have a built-in reference resolving phase such as OmniMark uses.
>
> Until recently (and for all I know, maybe still) most text processing of
> non-HTML SGML/XML marked-up files was done using streaming tools (i.e. with
> event programming).

It is possible to save parts of the document in memory before writing them to the next stream in a chain. This may not be desirable when documents are large.

> The trouble with having end-tag attributes for this kind of use is that even
> though it would be easier to generate such documents, an end-tag attribute
> solution does not buy you anything, because if you are using streaming tools
> you will still need the "collected value" attributes when you come to the
> start-tag, in order to set up processing of the rest of the element. You
> still need some step to move the end-tag attribute to the start-tag.

Not all attributes that are calculated after processing a fragment are usable before all other attributes and children have been processed. An example of this type of attribute is "Signature", which may be read when the start-tag token arrives, but validation of the signature cannot be performed until the endtag arrives.

However, I agree with you that most attributes don't add anything when placed in the end-tag.

> And, as has been mentioned, maybe people who are used to tag-based text
> processing rather than element (i.e. range) based processing will not be
> encouraged to alter their thinking, if end-tag attributes were allowed.

True,

>
> Perhaps the one thing that end-tag attributes might be useful for is
> checksums, though.
And its alter ego Signatures :)

/Anders

/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From nwoh at software-ag.de Tue Nov 10 17:42:47 1998
From: nwoh at software-ag.de (Nigel Hutchison)
Date: Mon Jun 7 17:06:23 2004
Subject: XML and Internationalization...
In-Reply-To: <1301506998-42478233@tallent.com>
Message-ID: <3.0.6.32.19981110184224.00938920@pophost.software-ag.de>

I think one of the issues in the internationalisation of SGML/XML/HTML is as follows. Internationalising a document means creating it so that it is subsequently very convenient to localise. That raises a question: if I give my SGML product documentation to professional translators to be translated into German, say, how do they know which parts are to be translated and which are to be left alone? For example, if one of them finds
Then the translator at least knows what he has to do.

Nigel W. O. Hutchison
Technical Consultant
Software AG Germany
mailto:nwoh@software-ag.de
Tel +49 (0)6151 92 1207 *

From peter at ursus.demon.co.uk Tue Nov 10 18:51:15 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:06:23 2004
Subject: XLink - where are we? [tiny amount of frustration]
In-Reply-To:
References: <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID: <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk>

As moderator I try to keep neutral on the direction that the W3C takes, so please excuse a smallish bit of frustration about the non-development and non-use of XLink. If you think I'm out of line, I'll shut up.

At 23:42 09/11/98 -0500, Liam R. E. Quin wrote:
>
>I think the processing instruction is unfortunate: XLink should
>be used. If XLink is not powerful enough to do this, fix it.
>If XLink is too complex to do something this simple, fix it.

I share this sentiment in general. I think XLink is one of the most exciting things about the family of X*L. I think we are suffering too much complexity, and foozle **simply because XLink STILL isn't ready**. I have designed my DTDs (CML and VHG) with the idea that there will be a linking mechanism similar to the current XLink. XLink (as XML-Link) was first announced about 18 months ago as far as I remember and it was certainly reasonable to expect that it would be in common use by now. The fact that it is not and we have no indication of timescales has - I suspect - disillusioned many people.
I still keep the faith, but it's hard, especially when there are virtually no engines even in prototype that we can play with.

In passing I think that a serious drawback of the W3C's approach is that there is no incentive for anyone to experiment in public view. *If* we had been experimenting with XLink, namespaces and the rest then we would have a lot more useful experience to go on by now. In fact I think innovation and exploration in XML is suffering in comparison with the development of HTML. "Plan to throw the first one away - you will anyway" (Fred Brooks). I wish there was more encouragement for the enthusiasts to develop the first one.

I acknowledge that this somewhat frustrating plea may spring from the fact of my no longer having access to the W3C deliberations and - in retrospect - I extend my sympathy to the brave souls on this list who have always been in this position. It can look very bleak from the outside. The *perceived* messages [i.e. the electronic body language] are:

- it's not worth non-W3C members trying to get involved. [If we had taken this view we wouldn't have SAX and we wouldn't have XSchema].
- it's not worth minnows (like the XSchema group, me, etc.) trying to do anything because MS/NS/XYZPQR/W3C are going to do it anyway. [Same comment - the movement on XML-type, XML-data, etc. has been almost retrograde.]
- XML is only for companies and individuals shouldn't really be involved.

>
>If XLink should be replaced by or (as I suspect) merged with RDF,
>fine, use RDF.

This appals me. Is it true? likely? Because if so, XLink will be (a) delayed another year or so (b) end up as so complicated that it will be unusable. I continue to hold that RDF is too complicated for most people and will reduce rather than encourage the use of metadata and linking. [Look how apparently simple namespaces are and how slow the development appears to be other than to identify the vendor of the software - hardly a semantic revolution.]
Having slightly flamed off, I would like the replies to be constructive - not "what is wrong", but "can this be got moving again in practice"?. One thing at least is that XML-DEV can move reasonably quickly. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Tue Nov 10 18:51:17 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:06:23 2004 Subject: XSchema Message-ID: <3.0.1.16.19981110195101.09ef254a@pop3.demon.co.uk> I have spent the last week or so hacking XSchema into JUMBO and testing it out. Firstly to say many thanks to those who have contributed and especially to Ron Bourret, Simon StL and John Cowan [I hope I haven't missed anyone] for driving this through. It is exactly what is wanted at present. The mapping onto the DTD is precise and essential for me. Without it I suspect that it would be more difficult to implement correctly. I have fitted it to JUMBO with the functions: - offering a list of potential child elements for a given element. [This doesn't yet do on-the-fly content validation and I doubt it will. It gives a JComboBox of the possible children. If someone has a Model validation routine I can fit it.] This allows XSchema-drive editing of element names and structure. - offering a table of attributes with the various options (Required, Value, etc.) driving the editability of various fields. By implication, of course, the editor can be used to edit XSchemas. 
I have used RonB's DTD2XSchema translator - works fine, but I think Ron said it needed slight tweaks. ** Is there a way of capturing the external DTD subset in SAX, or AElfred, or any other parser? It really only needs to capture the URL? Do I have to subclass an Entity handler? In this way it would be possible to read in the DTD automatically, transform to XSchema and display.] ** ** Is there any way to extract the complete "DTD" - i.e. external and internal subsets (but without parameter/general entities) from SAX/AElfred/XYZPQR? ** I hope to release the XSchema-aware JUMBO (i.e. a structure/attribute-aware XML editor) very shortly so people can verify if I have got the mechanics right and whether this is something useful. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Tue Nov 10 19:14:20 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:06:23 2004 Subject: Software for markup? (was Re: XML Search Engine) In-Reply-To: <13889.62092.676344.644070@localhost.localdomain> References: <3.0.32.19981105094725.009209b0@pop.intergate.bc.ca> <3.0.32.19981105094725.009209b0@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19981110201402.1bef8c00@pop3.demon.co.uk> I have what (I hope) is an easier problem and would be grateful for pointers to existing software that can be used to mark up XML documents [primarily in English]. 
We have developed an XML representation (VHG) for controlled vocabularies which are widely used for "encoding" reports and other information. These vocabularies have about 10K terms each, consisting of single words or phrases (e.g. "acne", "Asian cholera", etc.). [Our current examples are dictionaries of disease terms from world authorities.] The [pharmaceutical] industry/regulators spend much time in "encoding" (== markup). In general the dictionaries do not have indexHeadings, stemming, pronunciations, US/Eur variants etc. (though we are hoping to promote the communal capture of such knowledge through WWW-based collaboration). We wish to be able to mark up such terms as automatically as possible, e.g.

The patient developed acne

could be transformed to

The patient developed acne
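
As a rough illustration of the dictionary-driven markup asked about here, the following is a minimal Python sketch. The element name `term`, the `id` attribute, the identifiers and the two-entry vocabulary are all assumptions made for the example; they are not taken from VHG, and a real vocabulary of some 10K terms would probably want a faster matcher than one large regular expression.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical controlled vocabulary: term -> identifier (illustrative only).
VOCABULARY = {
    "acne": "d001",
    "Asian cholera": "d002",
}


def mark_up(text, vocabulary):
    """Wrap each vocabulary term found in `text` in a <term> element.

    The element and attribute names are assumptions for this sketch.
    """
    # Try longer terms first so multi-word phrases win over shorter ones.
    terms = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    lookup = {t.lower(): ident for t, ident in vocabulary.items()}

    pieces = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the untouched text so the result stays well-formed XML.
        pieces.append(escape(text[pos:match.start()]))
        ident = lookup[match.group(0).lower()]
        pieces.append(
            "<term id=%s>%s</term>" % (quoteattr(ident), escape(match.group(0)))
        )
        pos = match.end()
    pieces.append(escape(text[pos:]))
    return "".join(pieces)


if __name__ == "__main__":
    print(mark_up("The patient developed acne", VOCABULARY))
    # prints: The patient developed <term id="d001">acne</term>
```

Matching is case-insensitive and purely literal; stemming, variants and synonyms (which the message notes the dictionaries lack) are deliberately left out.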

The current encoding procedures are not complex (often manual) and I'm not looking for rocket science (e.g. no automatic extraction of terms, concept analysis, etc.). I want to avoid re-inventing the wheel because I assume lots of people have already done this.

TIA

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From tbray at textuality.com Tue Nov 10 19:16:45 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:23 2004
Subject: XLink - where are we? [tiny amount of frustration]
Message-ID: <3.0.32.19981110111212.00b43950@pop.intergate.bc.ca>

At 06:58 PM 11/10/98, Peter Murray-Rust wrote:
>I share this sentiment in general. I think XLink is one of the most
>exciting things about the family of X*L. I think we are suffering too much
>complexity, and foozle **simply because XLink STILL isn't ready**.

The good news is that there has been, since August, an XML Linking WG. The bad news is that so far it hasn't done much (the old story, resource constraints and people being real busy). I share Peter's opinion that XLink will be the next high-impact thing to come out of the XML process, and that this should happen sooner rather than later. -Tim

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ralph at fsc.fujitsu.com Tue Nov 10 20:11:53 1998 From: ralph at fsc.fujitsu.com (Ralph Ferris) Date: Mon Jun 7 17:06:23 2004 Subject: XLink - where are we? [tiny amount of frustration] In-Reply-To: <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> References: <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> At 06:58 PM 11/10/98, Peter Murray-Rust wrote: >I have designed my DTDs (CML and VHG) with the idea that there will be a >linking mechanism similar to the current XLink. XLink (as XML-Link) was >first announced about 18 months ago as far as I remember and it was >certainly reasonable to expect that it would be in common use by now. The >fact that it is not and we have no indication of timescales has - I suspect >- disillusioned many people. I still keep the faith, but it's hard, >especially when there are virtually no engines even in prototype that we >can play with. Apparently, my announcement of HyBrick 0.8 on this list - and several others - has been completely overlooked by yourself and the other people holding this discussion. HyBrick 0.8 is a working implementation of XLink/XPointer available *now*. You can download the English-language menu version from our site at FSC: http://collie.fujitsu.com/HyBrick/. Japanese-language readers should go to http://www.fujitsu.co.jp/hypertext/free/HyBrick/download2.html Best regards, Ralph E. Ferris Fujitsu Software Corporation xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From travis at betty.cnidr.org Tue Nov 10 21:55:46 1998
From: travis at betty.cnidr.org (K. Travis Walsh)
Date: Mon Jun 7 17:06:23 2004
Subject: Oracle XML
Message-ID:

Oracle officially announced its XML strategy today.

For those of you who didn't know, there is a Java parser built into this new release of Oracle 8i.

They also made their XML web site live:

www.oracle.com/xml

So you can see what they have now, and what they are going to have in the future.

The initiative to have XML support came from inside developers looking for solutions to internal problems, so it's an honest effort, not just a me-too patch.

Wonderful things are happening!

**********************
* K. Travis Walsh *
* MCNC / CNIDR *
* Programmer / Analyst *
* travis@mcnc.org *
**********************

My Quote of the day: I took a speed-reading course and read War and Peace in twenty minutes. It involves Russia. -Woody Allen

From peter at ursus.demon.co.uk Tue Nov 10 22:57:21 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:06:23 2004
Subject: XLink - where are we?
[tiny amount of frustration] In-Reply-To: <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> References: <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19981110235158.1b479670@pop3.demon.co.uk> At 03:07 11/11/98 -0500, Ralph Ferris wrote: > >Apparently, my announcement of HyBrick 0.8 on this list - and several >others - has been completely overlooked by yourself and the other people >holding this discussion. HyBrick 0.8 is a working implementation of No - I was aware of the work that you and others have been doing. And many thanks for it. I was also aware that there is a reconstituted (== revitalised??) XLink-WG. What sparked me off was the implication - real or not - that XLink might undergo yet another redraft/latency period. And also - unless I'm out of touch - the absence of documents to work with. We have a vicious circle here which we have to try and break: - no authoring tools ==> no documents - no documents ==> no XML-aware tools - no XML-aware tools ==> no real *interoperable* applications - no applications ==> no drive for authoring tools My concern is that there are a small number of us - including yourselves - that are tackling small parts of the problem. We need ways of bringing this together. SAX, Xschema, various companies, and others including myself have offered code as OpenSource - this helps. But we don't have critical mass. **In particular we are not exploring interoperability**. By this I mean that person A sends a document to person B in a different organisation and B tries to work with the document. Not easy. But that's what our real aim is, surely? [Documents can, of course include data and other things like molecules, music, vector graphics, etc.] Until we do that we aren't exploring and innovating in XML. We are simply looking at ways of passing pseudo-paper over the wire. As I have bemoaned before, virtually no-one is yet passing XML over the wire. P. 
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Nov 10 23:54:28 1998 From: david at megginson.com (david@megginson.com) Date: Mon Jun 7 17:06:23 2004 Subject: Endtag attributes In-Reply-To: <364762DE.1AC6B1C7@toolsmiths.se> References: <364762DE.1AC6B1C7@toolsmiths.se> Message-ID: <13896.53527.214735.361579@localhost.localdomain> Anders W. Tell writes: > So adding an attributes at the end allows the developers to keep a > pure stream programming model without sacrificing performance. Yes, but what you save in the writing, you lose in the reading: a SAX client reading your XML element would have to buffer the whole contents until it got to the end tag to make certain that there were no important attributes there. A specific version of a specific XML document is written only once, but it can be read many times; when you are forced to choose between adding complexity to writing or reading, writing should end up drawing the shorter straw. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ralph at fsc.fujitsu.com Wed Nov 11 01:04:52 1998
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:06:23 2004
Subject: XLink - where are we? [tiny amount of frustration]
In-Reply-To: <3.0.1.16.19981110235158.1b479670@pop3.demon.co.uk>
References: <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID: <3.0.5.32.19981111080054.00a0cae0@pophost.fsc.fujitsu.com>

At 11:51 PM 11/10/98, Peter Murray-Rust wrote:
>
>What sparked me off was the implication - real or not - that XLink might
>undergo yet another redraft/latency period.

All the more reason to mention HyBrick. We're not starting from square one. We have a working implementation that should be seen as a baseline. I have, in fact, just sent the "reconstituted" WG a very pointed message about *not* ignoring Fujitsu and HyBrick. I have also volunteered to be an editor for the XPointer spec., in which capacity I - as Fujitsu's representative - will oppose any attempts to take this subject off into the wild blue yonder.

> And also - unless I'm out of
>touch - the absence of documents to work with.

I have been thinking of posting a message with a "request for content." These would be documents that I would include with the HyBrick distribution. Of course, these documents have to have publicly distributable DTDs and DSSSL stylesheets. Unfortunately, though, the "traditional" DSSSL folks seem bent on ignoring HyBrick as well.
Current users write messages to the DSSSL list asking for advice on how to output RTF from Jade, while the developers have abandoned DSSSL altogether for the siren-song of XSL.

>
>We have a vicious circle here which we have to try and break:
> - no authoring tools ==> no documents

In my view, the abandonment of DSSSL by the very experts who created it is at least partially responsible for this. Last year at SGML/XML '97, we demonstrated early work on a GUI style sheet editor for DSSSL. We showed it to some of the WKLs (Well-Known Luminaries), who said "looks cool" - and went off to do XSL. The problem is, what "ecological niche" does XSL occupy? For the basics - and a lot more - you can go the XML transform to HTML/CSS route. DSSSL supports power users. In fact, it supports basic users maybe as well as any other method. It's just that users were told they need to know Scheme to use DSSSL, which isn't true. What is needed for style sheets is a good GUI tool. Instead, the WKLs preferred the more intellectually interesting job of defining yet another syntax. Given this context, we haven't pursued work on the DSSSL editor.

> As I have bemoaned before, virtually no-one is
> yet passing XML over the wire.

To keep this message relatively short, I'll just say that HyBrick does support retrieval of XML documents over the Web; some folks have been able to retrieve the sample document by Eliot Kimber that's on our Web site. SP, which HyBrick runs on, doesn't support the use of proxy servers, however, so the document is inaccessible to many others. We're looking into this, with an eye to supporting proxy servers in the next release. That doesn't address the general issue that you're raising, of course. Hopefully, though, it will be a start.

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From peter at ursus.demon.co.uk Wed Nov 11 01:15:18 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:06:23 2004
Subject: XLink - where are we? [tiny amount of frustration]
In-Reply-To: <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com>
References: <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID: <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk>

At 03:07 11/11/98 -0500, Ralph Ferris wrote:
>
>Apparently, my announcement of HyBrick 0.8 on this list - and several
>others - has been completely overlooked by yourself and the other people
>holding this discussion. HyBrick 0.8 is a working implementation of
>XLink/XPointer available *now*. You can download the English-language menu
>version from our site at FSC: http://collie.fujitsu.com/HyBrick/.

It is - as far as I know - the most advanced XLink implementation. I have some problems with HyBrick and I'd be grateful for clarification on certain points. These aren't meant to detract from what has been done.

(a) it appears to be primarily an SGML tool rather than an XML one. The DTDs shipped with the distribution all have minimisation controls (or whatever the "- o" is called). When I try to run any of my XML files it throws errors on the files.

(b) the XML files appear to reference these SGML (non-XML) DTDs so although it may be valid SGML it doesn't seem to be valid XML. I suspect it may need a catalog since one is shipped with it.

(c) it appears to require all applications to have a stylesheet - i.e.
there is no default display and it's therefore primarily for rendering XML as text with hyperlinks.

(d) the license restricts use to evaluation only which means it is not really usable for experimentation - especially without an API, etc. For example, I cannot demonstrate it to students.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From donpark at quake.net Wed Nov 11 02:43:26 1998
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:06:23 2004
Subject: Converting HTML to well formed XML
Message-ID: <000c01be0d1c$df7f7470$2ee044c6@arcot-main>

>Unfortunately, no one yet (so far as I know) has created
>a friendly one-step legacy HTML->well-formed XML
>syntax HTML converter. That's a nice opportunity there for
>some publicity, if not necessarily $$$...

Docuverse HTML SDK can be used to build such a converter easily. What it contains is a SAX parser interface to Swing's HTML parser, which means you can use your XML tools on HTML documents. However, Swing's HTML parser mishandles unknown tags, so it is not a perfect solution. You can find HTML SDK at http://www.docuverse.com/htmlsdk

Best,

Don Park
Docuverse

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jswleung at hkbu.edu.hk Wed Nov 11 03:11:43 1998 From: jswleung at hkbu.edu.hk (Josef Siu-wai Leung) Date: Mon Jun 7 17:06:23 2004 Subject: Embedding figures Message-ID: Hi, I would like to learn about how to show images into any bottom level elements of XML in IE5 Beta. I think I need to add an element "Figure*" (which indicates a filename) into every original bottom level element. Would you please share with me a working XML/XSL example. Cheers, Josef xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ralph at fsc.fujitsu.com Wed Nov 11 03:14:33 1998 From: ralph at fsc.fujitsu.com (Ralph Ferris) Date: Mon Jun 7 17:06:24 2004 Subject: XLink - where are we? [tiny amount of frustration] In-Reply-To: <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> References: <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981111100820.00a00630@pophost.fsc.fujitsu.com> At 02:14 AM 11/11/98, Peter Murray-Rust wrote: >(a) it [HyBrick] appears to be primarily an SGML tool rather than an XML > one. The >DTDs shipped with the distribution all have minimisation controls (or >whatever the "- o" is called). When I try to run any of my XML files it >throws errors on the files. 
HyBrick runs on SP and Jade, which support "full" SGML. To work within the syntax defined by the XML spec., the SGML declaration for XML has to be called. At the moment, this has to be done explicitly. If you look in the XML-sample directory, you'll see that the catalog file contains the entry: SGMLDECL "..\dtd\xml.dcl" In a future release, this will be done "automatically" when the XML processing instruction is encountered. > >(b) the XML files appear to reference these SGML (non-XML) DTDs so although >it may be valid SGML it doesn't seem to be valid XML. I suspect it may need >a catalog since one is shipped with it. See previous comment. > >(c) it appears to require all applications to have a stylesheet - i.e. >there is no default display and it's therefore primarily for rendering XML >as text with hyperlinks. Yes - a DSSSL style sheet must be provided. > >(d) the license restricts use to evaluation only which means it is not >really usable for experimentation - especially without an API, etc. For >example, I cannot demonstrate it to students. I'm not sure what you mean here. You're right that it's a binary distribution, which does not expose an API for developers. I would think though that what you would be trying to demonstrate to students would be functionality, and you can certainly do that. Best regards, Ralph E. Ferris Fujitsu Software Corporation xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Wed Nov 11 03:24:10 1998 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:06:24 2004 Subject: XLink - where are we? 
[tiny amount of frustration] References: <3.0.32.19981110111212.00b43950@pop.intergate.bc.ca> Message-ID: <364902AE.34A3@hiwaay.net> Tim Bray wrote: > > The good news is that there has been, since August, an XML Linking > WG. The bad news is that so far it hasn't done much (the old story, > resource constraints and people being real busy). I share Peter's > opinion that XLink will be the next high-impact thing thing to come > out of the XML process, and that this should happen sooner rather > than later. -Tim Me too. When advocating XML in other communities such as VRML, XLink is cited as one of the advantages of XML. Not being part of the deliberations, I am uncertain as to what the delays are given that the original drafts were promising and on the surface, quite clear. For the work I am doing, an XLink model would be useful if for nothing else, the properties, just as the RDF Dublin Core has been and XSchema may be. One benefit of suddenly being forced to do everything relationally is the insight into some of the designs floating about. So, please: 1. A standard for primitive/core datatypes 2. XLinks ASAP, len xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Wed Nov 11 03:46:25 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:06:24 2004 Subject: Oracle XML References: Message-ID: <3649088C.3EADA703@infinet.com> "K. Travis Walsh" wrote: > Oracle officially anounced today it's XML strategy. > > For those of you who didn't know there is a java parser built into this > new release of Oracle 8i. 
> > They also made thier XML web site live: > > www.oracle.com/xml > > So you can see what they have now, and what they are going to have in the > future. > > The initiative to have XML support came from inside developers looking for > solutions to internal problems, so it's an honest effort. Not just a me > too patch. > > Wonderful things are happening! It is nice to see that they are supporting SAX as well. I think the question now with XML tools in regard to SAX is "who does not support SAX"? Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cathy at bd748.pku.edu.cn Wed Nov 11 03:52:26 1998 From: cathy at bd748.pku.edu.cn (Chang Ming) Date: Mon Jun 7 17:06:24 2004 Subject: Web pages in non-Roman scripts Message-ID: <002201be0d26$d58da220$1c7669a2@cathy.bd748.pku.edu.cn> Hi, With this mail, i attach a small chinese xml file and its css file, all in utf-8. When i display this xml in IE5 Beta 2, it does not recognize the css. The document is displayed in plain text style, and all start tags is left displaying with content. I have tried coding CSS in GB-2312, still it gives no effect.I will try other encoding later. I want to know what kind of encoding should CSS use? Is there a need to add an encoding attribute in the style sheet PI? Another question: if i want use Chinese in markup, is UNICODE the only choice? Thank you, Chang Ming Computer Sci. and Tech. 
Institute of Peking University

[uuencoded attachments doc.xml and doc.css omitted]

References: <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID: <3.0.5.32.19981110221631.008ec900@amati.techno.com>

At 02:14 AM 11/11/98, Peter Murray-Rust wrote:
>(c) it appears to require all applications to have a stylesheet - i.e.
>there is no default display and it's therefore primarily for rendering XML
>as text with hyperlinks.

Not trying to be difficult, but how could any useful form of default presentation other than running everything together or making every element a separate block be provided by a DTD-inspecific tool?
I don't know of any generalized SGML tool that provides any useful sort of default styling. The best is Arbortext's Document Architect, which provides a sort of wizard by which it asks a bunch of questions and makes some informed guesses and then generates a style sheet that can be a reasonable starting place (it can also be more trouble than it's worth depending on what sort of DTD you happen to have). Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
From Suli.Ding at geis.ge.com Wed Nov 11 04:40:46 1998
From: Suli.Ding at geis.ge.com (Ding, Suli (GEIS))
Date: Mon Jun 7 17:06:24 2004
Subject: converting document to XML
Message-ID:

The UNIX version of "doc2xml" is available from http://www.geocities.com/SiliconValley/Platform/4871/

A new command line option "-e" has been added to suppress empty elements. For example, will not be produced to the output file. Another command line option "-o" can be specified for the output file name. For example

    doc2xml -tedi.tbl -e -oXML.out isa.edi

will produce the XML document in file "XML.out".

Regards,
Suli Ding

From suhas at india.tek.com Wed Nov 11 08:43:50 1998
From: suhas at india.tek.com (Suhas Joshi)
Date: Mon Jun 7 17:06:24 2004
Subject: embedding "behviour" in XML document
Message-ID: <36494BD2.79666333@kiwi.india.tek.com>

Hi,

Could anyone tell me how a complete object (including *behaviour*) can be sent/received using XML? Any pointers will help (I am a newbie to XML).

Thanks & Regards,
Suhas.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From anderst at toolsmiths.se Wed Nov 11 08:55:01 1998
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun 7 17:06:24 2004
Subject: Endtag attributes
References: <364762DE.1AC6B1C7@toolsmiths.se> <13896.53527.214735.361579@localhost.localdomain>
Message-ID: <364841D2.617349EF@toolsmiths.se>

david@megginson.com wrote:

> Anders W. Tell writes:
>
> > So adding an attributes at the end allows the developers to keep a
> > pure stream programming model without sacrificing performance.
>
> Yes, but what you save in the writing, you lose in the reading: a SAX
> client reading your XML element would have to buffer the whole
> contents until it got to the end tag to make certain that there were
> no important attributes there.

Yes and no. Since end-tag attributes are an extension to current XML, the above problem may be offset by adding an end-tag marker to the attribute declarations in the DTD, DCD, or XSchema. That way applications know that a specific attribute may/must occur in the end tag.

>
> A specific version of a specific XML document is written only once,
> but it can be read many times; when you are forced to choose between
> adding complexity to writing or reading, writing should end up drawing
> the shorter straw.

That depends on the use case in question. Certainly on the internet there is probably an 80-20 (read-write) balance for *static* documents, but more and more applications/servers generate documents or fragments on demand with no possibility of caching. Dynamic queries are one example. Another use case is inter-process communication such as WebRPC.
Here *documents* are generated on the fly and discarded on reception, and caching many documents/requests/replies may not be a viable solution.

As far as complexity goes, the only change is to add the StartTag APIs to EndTags.

I'm not proposing to add end-tag attributes, just trying to understand the implications if the feature were available.

Best,
/anders

--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

From rbourret at ito.tu-darmstadt.de Wed Nov 11 12:42:21 1998
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:06:24 2004
Subject: XSchema
Message-ID: <01BE0D78.74BCA7F0@grappa.ito.tu-darmstadt.de>

Peter Murray-Rust wrote:

> I have fitted it to JUMBO with the functions:
> - offering a list of potential child elements for a given element. [This
> doesn't yet do on-the-fly content validation and I doubt it will. It gives
> a JComboBox of the possible children. If someone has a Model validation
> routine I can fit it.] This allows XSchema-drive editing of element names
> and structure.

If I ever find time to write a validation module, on-the-fly validation should be a freebie. I figure the validator will be a SAX parser filter initialized with an XSchema document. The application (e.g. Jumbo) hooks this to a real parser, registers an ErrorHandler, and passes the stream of XML to be validated; the return value from parse(), along with any values sent to the ErrorHandler, tells the application whether validation succeeded or not.
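
The filter arrangement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the later SAX2 class XMLFilterImpl (not the SAX of this thread), and the RULES table of allowed child names is a hypothetical stand-in for a real XSchema document. The class and method names are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class ChildCheckFilter extends XMLFilterImpl {
    // Hypothetical stand-in for an XSchema document: parent name -> allowed child names.
    static final Map<String, Set<String>> RULES = Map.of("poem", Set.of("line"));

    private final Deque<String> open = new ArrayDeque<>(); // currently open elements
    private Locator locator;

    ChildCheckFilter(XMLReader realParser) {
        super(realParser); // the filter sits in front of a real parser
    }

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator;
        super.setDocumentLocator(locator);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        String parent = open.peek();
        Set<String> allowed = (parent == null) ? null : RULES.get(parent);
        if (allowed != null && !allowed.contains(qName)) {
            // XMLFilterImpl.error() forwards to the ErrorHandler the application registered.
            error(new SAXParseException(
                    "<" + qName + "> not allowed inside <" + parent + ">", locator));
        }
        open.push(qName);
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        open.pop();
        super.endElement(uri, localName, qName);
    }

    /** Application side: hook the filter to a real parser and collect reported errors. */
    public static List<String> check(String xml) {
        List<String> problems = new ArrayList<>();
        try {
            XMLReader real = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            ChildCheckFilter filter = new ChildCheckFilter(real);
            filter.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    problems.add(e.getMessage());
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(check("<poem><line>Roses are red,</line></poem>"));
        System.out.println(check("<poem><stanza/></poem>"));
    }
}
```

In SAX2, parse() returns nothing, so in this sketch the application learns the outcome only from what reaches the ErrorHandler (an empty list means no violations were reported).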
> I have used RonB's DTD2XSchema translator - works fine, but I think Ron > said it needed slight tweaks. Still needs updating to the latest spec. Works starts this weekend. > ** Is there a way of capturing the external DTD subset in SAX, or AElfred, > or any other parser? It really only needs to capture the URL? Do I have to > subclass an Entity handler? In this way it would be possible to read in the > DTD automatically, transform to XSchema and display.] ** > > ** Is there any way to extract the complete "DTD" - i.e. external and > internal subsets (but without parameter/general entities) from > SAX/AElfred/XYZPQR? ** I misread this at first. What Peter wants is either (a) the system ID of the external subset, or (b) the stream of bytes in the complete DTD. I assume the reason is so he can automatically feed it to my DTD-to-XSchema converter; currently, he would have to prompt the user for this information. Another alternative is to fix my converter so it can read the DTD directly from the XML file and discards the rest (it currently reads only external subsets). Is there any advantage to working only with external subsets? -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Wed Nov 11 16:22:28 1998 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:06:25 2004 Subject: Open-source XLink library (was Re: XLink - where are we?) 
In-Reply-To: <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> References: <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <199811111622.LAA12163@hesketh.com> After spending the last few months talking and talking about how amazingly cool XLink _could_ be, I've taken a few steps toward _doing_ things with XLink. The steps aren't lovely or stupendous (yet), but I hope that they might represent a step forward. The code is not yet ready for prime time, but the discussion on this list makes it fairly clear that there is a need for this kind of project. Working from the 3/3/98 Working Draft, I've put together a few small Java examples that use XLink and am at work on a library based on John Cowan's ParserFilter class that extracts the linking information from a document, allowing the application to process the document without having to deal with the issues involved in creating and managing links. Right now the library supports simple and extended links, but not hub groups or attribute remapping. I'm planning on adding legacy support for HTML's A element at some point in the future as well. The examples are from Building XML Applications, and were small, simple, one-off applets. They can be found at the bottom of http://www.simonstl.com/buildxml/. They use image maps (my handling of text link proved disastrous) and are unfortunately a bit slow as a result. They pretty much just show what some _very_ simple XLink applications might look like. The source code for these examples will be in the book, and (if I can get the necessary permissions from my editor) may appear online as well. The library is and will be open source, though at this point I haven't firmly decided on a license. (GPL is likely, in some slightly modified form.) 
I should emphasize that the library is in (fairly ugly) alpha form; the feature set is not yet nearly complete, especially the LinkSet class. The basic XLinkFilter has worked on the samples on which I've tested it, transforming multi-directional links into more workable sets of single direction links. The LinkSet is an extended Vector, but will probably change to something more flexible when Java 1.2 comes along. (I will try to keep a 1.1 version, however.) Information on the XLinkFilter and supporting code is at http://www.simonstl.com/projects/xlinkfilter/. This filter only handles XLink; XPointer is a necessary complement, but another large piece to deal with. Support for a DOM model in addition to SAX is another necessary piece, but one that I haven't yet begun. Documentation at present is weak, and the comments in the code aren't enough. They will improve, and I will add real JavaDoc comments as well. (Promise!) All contributions/modifications/suggestions on functionality, code, licensing, etc. are welcome. I'll be in and out for the next few days, but will be returning to more focused work on this library over the weekend. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer Cookies / Sharing Bandwidth (November) Building XML Applications (December) http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Curt.Arnold at hyprotech.com Wed Nov 11 17:01:43 1998 From: Curt.Arnold at hyprotech.com (Arnold, Curt) Date: Mon Jun 7 17:06:25 2004 Subject: Schema support in IE5 Message-ID: <61DAD58E8F4ED211AC8400A0C9B468730120FA@THOR> I've just browsed through the Schema support preview stuff on IE5 and was wondering whether it or the DCD note best represents Microsoft's thinking on Schema tags. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Nov 11 17:18:11 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:25 2004 Subject: Schema support in IE5 Message-ID: <3.0.32.19981111091752.00b4fdf0@pop.intergate.bc.ca> At 10:01 AM 11/11/98 -0700, Arnold, Curt wrote: >I've just browsed through the Schema support preview stuff on IE5 and was >wondering whether it or the DCD note best represents Microsoft's thinking on >Schema tags. Care to give us a report on what it looks like to someone without preconceptions? -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simpson at polaris.net Wed Nov 11 19:43:31 1998 From: simpson at polaris.net (John E. Simpson) Date: Mon Jun 7 17:06:25 2004 Subject: ID attribute defaults In-Reply-To: <3.0.32.19981111091752.00b4fdf0@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981111144155.010ea4d0@nexus.polaris.net> I understand why IDs don't typically have defaults, each one required to be unique. OTOH, what about a data-based (not database-based) XML app whose content model, especially at the top of the tree, consists largely of elements which may occur only once in an entire document instance? It seems to me that there's still potential value in assigning such elements a fixed "default" value -- primarily for purposes of internal cross-references. Tim Bray's annotated XML spec says (in the annotaton on production [56] regarding the IDREF validity constraint), "In general, it's a good idea to attach ID attributes to as many elements as possible in every document, because, later, if you decide you need to point at anything, you'll be happy if it has an ID attribute ready for use." Agreed. Later, in the discussion of the #FIXED attribute type (prod. [60]), he says: ....suppose, for example, that I insert a pointer here to my favorite piece of the XML spec. When I do this, I use an element named Sref (for Spec-ref), which becomes an A element in the HTML version. However, in all cases, I want that pointer to get another attribute target='spec'", so that when you use it, it takes effect in the left-hand frame, which is named spec. 
So that I don't have to enter that attribute in each Sref element, the internal subset contains this: This is exactly what I'd like to do. But this seems like something of a gimmick to reproduce the effect of the ID/IDREF attribute types, one which furthermore requires an app to interpret -- not leveraging the built-in cross-reference facility. What am I missing? Is there some way both to use ID/IDREF *and* not require the document author to supply IDs on her own? Thanks for any brainstorms, John E. Simpson ======================================================== John E. Simpson | It's no disgrace t'be poor, simpson@polaris.net | but it might as well be. http://www.flixml.org | -- "Kin" Hubbard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From joel at spooky.emcs.cornell.edu Wed Nov 11 19:57:33 1998 From: joel at spooky.emcs.cornell.edu (Joel Bender) Date: Mon Jun 7 17:06:25 2004 Subject: XML for Network Topology Message-ID: I'm interested in (1) a DTD for networking, specifically to describe a topology of networks, routers, ports, etc. If one doesn't already exist, contact me and we'll form a splinter group to create one (my guess would be to start with SNMP data and go from there). I am also interested in (2) a DTD for building systems...fans, pumps, valves, boilers, chillers, cooling towers, meters (electric, gas, water, sewer). My mission is to build a suite of applications that can be used for diagnostics (if I shut of pump #3, who will no longer get water? Building #6 lost power, which circuit is that?), analysis, modeling, etc. 
With "smart buildings" there is an increasing interconnection between components (air-conditioning, fire control, security).

Before trying to work on (2) it would be informative to work on (1). I think XML would be an excellent mechanism to transfer this structured data between tools.

Joel
------------------------------------------------------------------
Joel Bender                      Voice: 607-255-8880
Senior Programmer/Analyst        FAX:   607-255-5377
Utilities Department             131 Humphreys Service Building
Cornell University               Ithaca, NY 14853-3701
------------------------------------------------------------------

From xmldoubts at hotmail.com Wed Nov 11 20:14:17 1998
From: xmldoubts at hotmail.com (XML Doubts)
Date: Mon Jun 7 17:06:25 2004
Subject: Sun XML Lib
Message-ID: <19981111201248.7097.qmail@hotmail.com>

Hi,

Has anyone used the Sun XML library? My question is how to access the child nodes from the root. I have tried

    ....main(String args[]){
        ....
        XmlDocument doc;
        Element root = doc.getDocumentElement();
        System.out.println("\nRoot: "+root.getTagName());
        System.out.println("\nFirstChild:"+ (root.getFirstChild()).getNodeName());
        ...
    }

The first print statement prints the root element, but the second print statement prints #text, which is not what I want. I want to print the node name. For example, with the XML file Roses are red, the first print statement prints "poem" and the second prints #text. I want "line" to be printed. Also, how do I get the text inside the "line" tag?

This might be a trivial question, but I am just a beginner, so bear with me. Can anyone help? Thanks.
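
A likely cause of the "#text" result is that the white space between the root start tag and its first child element is itself a text node, so getFirstChild() returns that node rather than the element. The sketch below skips non-element children until it finds an element. It is written against the standard JAXP and W3C DOM interfaces rather than Sun's XmlDocument class, and the poem document is an assumed reconstruction, since the markup of the original example did not survive in the archive.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FirstElementChild {
    /** Returns the first child that is an element, skipping text and comment nodes. */
    public static Element firstElementChild(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null; // no element children at all
    }

    /** Parses the document and reports root name, first child element name, and its text. */
    public static String describe(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            Element child = firstElementChild(root);
            if (child == null) {
                return root.getTagName() + " has no element children";
            }
            // getTextContent() (DOM Level 3) concatenates the text inside the element.
            return root.getTagName() + "/" + child.getTagName() + ": " + child.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed document shape; the white space after <poem> is the "#text" node.
        System.out.println(describe("<poem>\n  <line>Roses are red,</line>\n</poem>"));
    }
}
```

With only the older DOM Level 1 methods, the text of the element can be read from its own first child, for example child.getFirstChild().getNodeValue(), provided the element contains a single text node.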
Bye ______________________________________________________ Get Your Private, Free Email at http://www.hotmail.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Nov 11 20:16:52 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:25 2004 Subject: ID attribute defaults Message-ID: <3.0.32.19981111120856.00ba33e0@pop.intergate.bc.ca> At 02:41 PM 11/11/98 -0500, John E. Simpson wrote: >I understand why IDs don't typically have defaults, each one required to be >unique. > >OTOH, what about a data-based (not database-based) XML app whose content >model, especially at the top of the tree, consists largely of elements >which may occur only once in an entire document instance? Yes, that's a plausible scenario in which a defaulted ID attribute might be useful. Sorry... XML rules it out. As for my having defaulted the "target" attribute in the annotated spec, the "target" is the *same* on every annotation, that's why it's defaulted. Nothing could be more different from an ID attribute. The annotation doesn't say (maybe should) that the thing that generates the HTML copies through any attributes it doesn't recognize, so the defaulted target="spec" ends up in every one of the hundreds of annotations. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From harvey at eccnet.eccnet.com Wed Nov 11 20:34:52 1998 From: harvey at eccnet.eccnet.com (Betty Harvey) Date: Mon Jun 7 17:06:25 2004 Subject: XML for Network Topology In-Reply-To: Message-ID: Hi Joel: The company I4I has a DTD they used as a demonstration of the S4 product to show networking. I am not sure how complete the DTD is or whether they would be willing to publicly provide the DTD. Your second question about DTDs for diagnostic systems has been created by DoD for Interactive Electronic Technical Manuals (MIL-PRF-87269). There are two different associated DTD's, one is a generic layer which controls hyperlinking. The other is a content layer which deals with the system/subsystem, etc. These DTDs are full-blown SGML DTDs and are rather complex. MIL-PRF-87269 is currently undergoing revision and is out for comments. One of the items on the revision is supporting XML data. The draft specification can be downloaded from ftp://eccnet.eccnet.com/pub/ietm/87269b.zip if interested. Also http://www.ietm.net contains information about and links to IETMs. Good luck. Betty /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/ Betty Harvey | Phone: 301-540-8251 FAX: 4268 Electronic Commerce Connection, Inc. | 13017 Wisteria Drive, P.O. Box 333 | Germantown, Md. 20874 | harvey@eccnet.eccnet.com | Washington,DC SGML Users Grp URL: http://www.eccnet.com | http://www.eccnet.com/sgmlug/ /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/ On Wed, 11 Nov 1998, Joel Bender wrote: > I'm interested in (1) a DTD for networking, specifically to describe a > topology of networks, routers, ports, etc. 
If one doesn't already exist, > contact me and we'll form a splinter group to create one (my guess would be > to start with SNMP data and go from there). > > I am also interested in (2) a DTD for building systems...fans, pumps, > valves, boilers, chillers, cooling towers, meters (electric, gas, water, > sewer). My mission is to build a suite of applications that can be used > for diagnostics (if I shut of pump #3, who will no longer get water? > Building #6 lost power, which circuit is that?), analysis, modeling, etc. > With "smart buildings" there is an increasing interconnection between > components (air-conditioning, fire control, security). > > Before trying to work on (2) it would be informative to work on (1). I > think XML would be an excellent mechanism to transfer this structured data > between tools. > > > Joel > ------------------------------------------------------------------ > Joel Bender Voice: 607-255-8880 > Senior Programmer/Analyst FAX: 607-255-5377 > Utilities Department 131 Humphreys Service Building > Cornell University Ithaca, NY 14853-3701 > ------------------------------------------------------------------ > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simpson at polaris.net Wed Nov 11 20:44:32 1998 From: simpson at polaris.net (John E. Simpson) Date: Mon Jun 7 17:06:25 2004 Subject: ID attribute defaults In-Reply-To: <3.0.32.19981111120856.00ba33e0@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981111154340.010f3ec0@nexus.polaris.net> Thanks for the reply, Tim. >Yes, that's a plausible scenario in which a defaulted ID attribute >might be useful. Sorry... XML rules it out. As I understood (but hoped otherwise -- or at least hoped that there'd be some trick to accomplish the same goal, without having to create app-specific attributes like Sref to do it). >As for my having defaulted the "target" attribute in the annotated >spec, the "target" is the *same* on every annotation, that's why >it's defaulted. Nothing could be more different from an ID attribute. Yeah, but.... In an element tree in which many elements at the top layer must be one-of-a-kind, the target would be the same for every cross-reference, too. So why not default it (or so I thought) with a fixed ID value? I hesitate to call this an idee fixe. Probably should have hesitated a bit longer. Thanks again. JES ======================================================== John E. Simpson | It's no disgrace t'be poor, simpson@polaris.net | but it might as well be. http://www.flixml.org | -- "Kin" Hubbard xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From Sung_Nguyen at datacard.com Wed Nov 11 21:19:05 1998
From: Sung_Nguyen at datacard.com (Sung Nguyen)
Date: Mon Jun 7 17:06:25 2004
Subject: Get encoding
Message-ID: <00159D4E.3096@datacard.com>

Hi:

How do I get the encoding type from an XML doc? I am using the Microsoft C++ XML parser that comes with IE4.0; there is no such "get_encoding()" message in the XMLDoc interface.

Thanks for your help,
SeanN

From DKACKMAN at agchem.com Wed Nov 11 21:41:21 1998
From: DKACKMAN at agchem.com (Don Kackman)
Date: Mon Jun 7 17:06:25 2004
Subject: DTD inheritance?
Message-ID: <8424EFA3C1F7D1118B7300A0C9C57C192084FC@mpl_nt9.agchem.com>

Hello,

I've got two document types with associated DTDs. The first is a subset of the second. I don't want to merge these two DTDs into one file but also don't want to maintain both files separately.

Is it possible to include one DTD in another through an external entity reference? If so, how?

Also, if so, can I declare additional attributes in the second DTD for elements originally declared in the first?

Thanks a lot,
Don Kackman
dkackman@agchem.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Sung_Nguyen at datacard.com Wed Nov 11 21:52:08 1998 From: Sung_Nguyen at datacard.com (Sung Nguyen) Date: Mon Jun 7 17:06:26 2004 Subject: ID attribute defaults Message-ID: <00159E10.3096@datacard.com> Hi Gurus: In the following XML file - What should I do to get the attributes SRC, WIDTH, HEIGHT and ALT of the element PREVIEW-SMALL? I would like to print out something like this: PREVIEW-SMALL ELEMENT has 3 ATTRIBUTES: SRC="images/burl-s.jpg" WIDTH="300" HEIGHT="194" ALT="Vase and Stones" Thanks you for any helps SeanNg sung_nguyen@datacard.com -------------------------------------------------------------------- Vase and Stones Linda Mann 20x30 inches Oil 1996 3300 John 1315 3200 Andrew 1308 3100 Chris 1307 3000 opening price 1298 1315 This is an example of mixed content xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Sung_Nguyen at datacard.com Wed Nov 11 22:08:28 1998 From: Sung_Nguyen at datacard.com (Sung Nguyen) Date: Mon Jun 7 17:06:26 2004 Subject: parsing the attributes Message-ID: <00159E6A.3096@datacard.com> <<<< Sorry about the previous post to the "ID attribute defaults" thread >> Hi Gurus: In the following XML file - What should I do to parse all attributes of the element PREVIEW-SMALL ? 
I would like to print out something like this: PREVIEW-SMALL ELEMENT has 3 ATTRIBUTES: SRC="images/burl-s.jpg" WIDTH="300" HEIGHT="194" ALT="Vase and Stones" Thanks you for any helps SeanNg sung_nguyen@datacard.com -------------------------------------------------------------------- Vase and Stones Linda Mann 20x30 inches Oil 1996 3300 John 1315 3200 Andrew 1308 3100 Chris 1307 3000 opening price 1298 1315 This is an example of mixed content xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Wed Nov 11 22:53:54 1998 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:06:26 2004 Subject: DTD inheritance? In-Reply-To: <8424EFA3C1F7D1118B7300A0C9C57C192084FC@mpl_nt9.agchem.com> References: <8424EFA3C1F7D1118B7300A0C9C57C192084FC@mpl_nt9.agchem.com> Message-ID: * Don Kackman | | I've got two document types with associated DTDs. The first is a subset | of the second. I don't want to merge these two DTDs into one file but | also don't want to maintain both files seperately. | | Is it possible to include one DTD in another through an external entity | reference? If so how? In file sub-dtd.dtd: [...lots of declarations...] %base-dtd; where the file base-dtd.dtd contains the declarations common to both DTDs. 
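
The inclusion technique can be tried out with a small program. The sketch below assumes that the declaration lost from the archived example was the usual external parameter entity declaration for base-dtd.dtd, followed by the %base-dtd; reference. Both DTD "files" are served from strings by an EntityResolver so the example is self-contained; the element name doc, the attribute origin, and the class name are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DtdInclusionDemo {
    // Assumed content of sub-dtd.dtd: declare the parameter entity, then reference it.
    static final String SUB_DTD =
            "<!ENTITY % base-dtd SYSTEM \"base-dtd.dtd\">\n"
          + "%base-dtd;\n";

    // Assumed content of base-dtd.dtd: the declarations common to both DTDs.
    static final String BASE_DTD =
            "<!ELEMENT doc (#PCDATA)>\n"
          + "<!ATTLIST doc origin CDATA \"base\">\n";

    /** Parses the document and returns the value of the defaulted "origin" attribute. */
    public static String originOf(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Serve both DTDs from memory instead of the file system.
            builder.setEntityResolver((publicId, systemId) -> {
                if (systemId != null && systemId.endsWith("base-dtd.dtd")) {
                    return new InputSource(new StringReader(BASE_DTD));
                }
                if (systemId != null && systemId.endsWith("sub-dtd.dtd")) {
                    return new InputSource(new StringReader(SUB_DTD));
                }
                return new InputSource(new StringReader(""));
            });
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("origin");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The document names only sub-dtd.dtd; the default attribute value can
        // only come from base-dtd.dtd, pulled in through the parameter entity.
        System.out.println(originOf("<!DOCTYPE doc SYSTEM \"sub-dtd.dtd\"><doc/>"));
    }
}
```

If the attribute default declared in base-dtd.dtd shows up on the parsed element, the parser has read the base declarations through the parameter entity reference in sub-dtd.dtd.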
You will probably have some problems with content models when you try this if sub-dtd.dtd contains element declarations. If so, it may be helpful for you to know that section 4.2 of the spec says: "If the same entity is declared more than once, the first declaration encountered is binding; at user option, an XML processor may issue a warning if entities are declared multiple times." This can be used to make 'hooks' in content models where sub-DTDs can insert the new elements they introduce. | Also if so, can I declare additional attributes in the second DTD | for elements originally declared in the first? Yes, the XML spec explicitly provides for this. In section 3.3 it says: "When more than one AttlistDecl is provided for a given element type, the contents of all those provided are merged." --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Nov 11 23:06:30 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:06:26 2004 Subject: XSchema In-Reply-To: <01BE0D78.74BCA7F0@grappa.ito.tu-darmstadt.de> Message-ID: <3.0.1.16.19981111205003.366f6a2a@pop3.demon.co.uk> At 13:37 11/11/98 +0100, Ronald Bourret wrote: > >If I ever find time to write a validation module, on-the-fly validation should be It'll be real fun - good for your soul... a freebie. I figure the validator will be a SAX parser filter initialized with an XSchema document. The application (e.g. 
>Jumbo) hooks this to a real parser, registers an ErrorHandler, and passes the Stream of XML to be validated; the return value from parse(), along with any values sent to the ErrorHandler, tells the application whether validation succeeded or not.

Sounds fine. I am personally not keen on the validate-after-every-keystroke approach and would expect that I would either work with a single element ("validate the children of this element" and/or "validate the children of the parent of this element") or "validate the whole document". If the former, then we simply pass the stream of XML from the node in question (JUMBO can extract this at present).

>
>> I have used RonB's DTD2XSchema translator - works fine, but I think Ron
>> said it needed slight tweaks.
>
>Still needs updating to the latest spec. Work starts this weekend.

Great.

>
>> ** Is there a way of capturing the external DTD subset in SAX, or AElfred,
>> or any other parser? It really only needs to capture the URL? Do I have to
>> subclass an Entity handler? In this way it would be possible to read in the
>> DTD automatically, transform to XSchema and display.] **
>>
>> ** Is there any way to extract the complete "DTD" - i.e. external and
>> internal subsets (but without parameter/general entities) from
>> SAX/AElfred/XYZPQR? **
>
>I misread this at first.

Probably my fault - was late at night.

>What Peter wants is either (a) the system ID of the external subset, or (b) the stream of bytes in the complete DTD. I assume the reason is so he can automatically feed it to my DTD-to-XSchema converter; currently, he would have to prompt the user for this information.

Exactly right. At present I can read in a complete foo.xsc and (in some mindless moment) bolt in Ron's tool so it can read foo.dtd. But these have to be actuated manually ("Edit | Read XSchema") on menu. And it will be an invalid approach if there is an internal subset.
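For option (a), the system ID of the external subset, here is a hedged sketch using the SAX2 extension interface LexicalHandler, whose startDTD callback reports the DOCTYPE name, public ID and system ID. SAX2 and org.xml.sax.ext postdate this thread, so this shows how the request can be met with today's JDK parser, not what SAX offered at the time. The class name `DtdSystemIdSketch` and the file name foo.dtd in the usage are illustrative assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class DtdSystemIdSketch {
    /** Returns the system ID declared in the DOCTYPE, or null if there is none. */
    static String externalSubsetSystemId(String xml) throws Exception {
        final String[] found = new String[1];
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void startDTD(String name, String publicId, String systemId) {
                found[0] = systemId; // system ID as reported by the parser
            }
        };
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Lexical events are delivered through a SAX2 property, not a setter.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.setContentHandler(handler);
        // Assumption for this sketch: we only want the ID, so an empty external
        // subset is served and nothing is fetched from disk or network.
        reader.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        reader.parse(new InputSource(new StringReader(xml)));
        return found[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
            externalSubsetSystemId("<!DOCTYPE poem SYSTEM 'foo.dtd'><poem/>"));
    }
}
```

An application could hand the captured ID to a DTD converter without prompting the user. Declarations in an internal subset are not covered by this; those would need the separate declaration-handler property.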
>
>Another alternative is to fix my converter so it can read the DTD directly from the XML file and discard the rest (it currently reads only external subsets). Is there any advantage to working only with external subsets?

There is no advantage to me in working with external subsets except that I can already do it :-)

many thanks Ron.

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Wed Nov 11 23:06:35 1998
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 17:06:26 2004
Subject: Open-source XLink library (was Re: XLink - where are we?)
In-Reply-To: <199811111622.LAA12163@hesketh.com>
References: <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID: <3.0.1.16.19981111205328.370f5328@pop3.demon.co.uk>

At 11:24 11/11/98 -0500, Simon St.Laurent wrote:
>After spending the last few months talking and talking about how amazingly
>cool XLink _could_ be, I've taken a few steps toward _doing_ things with
>XLink. The steps aren't lovely or stupendous (yet), but I hope that they
>might represent a step forward. The code is not yet ready for prime time,
>but the discussion on this list makes it fairly clear that there is a need
>for this kind of project.

Many thanks.
I haven't had time to look at this but this seems like a useful way of exploring what XLink can do although it's not an API or a library. Keep it going. I think we shall also need some XLink APIs and/or library routines to help with some of the processing required... P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Nov 11 23:06:40 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:06:26 2004 Subject: Oracle XML In-Reply-To: <3649088C.3EADA703@infinet.com> References: Message-ID: <3.0.1.16.19981111210812.366ffdf6@pop3.demon.co.uk> At 22:46 10/11/98 -0500, Tyler Baker wrote: >It is nice to see that they are supporting SAX as well. I think the question now with XML tools in regard to >SAX is "who does not support SAX"? Agreed - the SAXogenists on this list can take a warm feeling away. Remember that there were people who warned us "SAX is a waste of time" - this is a W3C activity... The same could be said of XSchema. There seems now to be no reasons why (after testing) we shouldn't see XSchema become widely used for authoring/editing/validating. It's there - it does a useful job - let's see it included in all the major tools. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Nov 11 23:06:39 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:06:26 2004 Subject: XLink - where are we? [tiny amount of frustration] In-Reply-To: <3.0.5.32.19981110221631.008ec900@amati.techno.com> References: <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> At 22:16 10/11/98 -0600, W. Eliot Kimber wrote: >At 02:14 AM 11/11/98, Peter Murray-Rust wrote: > >>(c) it appears to require all applications to have a stylesheet - i.e. >>there is no default display and it's therefore primarily for rendering XML >>as text with hyperlinks. > >Not trying to be difficult, but how could any useful form of default >presentation other than running everything together or making every element >a separate block be provided by a DTD-inspecific tool? I don't know of any >generalized SGML tool that provides any useful sort of default styling. The >best is Arbortext's Document Architect, which provides a sort of wizard by >which it asks a bunch of questions and makes some informed guesses and then >generates a style sheet that can be a reasonable starting place (it can >also be more trouble than it's worth depending on what sort of DTD you >happen to have). I am probably tilting at windmills but I am attempting something like this in JUMBO - I have about 5 approaches to styles without stylesheets. (a) redisplay it as raw XML. 
Not as silly as it sounds for many documents. (b) pretty-print it and display as XML. Extremely useful for many documents (c) Reformat start tags as bold and add NL after PCDATA. Works pretty well for many documents. (d) map every element onto a Java class. (e) allow the user to customise some or all elements with styles. I shall use Swing for the rendering. Then, I suppose I could write the styles out as XSL if anyone cares. I get the impression that many people regard the reformatting of XML documents for human readers as the only valid thing to do with XML. It's important, but it's not going to lead to innovation. I'd much rather see structured *graphics* being addressed than stylesheets. Far more exciting. - Yes I know there is movement in this area. I'd also like to see *some* movement on 'behaviour' - how to we create an interactive document rather than simply decide on the best way to send it to the printer (which will be 99% of the use of XSL). P. > >Cheers, > >E. >-- >
>W. Eliot Kimber, Senior Consulting SGML Engineer >ISOGEN International Corp. >2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 >www.isogen.com >
> >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Wed Nov 11 23:25:34 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:06:26 2004 Subject: Sun XML Lib In-Reply-To: <19981111201248.7097.qmail@hotmail.com> Message-ID: <3.0.1.16.19981112002428.366f5ce0@pop3.demon.co.uk> Dear XML Doubts, Unless I do you and your parents an injustice, "XML Doubts" is a pseudonym (and one at variance with an apparently useful contribution to the list). Whilst it is/was common in Past Times (e.g. "A Tangled Tale" by Lewis Carroll) (and today in MOOs), pseudonyms aren't really appropriate for XML-DEV. At 12:12 11/11/98 PST, XML Doubts wrote: >Hi, > >Has any one used the Sun XML library. My question is how to access the If your questions is about xml-ea1, yes - I have used it, but I can't answer the question. Your approach seems reasonable. > >And also how do I get the text inside the "line" tag. >This might be trivial question but I am just a beginner. So bear with >me. 
This is a perfectly reasonable question for XML-DEV as it relates to an Early Adopter (i.e. recently released) package. P. PS. There are no doubts about XML. It is one of the great achievements of the human intellect. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Thu Nov 12 00:41:18 1998 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:06:26 2004 Subject: Sun XML Lib References: <19981111201248.7097.qmail@hotmail.com> Message-ID: <364A2D74.FFF8FF0F@eng.sun.com> XML Doubts wrote: > > ....main(String args[]){ > .... > XmlDocument doc; (presumably, "doc" got assigned ... !) > Element root = doc.getDocumentElement(); > System.out.println("\nRoot: "+root.getTagName()); > System.out.println("\nFirstChild:"+ > (root.getFirstChild()).getNodeName()); > ... > } > The first print statement prints the root element, but the second print > statement prints #text which is not what I want. I want to print the > node name. The node is text, and the DOM spec says that its name is "#text" ... I'd suspect it's a text node with whitespace (a blank line) if the input is your next example. Try printing the names of each child, and you'll probably see "#text", "line", "#text". The next release (no, I won't say when!) should make it easier to ignore whitespace that you really don't want to see, but which the XML spec requires you to be able to see. (Like that blank line.) 
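The suggestion above (print the name of every child) can be sketched with the standard org.w3c.dom interfaces. This is a hedged illustration using the JAXP parser bundled with current JDKs, not Sun's early-access XmlDocument class from the original question; the poem markup is reconstructed from the discussion, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class PoemChildrenSketch {
    // Assumed input: a root element with one child on its own line.
    static final String POEM = "<poem>\n<line>Roses are red,</line>\n</poem>";

    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)))
            .getDocumentElement();
    }

    /** Node names of all children, whitespace text nodes included. */
    static List<String> childNames(Element parent) {
        List<String> names = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            names.add(n.getNodeName());
        }
        return names;
    }

    /** First child that is an element, skipping text and other node types. */
    static Element firstElementChild(Element parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(POEM);
        System.out.println(childNames(root));
        Element line = firstElementChild(root);
        // The text of <line> is itself a child text node of the element.
        System.out.println(line.getTagName() + ": " + line.getFirstChild().getNodeValue());
    }
}
```

Checking the node type rather than counting siblings keeps the code working whether or not the document has line breaks between its tags.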
> Like for example in the XML file > > > Roses are red, > > > The first print statement prints "poem" > The second print statement prints #text. > I want "line" to be printed. You'll need to get the first node that's an element, rather than the first node ... since the first node is whitespace. > And also how do I get the text inside the "line" tag. Get the (element) node whose name is "line". Get its children. > This might be trivial question but I am just a beginner. So bear with > me. > > Can anyone help. These questions relate to DOM -- you'd surely have the same questions with any DOM implementation! Sun's package has no tutorial as yet, but there are probably some on the network somewhere. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Thu Nov 12 02:01:41 1998 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:06:26 2004 Subject: XML and IE5 beta PR2 Message-ID: <199811120156.RAA25112@boethius.eng.sun.com> [Paul Spencer:] | I agree that the current IE5 implementation of XSL only does tag | transformation. However, unless I have misunderstood CSS, I don't think | your conclusion follows. | | I have used IE5 XSL to do reasonably complex formatting of XML | documents, including taking elements out of order, and displaying images | (URL in the XML) with hyperlinks to documents (also with their URL in | the XML). With a bit of judicious pre-processing of the XML using the | DOM, I also display another image a number of times according to a | number in the XML document. That's not complex. 
Complex is multiple columns, interleaved column sets with multiple text flows, footnote zones, synchronized marginalia, math formatting, mixed vertical and horizontal writing directions -- in other words, the stuff you need in order to do genuinely internationalized automated print publishing. The goal is to have a single language that, once mastered, can be used for any kind of formatting -- a language powerful enough to support the high-quality automated layout of any amount of material that conforms to a given DTD or document schema and modular enough to share the bulk of a complex stylesheet across both print and online versions. Only when you have a single stylesheet language that can replace the proprietary style and layout formats of programs like Quark Express, FrameMaker, PageMaker, and Word will you be able to achieve completely functional and transparent document interchange across applications. And only when you have a language equally capable of supporting both print and online display will you be able to build the common training infrastructure that can create the shared set of human resources needed for media-independent publishing in the next century. That language must be primarily declarative, so that stylesheets can be interchanged between different interactive stylesheet editors, and it must be able to scale from a one-off memo on up to the Yellow Pages, the New York Times, and the L. L. Bean catalogue -- both the online and print versions. A language that supports only the production of HTML can't do that. Jon xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ckthum at krdl.org.sg Thu Nov 12 02:44:06 1998
From: ckthum at krdl.org.sg (Thum Ching Kuan)
Date: Mon Jun 7 17:06:26 2004
Subject: Sun XML Lib
In-Reply-To: <19981111201321.2085.qmail@hotmail.com>
Message-ID:

> Hi,
>
> Has any one used the Sun XML library. My question is how to access the
> child nodes from the root.
> I have tried
>
> ....main(String args[]){
> ....
> XmlDocument doc;
> Element root = doc.getDocumentElement();
> System.out.println("\nRoot: "+root.getTagName());
> System.out.println("\nFirstChild:"+
> (root.getFirstChild()).getNodeName());
> ...
> }
> The first print statement prints the root element, but the second print
> statement prints #text which is not what I want. I want to print the
> node name.

You can use TreeWalker in the Sun XML lib to access a node by its name. If you want to traverse the DOM tree, you have to handle the white space. Try to implement a recursive method that prints out all the child nodes with their values; take a look at the IBM XML4J example.

For your case, try

  Node sibling = root.getFirstChild();   // the whitespace #text node
  sibling = sibling.getNextSibling();    // the "line" element
  System.out.println(sibling.getFirstChild().getNodeValue());

You should be able to see "Roses are red," now.

cheers,
Thum

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Thu Nov 12 03:06:30 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:26 2004 Subject: ID attribute defaults In-Reply-To: <3.0.5.32.19981111144155.010ea4d0@nexus.polaris.net> References: <3.0.32.19981111091752.00b4fdf0@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981111204941.009bc9d0@amati.techno.com> At 02:41 PM 11/11/98 -0500, John E. Simpson wrote: >What am I missing? Is there some way both to use ID/IDREF *and* not require >the document author to supply IDs on her own? It's my personal feeling that the inability to provide defaults for ID, IDREF(S), and ENTITY(IES) attributes is a bug (in SGML unavoidably inherited by XML). Document type designers should be able choose for themselves what does and doesn't make sense in their applications. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Thu Nov 12 03:06:32 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:26 2004 Subject: XLink - where are we? [tiny amount of frustration] In-Reply-To: <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> References: <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981111210459.009ba510@amati.techno.com> At 09:03 PM 11/11/98, Peter Murray-Rust wrote: >I am probably tilting at windmills but I am attempting something like this >in JUMBO - I have about 5 approaches to styles without stylesheets. >(a) redisplay it as raw XML. Not as silly as it sounds for many documents. >(b) pretty-print it and display as XML. Extremely useful for many documents >(c) Reformat start tags as bold and add NL after PCDATA. Works pretty well >for many documents. >(d) map every element onto a Java class. >(e) allow the user to customise some or all elements with styles. I shall >use Swing for the rendering. Then, I suppose I could write the styles out >as XSL if anyone cares. I'm so focused on formatted presentation of the *content* I didn't think about these options. The first three sound quite reasonable. 
The last two are no different from requiring a style sheet--someone still has to provide a per-document or document type definition of what the styling should be--whether you use Java code or DSSSL specs or XSL doesn't matter--the task is the same.

>- Yes I know there is movement in this area. I'd also like to see *some*
>movement on 'behaviour' - how do we create an interactive document rather
>than simply decide on the best way to send it to the printer (which will be
>99% of the use of XSL).

There is the ISMID (Interchange Standard for Modifiable Interactive Documents) that is being developed in SC34 of ISO/IEC JTC1. Its goal is to be a relatively simple application that lets you define more or less procedural behaviors for information objects (which could be XML documents or components thereof). It's still being firmed up, but it seems reasonably promising. I know the editors would certainly be interested in review and comment. We've been reviewing the latest draft this week during the ISO meeting. You can find the latest draft at . As a reviewer, I've asked the editors to do a few things that will make the spec a bit more XML friendly.

Cheers,

Eliot
--
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Nov 12 04:58:13 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:06:27 2004 Subject: Endtag attributes References: <36474321.DE4EA5A3@toolsmiths.se> Message-ID: <364A665A.EA3B428D@technologist.com> Lars Marius Garshol wrote: > > The only pro I can think of is pretty far-fetched: > > - might be easier to generate markup from some kinds of applications, > since attribute values would not have to be determined until after > the element contents had been written out I think that optional redundant end-tag attributes would be useful, for the same reason that the redundant end-tag GI (which should be optional) is useful:
10,000 lines of text
10,000 lines of text
10,000 lines of text
Paul Prescod - http://itrc.uwaterloo.ca/~papresco At today's pop doubling rates, in 100 years there will be 20 billion people, more than enough to fill the earth. In 300 years, we will have filled up 16 earth-sized planets (roughly, our solar system). In 2300 years we will have filled up 200 billion earth-sized planets (roughly, our galaxy). Only one technology can save us: birth control. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Nov 12 05:01:55 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:06:27 2004 Subject: Full Disclosure: What The XML Processor Must Tell The XML Application References: <3641FCE4.61D46D15@locke.ccil.org> Message-ID: <364A666C.4D938DD4@technologist.com> John Cowan wrote: > > 10. An XML processor (necessarily a non-validating one) that does > not include the replacement text of an external parsed entity > in place of an entity reference must notify the application that > it recognized but did not read the entity (4.4.3) [SAX does not > provide for this] I think David agrees with me that XML processors do not have enough information to make the decision about whether entity references should be expanded or not. If an author wants an inter-entity reference that can be expanded "lazily", they should use XLink and describe the behaviour they want in a stylesheet. Paul Prescod - http://itrc.uwaterloo.ca/~papresco "The new revolutionaries believe the time has come for an aggressive move against our oppressors. We have established a solid beachhead on Friday. We now intend to fight vigorously for 'casual Thursdays.' 
-- who says America's revolutionary spirit is dead? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
From jjc at jclark.com Thu Nov 12 05:54:52 1998 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:06:27 2004 Subject: XML and IE5 beta PR2 References: <000001be0c0b$e89672a0$020110ac@office> <36473C9A.2E09D648@eng.sun.com> Message-ID: <364A77D3.22BF85E7@jclark.com> David Brownell wrote: > > Paul Spencer wrote: > > > > Mind you, I don't think MS is correctly implementing the August 18 > > [XSL] draft, but that is another matter, and I could be wrong. > > With "xsl:eval", "xsl:script" and many other proprietary extensions, > yet without even "xsl:process-children", I think you're clearly right. > W3C's XSL pattern syntax is there, but not a lot else. In fairness to Microsoft, I think it should be pointed out that some of the differences between what Microsoft has implemented and the Aug 18 draft correspond to changes that have been decided by the XSL WG since the Aug 18 draft. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Thu Nov 12 07:56:24 1998 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 17:06:27 2004 Subject: Stylesheets considered limiting (was Re: XLink - where are we?)
In-Reply-To: <3.0.5.32.19981111210459.009ba510@amati.techno.com>
References: <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID: <3.0.1.16.19981112085340.36af5dac@pop3.demon.co.uk>

At 21:04 11/11/98 -0600, W. Eliot Kimber wrote:
>I'm so focused on formatted presentation of the *content* I didn't think
>about these options. The first three sound quite reasonable. The last two

Thanks :-). I think there is a "feature" of the XML community that many of the founders are heavily focussed on conventional paper-based formatting. Jon Bosak's recent mail puts that very clearly. The design features of Hybrick are also clearly geared towards that - it looks like a non-trivial task for me to adapt it to chemical reactions, for example. [I may be wrong, but without source and the very restrictive license I can't judge].

Nothing wrong with this view. People (even me) want to read paper. But I certainly feel that almost everything else is seen as lower priority than the holy grail of high-performance text formatting. And the policy makers in the W3C community are very much oriented towards paper-like operations. 2 years since the conception of XML and we cannot (with general agreement on interoperability):
- send a hyperlink over the wire
- send a button over the wire
- send a date over the wire
- send a structured graphic over the wire

We *can* send a mathematical equation.

This may sound unfair and - yes, I know - there are many things in the pipeline but IMO the innovation that we saw 18 months ago is disappearing. Essentially the vision of the XML process is often something like:

server-side:
original data -> XML ->+XSL-> output format (e.g. PDF/HTML/PS)
PDF/HTML is sent over the wire.
The PS is printed and put in the mail. client-side: reader receives a dumb document Yes, I know that there are XML-aware browsers and client side stylesheets but the purpose of what Jon was describing is often so they can send it straight to their local printer. >are no different from requiring a style sheet--someone still has to provide >a per-document or document type definition of what the styling should >be--whether you use Jave code or DSSSL specs or XSL doesn't matter--the >task is the same. Nope. In Java the task for me is: - to identify how to map Java classes onto elements. I haven't followed XSL but I assume it gives little or no help. - how to create different on-line displays interactively. I'd like to insert buttons where there are hyperlinks, for example. Is all that in XSL. My concern about the word "stylesheet" is that it restricts our vision of what is possible. And that the excitement of the whole XML effort will disappear. Yes - I know XML is meant to be boring as many people have said and sending pseudo-paper over the wire is not - yet - very exciting. I am making the assumptions that a large number of XML documents will never have stylesheets. Very few do at the moment. So JUMBO was designed with the idea that we will have to cope with documents **that do not have stylesheets**. The current mainstream view - as exemplified by Hybrick - seems to be that such a document is broken. I don't accept this :-) Maybe I am just a niche player but it's a very large and exciting niche and anyone who is interested is welcome to play :-) > >>- Yes I know there is movement in this area. I'd also like to see *some* >>movement on 'behaviour' - how to we create an interactive document rather >>than simply decide on the best way to send it to the printer (which will be >>99% of the use of XSL). > >There is the ISMID (Interchange Standard for Modifiable Interactive >Documents) that is being developed in SC34 of ISO/IEC JTC1. 
It's goal is to >be a relatively simple application that lets you define more or less >procedural behaviors for information objects (which could be XML documents >or components thereof). It's still being firmed up, but it seems reasonably >promissing. I know the editors would certainly be interested in review and >comment. We've been reviewing the latest draft this week during the ISO >meeting. You can find the latest draft at >. As a reviewer, I've asked >the editors to do a few things that will make the spec a bit more XML >friendly. Does this imply that it's not XML-compatible at present? Because I suspect people will get increasingly frustrated with SGML-over-the-wire (e.g. Hybrick at present) because they don't know how to manage declarations and catalogs. P. > >Cheers, > >Eliot >-- >
>W. Eliot Kimber, Senior Consulting SGML Engineer >ISOGEN International Corp. >2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 >www.isogen.com >
>
>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From RJA at arpsolutions.demon.co.uk Thu Nov 12 08:21:57 1998
From: RJA at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun 7 17:06:27 2004
Subject: Oracle XML
Message-ID: <012c01be0e15$91bae7c0$c5010180@p197>

>Agreed - the SAXogenists on this list can take a warm feeling away.

SAX is great. I've now got an ActiveX version and a C++ version going and I really love it. (Not 100% yet.)

For interest, have there been any developments of a 'lower level' SAX to expose DTD info too? In my XML toolkit I want to expose events that enable the caller to use information from the DTD however *they* want. I will of course provide a high level interface too, but it would be nice to have a stream of events like:

    startDocType
      startEntity
        startElementDef
        endElementDef
        startAttribDef
          startAttribElementSeq
          endAttribElementSeq
          startAttribElementSeq
          endAttribElementSeq
        endAttribDef
      endEntity
    endDoctype

You get the basic idea.

Regards,

Richard.
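A rough Java sketch of the DTD event stream suggested above may make the idea concrete. Everything here is an assumption made for illustration: the interface `DtdEventHandler`, its parameters, and the `replay` stand-in for a parser are hypothetical, only a subset of the listed events is covered, and none of it is part of SAX.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback interface; method names follow the event list above. */
interface DtdEventHandler {
    void startDocType(String rootName);
    void startElementDef(String elementName);
    void endElementDef(String elementName);
    void startAttribDef(String elementName, String attribName);
    void endAttribDef(String elementName, String attribName);
    void endDocType();
}

/** Records each event as a string so the caller can inspect the stream. */
class RecordingDtdHandler implements DtdEventHandler {
    final List<String> events = new ArrayList<>();

    public void startDocType(String rootName) { events.add("startDocType " + rootName); }
    public void startElementDef(String name) { events.add("startElementDef " + name); }
    public void endElementDef(String name) { events.add("endElementDef " + name); }
    public void startAttribDef(String element, String attrib) {
        events.add("startAttribDef " + element + "/" + attrib);
    }
    public void endAttribDef(String element, String attrib) {
        events.add("endAttribDef " + element + "/" + attrib);
    }
    public void endDocType() { events.add("endDocType"); }
}

class DtdEventDemo {
    /** Stand-in for a parser: fires the events a DTD declaring one
     *  element ("para") with one attribute ("id") might produce. */
    static List<String> replay() {
        RecordingDtdHandler handler = new RecordingDtdHandler();
        handler.startDocType("doc");
        handler.startElementDef("para");
        handler.endElementDef("para");
        handler.startAttribDef("para", "id");
        handler.endAttribDef("para", "id");
        handler.endDocType();
        return handler.events;
    }
}
```

The caller decides what to do with each declaration, which is the point of the proposal: the toolkit reports DTD content as events and leaves interpretation to the application.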
*********************************************** * E-Mail mailto:RJA@arpsolutions.demon.co.uk * * WEB http://www.arpsolutions.demon.co.uk * *********************************************** xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From alberto.reggiori at jrc.it Thu Nov 12 08:27:46 1998 From: alberto.reggiori at jrc.it (Alberto Reggiori) Date: Mon Jun 7 17:06:27 2004 Subject: XML for Network Topology References: Message-ID: <364A9BEA.E9E84459@jrc.it> Joel Bender wrote: > > I'm interested in (1) a DTD for networking, specifically to describe a > topology of networks, routers, ports, etc. If one doesn't already exist, > contact me and we'll form a splinter group to create one (my guess would be > to start with SNMP data and go from there). > Have a look at http://www.dmtf.org/cim/cim_xml/index.html They are working at a meta-meta-model level ;-) regards Alberto xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From graham.moore at dpsl.co.uk Thu Nov 12 09:08:15 1998 From: graham.moore at dpsl.co.uk (Graham Moore) Date: Mon Jun 7 17:06:27 2004 Subject: XLink - where are we? 
[tiny amount o Message-ID: > Eliot wrote >The last two [java class for elements / swing classes] are no different from requiring a style sheet--someone still has to > provide > a per-document or document type definition of what the styling should > be The functional binding / class => element could be done as a default. Look in the location where the document was acquired from using the element name as the class name. It would just work, with no additional configuration files. If no class exists then the node is just a node. cheers, graham. gdm@dpsl.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From M.H.Kay at eng.icl.co.uk Thu Nov 12 09:48:44 1998 From: M.H.Kay at eng.icl.co.uk (Michael Kay) Date: Mon Jun 7 17:06:27 2004 Subject: XLink - where are we? [tiny amount of frustration] Message-ID: <00c201be0e21$af6209a0$1b11e391@mhklaptop.bra01.icl.co.uk> >- Yes I know there is movement in this area. I'd also like to see *some* >movement on 'behaviour' - how to we create an interactive document rather >than simply decide on the best way to send it to the printer (which will be >99% of the use of XSL). And this reflects a lingering concern I have about XLink, which is that it is putting display-time behaviour into the XML document rather than into the stylesheet. I'm not convinced that XLink is defining relationships at a high enough level of abstraction, and I'd prefer to see work on interactive behaviour happen in the XSL world. In particular, I am really uncomfortable with the one-to-one mapping of stored XML documents to "units of display". 
I think the presentation facilities (stylesheets and hyperlink browsing) should be independent of the granularity of storage. I tried to illustrate this concept with my HTML rendition of the New Testament http://www.wokchorsoc.freeserve.co.uk/bible-nt/index.html ) where one XML document is rendered as many HTML documents within a navigable frameset. Here of course there are no XLink's at all: the document is purely hierarchic, and the interactive behaviour is inferred from the intrinsic structure of the XML. I don't see who in the XSL / XLink world is trying to make such a rendition easy to define. Conceptually, I'm sure it can be done by using frames as flow objects: there is, of course, the little problem of the "unit of download" if you want to do it client-side. (I did it, of course, with a server-side SAXON application). To put things another way, if I'm going to have to pre-process my corpus by splitting it into lots of linked page-sized chunks to make it browsable, I might as well render those chunks in HTML while I'm about it. Ralph - it would be nice to see what Hytime can achieve with the New Testament example. Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Philippe.Le_Hegaret at sophia.inria.fr Thu Nov 12 09:54:22 1998 From: Philippe.Le_Hegaret at sophia.inria.fr (Philippe Le Hégaret) Date: Mon Jun 7 17:06:27 2004 Subject: Oracle XML References: Message-ID: <364AB030.EFAC2ADE@sophia.inria.fr> K. Travis Walsh wrote: > > Oracle officially anounced today it's XML strategy. > > For those of you who didn't know there is a java parser built into this > new release of Oracle 8i. 
Like netscape, They can't write well-formed XML document. is not a well-formed XML document. http://www.oracle.com/xml/documents/xml_twp/ (Figure 2) Philippe. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Thu Nov 12 09:57:53 1998 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:06:27 2004 Subject: XLink - where are we? [tiny amount o References: > Message-ID: <364AAE1E.84704C93@eng.sun.com> Graham Moore wrote: > > Eliot wrote > > > The last two [java class for elements / swing classes] are no different > > from requiring a style sheet--someone still has to provide > > a per-document or document type definition of what the styling should > > be > > The functional binding / class => element could be done as a default. Look > in the location where the document was acquired from using the element name > as the class name. It would just work, with no additional configuration > files. If no class exists then the node is just a node. That doesn't seem sufficient to me. What's the package name? What about classes that should represent multiple element types? What about the different semantics associated with different namespaces? Suppose you want the element and class names to be different, perhaps because you and your users work with different natural languages? (And I'll confess to having deferred the homework to finding out exactly how XML names and Java names differ! I'd expect them not to be the same.) The model I'm working with right now does include a mapping of name to class, but it's not direct (e.g. can be many to one) and is aware of namespaces. It can be statically configured (e.g. 
an XML element containing mappings -- embedded, or in a separate document) as well as algorithmically configured (i.e. a factory class).

- Dave

From graham.moore at dpsl.co.uk Thu Nov 12 11:18:14 1998
From: graham.moore at dpsl.co.uk (Graham Moore)
Date: Mon Jun 7 17:06:27 2004
Subject: Back to XObjects was XLink - where a
Message-ID: >

- Dave wrote

>>>>>>>>>
That doesn't seem sufficient to me. What's the package name? What about classes that should represent multiple element types? What about the different semantics associated with different namespaces? Suppose you want the element and class names to be different, perhaps because you and your users work with different natural languages? (And I'll confess to having deferred the homework to finding out exactly how XML names and Java names differ! I'd expect them not to be the same.)

The model I'm working with right now does include a mapping of name to class, but it's not direct (e.g. can be many to one) and is aware of namespaces. It can be statically configured (e.g. an XML element containing mappings -- embedded, or in a separate document) as well as algorithmically configured (i.e. a factory class).
<<<<<<<<<<<<<

It's completely insufficient. But it is a base level at which things will work. The package name can be ignored, as classes don't have to be in a package (bad idea, but allowed), or a default could be used.

The model I have working uses XML config files to specify the binding and allows the specification of more powerful delegation structures than simply inheriting from the w3c.node.
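The two positions in this exchange (a default binding by element name, and an explicit, namespace-aware, possibly many-to-one mapping) can be combined in a few lines of Java. This is only a sketch under assumptions: `ElementBinder`, `GenericNode`, `Molecule` and the namespace URI are invented for illustration and do not describe either poster's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Fallback used when no class is bound: "the node is just a node". */
class GenericNode {
    String describe() { return "generic node"; }
}

/** Example element class (hypothetical). */
class Molecule extends GenericNode {
    @Override
    String describe() { return "molecule"; }
}

class ElementBinder {
    // Explicit bindings, keyed by "{namespace}localName"; may be many-to-one.
    private final Map<String, Class<? extends GenericNode>> explicit = new HashMap<>();
    private final String defaultPackage;

    ElementBinder(String defaultPackage) {
        this.defaultPackage = defaultPackage;
    }

    void bind(String namespaceUri, String localName, Class<? extends GenericNode> cls) {
        explicit.put("{" + namespaceUri + "}" + localName, cls);
    }

    /** Explicit binding first, then the element name as class name, then GenericNode. */
    GenericNode create(String namespaceUri, String localName) {
        Class<? extends GenericNode> cls = explicit.get("{" + namespaceUri + "}" + localName);
        if (cls == null) {
            cls = lookupByName(localName);
        }
        try {
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            return new GenericNode();
        }
    }

    // Default rule: use the element name as the class name in the default package.
    private Class<? extends GenericNode> lookupByName(String localName) {
        String className = defaultPackage.isEmpty() ? localName : defaultPackage + "." + localName;
        try {
            return Class.forName(className).asSubclass(GenericNode.class);
        } catch (ClassNotFoundException | ClassCastException | LinkageError e) {
            return GenericNode.class; // no usable class: the node is just a node
        }
    }
}
```

With `bind("urn:example:chem", "molecule", Molecule.class)` and `bind("urn:example:chem", "mol", Molecule.class)`, two element types share one class, while an unknown element with no matching class falls back to `GenericNode`. The name-based default does nothing about XML names that are not legal Java identifiers, which is one of the open questions raised above.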
I guess my point really comes back to having a variety of interfaces on the domBuilder that allow for as much or as little binding configuration as required. I think we need to pick up the XObjects discussion on the domBuilder interfaces. If this sounds like a good idea I'll sift through the contributions of interfaces / part interfaces and present those as a starting point? graham. gdm@dpsl.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Thu Nov 12 12:38:23 1998 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:06:27 2004 Subject: XLink - where are we? [tiny amount o References: > <364AAE1E.84704C93@eng.sun.com> Message-ID: <364AD6E0.E7B489A2@mecomnet.de> David Brownell wrote: > > Graham Moore wrote: > > > > Eliot wrote > > > > > The last two [java class for elements / swing classes] are no different > > > from requiring a style sheet--someone still has to provide > > > a per-document or document type definition of what the styling should > > > be > > > > The functional binding / class => element could be done as a default. Look > > in the location where the document was acquired from using the element name > > as the class name. It would just work, with no additional configuration > > files. If no class exists then the node is just a node. > > That doesn't seem sufficient to me. What's the package name? a mapping from the namespace name. (java packages may require an intermediate) > What about classes that should represent multiple element types? specialize the class appropriately > What about the different semantics associated with different namespaces? 
that's what packages are for. > Suppose you want the element and class names to be different, perhaps because > you and your users work with different natural languages? support an architectural attribute, with appropriate defaults in the attribute definitions. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Thu Nov 12 14:31:12 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:27 2004 Subject: XLink - where are we? [tiny amount of frustration] In-Reply-To: <00c201be0e21$af6209a0$1b11e391@mhklaptop.bra01.icl.co.uk> Message-ID: <3.0.5.32.19981112082831.008ce3f0@amati.techno.com> At 09:49 AM 11/12/98 -0000, Michael Kay wrote: >To put things another way, if I'm going to have to >pre-process my corpus by splitting it into lots of linked >page-sized chunks to make it browsable, I might as well >render those chunks in HTML while I'm about it. > >Ralph - it would be nice to see what Hytime can achieve with >the New Testament example. >From a HyTime standpoint, it's purely a matter of presentation style and browser implementation. For example, the HyBrowse HyTime browser from TechnoTeacher was implemented so that whatever you link to is always presented in isolation whenever you traverse to it. So if you link to a particular chapter, that's all you'll see when you follow the link. But this behavior is not required (or even suggested) by the HyTime standard--it's a decision the implementors of HyBrowse made, for whatever reason. They could just as easily have made the opposite decision, or, like DynaText, given you a way to define the rendering scope in the style sheet. 
The communication between client and server in a generalized hypermedia environment cannot be standardized to the degree that a given link or form of address defines or implies a particular communication sequence. The most you can do is provide a way for clients to communicate requirements to servers and provide a common *abstraction* for the data being communicated so that the appropriate amount of data can be transmitted, regardless of whether or not it happens to be a single XML document. Groves are one such abstraction.

The "fragment interchange" group is trying to solve the same problem by enabling the transmission of syntactic data chunks that are not complete documents. While this is a useful practical shortcut, it cannot replace the transmission of document abstractions in the general case (this is because there are some processes that require knowledge of or access to the entire document in order to complete rendition, although it's unlikely these types of processes will be used for Web-based applications very often).

So, HyTime can't offer any solution to the problem *by itself*, although it provides a pretty solid framework in which a solution could be defined. But the solution has to come from client and server providers because it is ultimately a problem of client-to-server communication, not hyperlink and address representation in documents (which is all HyTime and XLink talk about).

Cheers,

E.
--
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Thu Nov 12 14:31:14 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:28 2004 Subject: Stylesheets considered limiting (was Re: XLink - where are we?) In-Reply-To: <3.0.1.16.19981112085340.36af5dac@pop3.demon.co.uk> References: <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981112081552.0092d600@amati.techno.com> At 08:53 AM 11/12/98, Peter Murray-Rust wrote: >>are no different from requiring a style sheet--someone still has to provide >>a per-document or document type definition of what the styling should >>be--whether you use Jave code or DSSSL specs or XSL doesn't matter--the >>task is the same. > >Nope. In Java the task for me is: > - to identify how to map Java classes onto elements. I haven't followed >XSL but I assume it gives little or no help. > - how to create different on-line displays interactively. I'd like to >insert buttons where there are hyperlinks, for example. Is all that in XSL. But given that your browser provides a way to do the association, the task is to *create the Java classes*, which is conceptually identical to creating a style rule in a DSSSL or XSL specification, although the details of the specification are of course different. 
Style sheets are nothing more than a mostly-declarative form of program.

Here's how I would insert a "button" into the display of a document using DSSSL and HyBrick:

    (element simple ; Rule for any XLink element
      (make link
        destination: (resolve-location-address
                       (attribute-string "HREF" (current-node)))
        (make external-graphic
          entity-system-id: "buttons/clickhere.gif"
          notation-system-id: "")))

With Java, I might have a button class that I map elements of type "simple" to, but the result would be indistinguishable from the above (all other things being equal).

Providing a sexier type of button would require support for it within the presentation system. For example, say my online DSSSL renderer supports program objects as a notation for external graphics. I could do this:

    (element hyperlink
      (make link
        destination: (resolve-location-address
                       (attribute-string "HREF" (current-node)))
        (make external-graphic
          entity-system-id: "{BACD-0000-0000-BCDEF-00}"
          notation-system-id: "WINCOM")))

Now my button will be a program object rather than a static graphic, but notice that the style definition has not otherwise changed--I've just taken advantage of some facility of my presentation system.

Another alternative would be to define a new flow object class that provides the interactive features you want. Say, for example, you want a button that will appear to push. We (as a community or as a set of developers) could define a new flow object class called, for example, "interactive-button":

    (element hyperlink
      (make link
        destination: (resolve-location-address
                       (attribute-string "HREF" (current-node)))
        (make interactive-button
          behavior: 'push ; Alternatives might be rollover, flash, etc.
          image-system-id: "buttons/clickhere.gif")))

Now we've abstracted the idea of "button" in a way that can be reasonably interchanged and implemented in a variety of ways. The degree to which it is interchangeable is the degree to which different providers of DSSSL-based rendering systems implement it.
I've used DSSSL syntax because that's what I know, but the same approach should apply to XSL as well (at least to the degree that XSL supports the definition of new flow object classes, which it may or may not, I don't know). Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
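For comparison, the "interactive-button" rule in the message above could look roughly like this if the rule were written in Java rather than DSSSL. This is a minimal sketch, not any browser's API: `View`, `ButtonView`, `PlainView`, `StyleRules` and the rendered string format are all invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Something the presentation system knows how to display (hypothetical). */
interface View {
    String render();
}

/** Java counterpart of the hypothetical "interactive-button" flow object. */
class ButtonView implements View {
    private final String destination;
    private final String behavior;
    private final String image;

    ButtonView(String destination, String behavior, String image) {
        this.destination = destination;
        this.behavior = behavior;
        this.image = image;
    }

    public String render() {
        return "[button " + behavior + " " + image + " -> " + destination + "]";
    }
}

/** Default presentation for element types that have no rule. */
class PlainView implements View {
    public String render() { return "[plain]"; }
}

/** A rule table: element type name to a function from attributes to a View. */
class StyleRules {
    private final Map<String, Function<Map<String, String>, View>> rules = new HashMap<>();

    void rule(String elementType, Function<Map<String, String>, View> rule) {
        rules.put(elementType, rule);
    }

    View apply(String elementType, Map<String, String> attributes) {
        Function<Map<String, String>, View> rule = rules.get(elementType);
        return rule == null ? new PlainView() : rule.apply(attributes);
    }
}

class StyleRulesDemo {
    /** One rule, mirroring (element hyperlink (make link ... (make interactive-button ...))). */
    static StyleRules build() {
        StyleRules rules = new StyleRules();
        rules.rule("hyperlink",
            attrs -> new ButtonView(attrs.get("HREF"), "push", "buttons/clickhere.gif"));
        return rules;
    }
}
```

The rule table plays the role of the style sheet and `ButtonView` the role of the flow object class, which is the sense in which writing the Java classes is the same task as writing the style rules.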
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Thu Nov 12 14:31:17 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:28 2004 Subject: XLink - where are we? [tiny amount o In-Reply-To: > Message-ID: <3.0.5.32.19981112081655.0089bbf0@amati.techno.com> At 08:56 AM 11/12/98 +0000, Graham Moore wrote: >Eliot wrote > >>The last two [java class for elements / swing classes] are no different >from requiring a style sheet--someone still has to >> provide >> a per-document or document type definition of what the styling should >> be > >The functional binding / class => element could be done as a default. Look >in the location where the document was acquired from using the element name >as the class name. It would just work, with no additional configuration >files. If no class exists then the node is just a node. My point is that *someone has to write the classes*. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Thu Nov 12 14:34:40 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:28 2004 Subject: Stylesheets considered limiting (was Re: XLink - where are we?) In-Reply-To: <3.0.1.16.19981112085340.36af5dac@pop3.demon.co.uk> References: <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981112075504.008d5b10@amati.techno.com> At 08:53 AM 11/12/98, Peter Murray-Rust wrote: >Does this imply that it's not XML-compatible at present? Because I suspect >people will get increasingly frustrated with SGML-over-the-wire (e.g. >Hybrick at present) because they don't know how to manage declarations and >catalogs. It's perfectly XML compatible, but the draft as printed provides architectural DTD declarations that assume the use of typical SGML features (i.e., markup minimization) and does not provide an alternative set of declarations that restricts itself to those SGML features used by XML. 
Obviously anyone could create those declarations (as I did for HyTime in developing my PHyLIS tool), but there's no reason for the editors not to provide them directly given that it's an obvious requirement (and in fact, it should have always been the policy that two forms of architectural declarations be provided, one for SGML declarations that allow markup minimization and one for SGML declarations that do not, which is the key difference {whether or not start and end tag parameters are prohibited}).

The concepts defined in the standard are independent of syntax. If you're simply implementing the semantics but not using true architectural processing, then you don't care about the details of the declaration syntax of the architectural meta-DTDs, using them simply as documentation and not as processable data sets.

Remember that when defining any sort of SGML application (that is, an application of SGML, not software that processes SGML), the *ONLY* thing that would make it not usable with XML documents would be the requirement that documents use an SGML feature that XML does not provide. As data attributes are the only feature of SGML not provided by XML that also has significant *semantic* utility (as opposed to being a syntactic convenience like markup minimization or the LINK feature), you'd have to go out of your way to make your application not XML compatible. And in any case, you can get by without data attributes, although it's clumsy and suboptimal.

Cheers,

E.
--
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Nov 12 14:42:55 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:06:28 2004 Subject: Stylesheets considered limiting (was Re: XLink - where are we?) References: <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.1.16.19981112085340.36af5dac@pop3.demon.co.uk> Message-ID: <364AEBA5.C08FED15@technologist.com> Peter Murray-Rust wrote: > > Nope. In Java the task for me is: > - to identify how to map Java classes onto elements. I haven't followed > XSL but I assume it gives little or no help. > - how to create different on-line displays interactively. I'd like to > insert buttons where there are hyperlinks, for example. Is all that in XSL. > >... > > I am making the assumptions that a large number of XML documents will > never have stylesheets. Very few do at the moment. So JUMBO was designed > with the idea that we will have to cope with documents **that do not have > stylesheets**. If you find a way to map elements to Java classes so that Java classes provide behaviour for elements, then you will have *made a stylesheet* and you will have used Java as your stylesheet language. > The current mainstream view - as exemplified by Hybrick - > seems to be that such a document is broken. I don't accept this :-) I'm not sure that I would call HyBrick "mainstream". 
It is cool, but it doesn't represent the Microsoft/Netscape establishment's view of how the world should work. Anyhow, HyBrick's view is that if HyBrick cannot be taught to process the document (through DSSSL, HyBrick's stylesheet/extension language) then the document is useless for use in HyBrick. Likely the same holds true for JUMBO. All you can do without some kind of functional specification ("stylesheet") is display an undifferentiated tree view (or, with RDF, a more arbitrary graph view). If that's all you want to do, you don't need stylesheets. HyBrick clearly isn't intended to be a tree viewer, however.

> Does this imply that it's not XML-compatible at present? Because I suspect
> people will get increasingly frustrated with SGML-over-the-wire (e.g.
> Hybrick at present) because they don't know how to manage declarations and
> catalogs.

Although I don't know anything about ISMID, I presume from the sentence below that ISMID doesn't exist yet:

> >There is the ISMID (Interchange Standard for Modifiable Interactive
> >Documents) that is being developed in SC34 of ISO/IEC JTC1. It's goal is to

The current *draft* is not XML friendly, I guess. Most likely there will be no tools for people to get frustrated with until there is a standard, and that standard will most likely be XML friendly.

Paul Prescod - http://itrc.uwaterloo.ca/~papresco

At today's pop doubling rates, in 100 years there will be 20 billion people, more than enough to fill the earth. In 300 years, we will have filled up 16 earth-sized planets (roughly, our solar system). In 2300 years we will have filled up 200 billion earth-sized planets (roughly, our galaxy). Only one technology can save us: birth control.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Nov 12 14:53:19 1998 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 17:06:28 2004 Subject: Stylesheets considered limiting (was Re: XLink - where are we?) References: <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.1.16.19981112085340.36af5dac@pop3.demon.co.uk> Message-ID: <364AEEFB.7BC122B0@technologist.com> Peter Murray-Rust wrote: > > Nothing wrong with this view. People (even me) want to read paper. But I > certainly feel that almost everything else is seen as lower priority than > the holy grail of high-performance text formatting. And the policy makers > in the W3C community are very much oriented towards paper-like operations. I'm not sure where you get this impression. Let's consider the specs that have been completed since XML: Document Object Model (DOM) Level 1 Synchronized Multimedia Integration Language (SMIL) 1.0 Specification PICS Signed Labels (DSig) 1.0 Specification Cascading Style Sheets, level 2 (CSS2) Specification Mathematical Markup Language (MathML) 1.0 Specification Of these, only one even specifically *addresses* the needs of paper-like operation (CSS). And CSS is for HTML, not XSL. Another is useful for print only as a side effect (MathML). The rest have essentially nothing to do with print, other than through their application to all of XML. 

> 2 years since the conception of XML and we cannot (with general agreement
> on interoperability):
> - send a hyperlink over the wire
> - send a button over the wire
> - send a date over the wire
> - send a structured graphic over the wire
>
> We *can* send a mathematical equation.

You forget that using only existing standards, we also cannot send an XML
paragraph, or a bolded word in a way that can display in two browsers. We
can downtranslate to HTML, but you can do that with hyperlinks and buttons
also.

> This may sound unfair and - yes, I know - there are many things in the
> pipeline but IMO the innovation that we saw 18 months ago is disappearing.

My impression is the opposite. Two years ago, we were arguing about how to
handle whitespace.

> Yes, I know that there are XML-aware browsers and client side stylesheets
> but the purpose of what Jon was describing is often so they can send it
> straight to their local printer.

I don't understand your point. Why would Jon go to all of the effort of
pushing the XML effort to re-solve problems that he had solved three years
ago? I'd be surprised if Sun is even going to bother disrupting existing
systems for "XML compliance" in the near future. Nobody in the XML world is
trying to recreate SGML and DSSSL. Why would we bother?

 Paul Prescod - http://itrc.uwaterloo.ca/~papresco

At today's pop doubling rates, in 100 years there will be 20 billion
people, more than enough to fill the earth. In 300 years, we will have
filled up 16 earth-sized planets (roughly, our solar system). In 2300 years
we will have filled up 200 billion earth-sized planets (roughly, our
galaxy). Only one technology can save us: birth control.

From sroth at radsys.com Thu Nov 12 15:00:12 1998
From: sroth at radsys.com (Roth, Scott)
Date: Mon Jun 7 17:06:28 2004
Subject: Tools for creating Style Sheets???
Message-ID: <5FAFB2A5D7B2D111ACEA0060972027CE186386@RADSYS_EXCH>

Does anyone recommend a tool for creating style sheets or know of any good
tutorials to create them?? I am getting ready to learn how to make them,
and I would like to know the best way to accomplish it.

Thanks
Scott Roth

From dante at mstirling.gsfc.nasa.gov Thu Nov 12 15:05:54 1998
From: dante at mstirling.gsfc.nasa.gov (Dante Lee)
Date: Mon Jun 7 17:06:28 2004
Subject: Inputting in image
Message-ID:

Does anybody know how exactly you input an image into an XML document? Do
you use the LOGO tag?

Dante M. Lee
Code 588
NASA/GSFC
Greenbelt MD 20771
Voice = 301-521-1077
Bldg = 23 Rm = W415
Email = dante@mstirling.gsfc.nasa.gov
        dante4@hotmail.com

From laurie.mann at ansys.com Thu Nov 12 15:14:09 1998
From: laurie.mann at ansys.com (Laurie Mann)
Date: Mon Jun 7 17:06:28 2004
Subject: Tools for creating Style Sheets???
Message-ID: <7D70697C0E38D111B4FF080036B39A038C72A7@ntdevexc.ansys.com>

> From: Roth, Scott [SMTP:sroth@radsys.com]

>Does anyone recommend a tool for creating style sheets or know of any good
>tutorials to create them??

I looked at a bunch of style-sheet related sites when I was learning how to
build them, and found the following site to be absolutely invaluable:

http://www.w3.org/Style/CSS/Test/current/index.html

These other sites are pretty useful too:

http://home2.swipnet.se/~w-20547/faqs/ciwas-mFAQ.html
    [FAQ v1.70] Welcome to comp.infosystems.www.authoring.stylesheets
http://www.w3.org/TR/REC-CSS1
    Cascading Style Sheets, level 1
http://wdvl.internet.com/Authoring/Style/Sheets/Tutorial.html
    The WDVL: Introduction to Style Sheets
http://www.w3.org/Style
    Web Style Sheets
http://www.builder.com/Authoring/CSS/table.html
    CSS reference table
http://server.htmlhelp.org/reference/css/stylesheets-now.html
    Style Sheets Now
http://style.webreview.com/glossary.html
    CSS Resource Guide -- CSS1 Glossary
http://www.htmlgoodies.com/ie_style.html#what
    HTMLGoodies - style
http://www.zeldman.com/faq_styles.html
    Ask Doctor Web About Style Sheets
http://webreview.com/97/05/30/feature/tutorial.html
    What Can You Do With Style Sheets
http://www.hwg.org/resources/faqs/cssFAQ.html
    CSS Frequently Asked Questions

Once you get the hang of CSS1, take a look at the future of stylesheets
with CSS2:

http://www.w3.org/TR/REC-CSS2
    Cascading Style Sheets, Level 2
http://www.w3.org/TR/WD-CSS2/page.html
    CSS2- Paged media

From sroth at radsys.com Thu Nov 12 15:30:23 1998
From: sroth at radsys.com (Roth, Scott)
Date: Mon Jun 7 17:06:28 2004
Subject: Good Demos of XML?
Message-ID: <5FAFB2A5D7B2D111ACEA0060972027CE186388@RADSYS_EXCH>

OK, here I go with another favor...don't worry folks, once I get up to
speed on this XML stuff I will contribute more, I promise. Now I am
wondering if anybody has some good demos for XML. I have seen AgentSoft's
XML searcher and I know about Microsoft's Auction and the stuff on XMLu,
but are there any other ones out there??

thanks
Scott Roth

From david at megginson.com Thu Nov 12 15:33:55 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:28 2004
Subject: SAX Future (was Re: Oracle XML)
In-Reply-To: <012c01be0e15$91bae7c0$c5010180@p197>
References: <012c01be0e15$91bae7c0$c5010180@p197>
Message-ID: <13898.64310.340243.918644@localhost.localdomain>

Richard Anderson writes:

 > For interest, have there been any developments of a 'lower level'
 > SAX to expose DTD info too ?

We've talked about it -- I need to be convinced that there is a big enough
demand for that (it would be useful only for very specialised
applications).
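
For context on what "DTD info" SAX 1.0 already reports: its DTDHandler
interface covers only notation declarations and unparsed entity
declarations, and nothing about element or attribute declarations. The
following is a minimal sketch of that, assuming Python's xml.sax port of
SAX rather than the Java interfaces discussed on this list. The sample
document, the DeclRecorder class and the collect_dtd_events helper are
made up for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Made-up sample: one notation and one unparsed entity declared in the
# internal DTD subset.
SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc/>
"""


class DeclRecorder(DTDHandler):
    """Records the two kinds of DTD event that the DTDHandler defines."""

    def __init__(self):
        self.events = []

    def notationDecl(self, name, publicId, systemId):
        self.events.append(("notation", name, publicId, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.events.append(
            ("unparsed-entity", name, publicId, systemId, ndata))


def collect_dtd_events(data):
    """Parse the bytes in `data` and return the DTD events that were seen."""
    parser = xml.sax.make_parser()
    recorder = DeclRecorder()
    parser.setContentHandler(ContentHandler())  # ignore document content
    parser.setDTDHandler(recorder)
    parser.parse(io.BytesIO(data))
    return recorder.events


events = collect_dtd_events(SAMPLE)
for event in events:
    print(event)
```

Anything beyond these two callbacks (content models, attribute lists,
parsed entities) is what a lower-level interface would have to add.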

I think that the following for SAX 1.0 are probably higher priorities:

1. A proper specification for SAX (JavaDoc just doesn't cut it).

2. A test suite for SAX conformance.

3. Canonical SAX interfaces for Perl (on top of XML::Parser) and C++.

After that, we need to decide if the benefits of a new version of SAX
outweigh the enormous compatibility problems that will result. If we decide
that a new version is worthwhile, here's what I think will be most
important:

1. Better control/querying of parser features
   (validating/non-validating, includes/does not include external
   parameter/general entities, provides/does not provide locators,
   etc.).

2. A canonical interface for SAX filters (probably based on Cowan's
   work).

3. A bit more core library functionality (including constructing an
   absolute URL from a file name).

4. A new (optional) handler type for lexical information that can
   optionally be included in the DOM, such as comments and the
   document type declaration. This will have to track the XML-Infoset
   work, and parsers would not be required to support it.

This would all be convenient, but is it worth outdating all existing SAX
implementations just to add it? Parser writers, for example, may have to
deal with hundreds more bug reports (each) because of people using the
wrong version of the SAX interface.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Thu Nov 12 15:42:14 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:28 2004
Subject: Inputting in image
In-Reply-To:
References:
Message-ID: <13899.20.109979.162731@localhost.localdomain>

Dante Lee writes:

 > Does anybody know how exactly you input an image into an XML
 > document? Do you use the LOGO tag?

Sure. Of course, you could also use an IMG element, a BILD element, a
FOOBAR element, an HERESAPICTURETHATIWANTTOINCLUDEINMYDOCUMENT element, or
anything else.

XML is a way of defining document types (or vocabularies, if you prefer),
just like IP is a way of exchanging packets and ASCII is a way of
representing simple English-language text electronically. You really want
to know how to include an image in a *specific* XML document type -- pick a
document type and read its user documentation, or invent your own and
choose any element type name you want.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/

From rbourret at ito.tu-darmstadt.de Thu Nov 12 15:47:17 1998
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:06:28 2004
Subject: SAX Future (was Re: Oracle XML)
Message-ID: <01BE0E5B.6AB4F710@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> Richard Anderson writes:
>
> > For interest, have there been any developments of a 'lower level'
> > SAX to expose DTD info too ?
>
> We've talked about it -- I need to be convinced that there is a big
> enough demand for that (it would be useful only for very specialised
> applications).

Note also that this is one of the problems schema languages were designed
to solve. If you have an XML-based schema language, then all that is needed
is a parser that parses DTDs and fires SAX events in the schema language
and you're done.

For most applications, this is an adequate solution. It won't work for all
applications because: (a) the schema language might not duplicate all the
capabilities of the DTD and (b) the order in which the schema events are
fired might not be the same as the order in which the DTD declarations
occur.

-- Ron Bourret

From simonstl at simonstl.com Thu Nov 12 15:50:04 1998
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:06:28 2004
Subject: Open-source XLink library (was Re: XLink - where are we?)
In-Reply-To: <3.0.1.16.19981111205328.370f5328@pop3.demon.co.uk>
References: <199811111622.LAA12163@hesketh.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca>
Message-ID: <199811121547.KAA01339@hesketh.com>

At 08:53 PM 11/11/98 +0000, Peter Murray-Rust wrote:
>Many thanks. I haven't had time to look at this but this seems like a
>useful way of exploring what XLink can do although it's not an API or a
>library. Keep it going.
>
>I think we shall also need some XLink APIs and/or library routines to help
>with some of the processing required...
>
> P.
>

Actually, it is a library, or will be. The demos were one-offs, but the
XLinkFilter, LinkSet, and Link classes are definitely library code - and
will hopefully be worthy of real use fairly soon. I'm building demos around
them, which is definitely clarifying a lot of the functions the library
will need. It's fun to take a week away from books...

(http://www.simonstl.com/projects/xlinkfilter/)

Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer
Cookies / Sharing Bandwidth (November)
Building XML Applications (December)
http://www.simonstl.com

From dante at mstirling.gsfc.nasa.gov Thu Nov 12 16:17:16 1998
From: dante at mstirling.gsfc.nasa.gov (Dante Lee)
Date: Mon Jun 7 17:06:28 2004
Subject: Dante Lee
Message-ID:

Do you input the actual image tag into the xsl or the xml document, or is
it the HTML document??

Dante M. Lee
Code 588
NASA/GSFC
Greenbelt MD 20771
Voice = 301-521-1077
Bldg = 23 Rm = W415
Email = dante@mstirling.gsfc.nasa.gov
        dante4@hotmail.com

From mda at discerning.com Thu Nov 12 17:40:49 1998
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun 7 17:06:28 2004
Subject: using xml/xsl/dom together
Message-ID: <026101be0e61$f2eae710$0200a8c0@mdaxke.mediacity.com>

related to the "where are we" thread, i'm wondering if there is enough in
xml/xsl/dom to have a web page where a user can click on their desired
appearance, and the page re-arranges, suppresses undesired content, and
perhaps changes style? all on the client side? (i mean in theory based on
specs).
furthermore, could the xml have links inside, and based on the chosen
style, the UA would decide to load and embed (or not) the linked-to
resources?

this capability roughly happens all the time on web sites (user
"personalization"), but it isn't interactive on the page, and requires a
roundtrip to the server every time to go fetch some pre-formatted html.

i've read the 3 specifications, but i'm still at a loss how this (to me)
relatively simple application could be built. if there is an example
somewhere that shows this, just point me at it.

i realize the real-world client-side implementations aren't quite there yet
(does ie5 have enough?), but i'm unclear even how we are *supposed* to tie
these things together, particularly when my xml documents will typically be
split into several underlying files, none of which will have a specified
style sheet.

and do i have to wait for action/behavior sheets before there is an easy
way to combine my scripting of the DOM with xsl?

thanks.

-mda

From laurie.mann at ansys.com Thu Nov 12 17:58:20 1998
From: laurie.mann at ansys.com (Laurie Mann)
Date: Mon Jun 7 17:06:29 2004
Subject: using xml/xsl/dom together
Message-ID: <7D70697C0E38D111B4FF080036B39A038C72AC@ntdevexc.ansys.com>

> From: Mark D. Anderson [SMTP:mda@discerning.com]

>related to the "where are we" thread, i'm wondering if there
>is enough in xml/xsl/dom to have a web page where a user can
>click on their desired appearance, and the page re-arranges, suppresses
>undesired content, and perhaps changes style?
>all on the client side? (i mean in theory based on specs).
>furthermore, could the xml have links inside, and based on the
>chosen style, the UA would decide to load and embed (or not)
>the linked-to resources?

From a tech writer's standpoint, this would be really ideal. We're looking
at moving to XML (we're currently in Interleaf), and have produced most of
our documents in HTML as well as in hardcopy during the last release. But
it would be a GREAT thing to be able to, once the documents are properly
tagged in XML, put all docs on a Web site and/or CD-ROM, have the user
specify which products and platforms they are using, and have documents
"create themselves."

I know there are some very expensive XML-based products out there that will
let the AUTHOR run scripts to make docs on the fly, but, frankly, it would
be even better to have the capability to let the USER customize the final
doc/Web site for their own needs.

From david at megginson.com Thu Nov 12 18:00:04 1998
From: david at megginson.com (david@megginson.com)
Date: Mon Jun 7 17:06:29 2004
Subject: Images in XML
In-Reply-To:
References:
Message-ID: <13899.8432.580054.613626@localhost.localdomain>

Dante Lee writes:

 > Do you input the actual image tag into the xsl or the xml document,
 > or is it the HTML document??

Yes.

More seriously, how do you want to do it? You could actually dump a
base64-encoded copy of the image right into the XML, you could reference an
image externally, or you could add the image using XSL. With XPointer, you
can even point to an IMG element in an HTML document.
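
Two of the options just listed, embedding a base64 copy and referencing an
external file, can be sketched as follows. This is only an illustration
using Python's standard library. The element names report and picture, the
attributes encoding and href, and the file name logo.png are all made up
here, since XML itself prescribes none of them, and the "image" is just the
eight-byte PNG signature standing in for real picture data.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for the contents of an image file: the 8-byte PNG signature.
image_bytes = b"\x89PNG\r\n\x1a\n"

doc = ET.Element("report")

# Option 1: embed a base64-encoded copy of the image in the document.
embedded = ET.SubElement(doc, "picture", {"encoding": "base64"})
embedded.text = base64.b64encode(image_bytes).decode("ascii")

# Option 2: keep the image outside the document and refer to it by name.
ET.SubElement(doc, "picture", {"href": "logo.png"})

xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)

# Reading it back: the embedded copy decodes to the original bytes.
roundtrip = ET.fromstring(xml_text)
decoded = base64.b64decode(roundtrip[0].text)
external_name = roundtrip[1].get("href")
```

Either form is well-formed XML; what a receiving application does with the
picture element is up to that application.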

Of course, you'll need software that understands what you're doing, but as
far as XML is concerned all of the solutions are valid.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/

From db at Eng.Sun.COM Thu Nov 12 18:05:36 1998
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:06:29 2004
Subject: Back to XObjects was XLink - where a
References: > <364AAE1E.84704C93@eng.sun.com> <364AD6E0.E7B489A2@mecomnet.de>
Message-ID: <364B211B.A19A5EA4@eng.sun.com>

James --

I see you accept the point that the problem isn't that simple. However,
there's also the implied one that different folk need different answers.

The answers I would have given to those questions are not the ones you
gave. Different assumptions can lead to a need for different answers. A
base mechanism must be a bit more flexible than that.

- Dave

james anderson wrote:
>
> David Brownell wrote:
> >
> > Graham Moore wrote:
> > >
> > > Eliot wrote
> > >
> > > > The last two [java class for elements / swing classes] are no different
> > > > from requiring a style sheet--someone still has to provide
> > > > a per-document or document type definition of what the styling should
> > > > be
> > >
> > > The functional binding / class => element could be done as a default. Look
> > > in the location where the document was acquired from using the element name
> > > as the class name. It would just work, with no additional configuration
> > > files. If no class exists then the node is just a node.
> >
> > That doesn't seem sufficient to me. What's the package name?
>
> a mapping from the namespace name.
(java packages may require an intermediate)

Assumes there's a namespace name.

> > What about classes that should represent multiple element types?
>
> specialize the class appropriately

Assumes creating a class is zero cost, and doesn't accommodate open ended
sets of element types.

> > What about the different semantics associated with different namespaces?
>
> that's what packages are for.

Assumes folk don't have common semantics in different namespaces.

> > Suppose you want the element and class names to be different, perhaps because
> > you and your users work with different natural languages?
>
> support an architectural attribute, with appropriate defaults in the
> attribute definitions.

Assumes the folk defining the mappings are the ones defining the
attributes.

From db at Eng.Sun.COM Thu Nov 12 18:32:19 1998
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:06:29 2004
Subject: Back to XObjects was XLink - where a
References: >
Message-ID: <364B26B1.8A6AC5B@eng.sun.com>

Graham Moore wrote:
>
> > That doesn't seem sufficient to me. ...
>
> Its completely insufficient. But it is a base level ...

Right; many folk have reinvented that level! Nobody, I think, denies that
the _model_ is accepted. What I'm saying is that now that's been agreed,
it's time to provide a framework for answering questions such as the ones I
posed.

> The model I have working uses XML config files to specify the binding and
> allows the specification of more powerful delegation structures than simply
> inheriting from the w3c.node.
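
The base-level binding discussed in this thread (the element name selects a
class, and an element with no matching class stays a plain node) can be
sketched roughly as below. This is an illustration only, written in Python
rather than the Java under discussion. The Node and Button classes, the
BINDINGS table and the bind function are hypothetical names, and the
explicit table merely stands in for whatever lookup or configuration
mechanism a real system would use.

```python
import xml.etree.ElementTree as ET


class Node:
    """Fallback wrapper: an element with no specific class is just a node."""

    def __init__(self, element):
        self.element = element

    def describe(self):
        return "plain node <%s>" % self.element.tag


class Button(Node):
    """Hypothetical class bound to <button> elements."""

    def describe(self):
        return "button labelled %r" % self.element.get("label")


# Hypothetical binding table: element type name -> class.
BINDINGS = {"button": Button}


def bind(element):
    """Wrap an element in its bound class, or in Node if none is bound."""
    cls = BINDINGS.get(element.tag, Node)
    return cls(element)


doc = ET.fromstring('<form><button label="OK"/><note/></form>')
bound = [bind(child) for child in doc]
descriptions = [obj.describe() for obj in bound]
for line in descriptions:
    print(line)
```

The open questions raised above (package names, namespaces, one class for
several element types, differing element and class names) are exactly the
cases a fixed table like this does not answer.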

I've got an API driven approach, which could be driven based on config
files using an XML syntax like the one I suggested a while back. But the
API can also be driven in other ways, too. ("Element Factory", an interface
that can be implemented in many ways. I'll post more later.)

Some delegation support is needed, I'd agree. Everyone seems to focus on
subclassing though (implying a standard DOM implementation, not just
interface). I suppose at some level these differ only slightly:

    MyXObject foo = (MyXObject) node;
    MyXObject bar = (MyXObject) node.getUserObject ();

The main differences I see are that (a) "bar" (delegation) costs more in
terms of memory, and (b) "bar" won't naturally have the entire DOM tree as
context, without establishing some bidirectional linking policy much like
the parent/child one that DOM exposed only partially.

Also, (c) there is no real likelihood of eliminating large chunks of the
DOM tree without letting the XObjects ("XML Beans") replace implementations
of DOM methods.

- Dave

From xml at globalrobotics.com Thu Nov 12 19:39:10 1998
From: xml at globalrobotics.com (warren)
Date: Mon Jun 7 17:06:30 2004
Subject: using xml/xsl/dom together
In-Reply-To: <026101be0e61$f2eae710$0200a8c0@mdaxke.mediacity.com>
Message-ID: <000c01be0e74$b82fd410$76bbbcc0@cybertron>

Hi,

[IMHO]... I am a technology consultant for an innovative web solutions
company. Currently we are defining a road map to an extensible standards
platform (xml java etc).

You are correct in the functionality the user requires and from examining
the specs we believe that this is definitely possible; however, what is
currently lacking is a functional product... a browser (scriptable
operating system)!

We believe that we require something along the lines of... a WYSIWYG xsl
editor/displayer, retrieving xml data by xql. The editor [DOM] would then
parse & validate and via namespaces allow custom written java components to
render or compute the tag/elements.

This allows developers to create full component based web applications,
page-masters to create full print quality & graphics & multimedia web
content, while at the same time allowing a user to customize their stock
quotes into an easily understandable presentation.

But why wait for MS to implement this into IE5 etc. Simply an application
to leverage these standards is all we want... let independent solution
developers do the rest..

There are several other areas that we are investigating such as message
delivery and interprocess communication. If you are in the same boat as I
am, I would like to hear your opinions.

regards,
Warren.

From arthur.rother at ovidius.com Thu Nov 12 19:41:33 1998
From: arthur.rother at ovidius.com (Arthur Rother)
Date: Mon Jun 7 17:06:30 2004
Subject: Endtag attributes
In-Reply-To: <364A665A.EA3B428D@technologist.com>
References: <36474321.DE4EA5A3@toolsmiths.se>
Message-ID: <3.0.6.32.19981112143944.007aa230@onepine.com>

Isn't there a much simpler solution, one that avoids end tag attributes
with just one pass over the input document? The way to go may be to use
dynamically generated entities.

When creating the output stream, generate a header like:

    %attrdecl; ]>

Then, for each element needing an endtag attribute 'attr', do for each
starttag encountered

Last thing, for each endtag encountered, corresponding to the starttag
found, do: (pseudo c command)

    writeto("attrdecl.ent","\n");

where value is the value you originally wanted to assign to the endtag
attribute.

The entity for the attribute name has to be unique of course. This is
achieved by using some kind of a tumbler id in the entity name. One could
of course also use a running counter.

In the next processor, the entities are resolved.

-arthur

From Suli.Ding at geis.ge.com Thu Nov 12 19:59:10 1998
From: Suli.Ding at geis.ge.com (Ding, Suli (GEIS))
Date: Mon Jun 7 17:06:30 2004
Subject: Doc2Xml
Message-ID:

Richard,

You are right. The version of the program you have can't do that. I have
updated the HTML page at

http://www.geocities.com/SiliconValley/Platform/4871/

The zip files are updated also. You have to get a copy of the program to
make it work the way you want.

Best regards,

Suli

> ----------
> From: Sargeant, Richard (GEIS)
> Sent: Wednesday, November 11, 1998 9:24 AM
> To: Ding, Suli (GEIS)
> Subject: Doc2Xml
>
> Hi Suli,
>
> I have been trying your Doc2Xml program and have a question. How do I
> create new tag pairs inside other tag pairs ?
>
> For Example:
> My input file is an Edifact document and the sender & receiver fields
> contain qualifiers...
>
>
> UNB+UNOA:1+RICHARD:ZZ+SARGEANT:ZZ+971029:0847+REF00032++*ANY'UNH+1+ORDERS.
> ..etc
>
> If I run this via your example table the results looks like...
>
>
>
> RICHARD
> ZZ
> SARGEANT
> ZZ
> REF00032
>
> *ANY
>
> However, what I would like to output is...
>
>
>
>
> RICHARD
> ZZ
>
>
> SARGEANT
> ZZ
>
> REF00032
>
> *ANY
>
> I have tried various combinations without success. Is this possible ?
>
> regards
> Richard Sargeant
>
> e GE Information Services__________________
> Prof.E.M.Meijerslaan 1
> 1183 AV Amstelveen
> Netherlands
>
> Tel: +31 20 503 5576 Dialcomm: 381 5576
> Fax: +31 20 640 3825 Dialcomm: 550 3825
> Mobile: +31 653 930239
> internal web site: http://luggage.is.ge.com
> __________________________________________
>
>

From cowan at locke.ccil.org Thu Nov 12 20:02:19 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:30 2004
Subject: Open-source licensing (was: XLink library)
References: <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <199811111622.LAA12163@hesketh.com>
Message-ID: <364B3EC6.2CCEACA5@locke.ccil.org>

Simon St.Laurent wrote:

> The library is and will be open source, though at this point I haven't
> firmly decided on a license. (GPL is likely, in some slightly modified
> form.)

I suggest that you avoid the GPL for components, as GPLed components can
only be used to build GPLed applications. Alternatives are:

    the LGPL at http://www.gnu.ai.mit.edu/copyleft/lgpl.html;

    the MPL at http://www.mozilla.org/NPL/MPL-1.0.html;

    the BSD license (aka "James Clark" license) at
    http://www.opensource.org/bsd-license.html;

    the public domain.

I favor the last two options for large and small components respectively. For information and texts, see http://www.opensource.org .

Also note that modified versions of the GPL are *forbidden* by its own copyright notice: the GPL is not itself under the GPL, but under a license that says "anyone may copy, but changes are forbidden." That is to prevent the proliferation of minutely different versions of the GPL.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Thu Nov 12 20:06:34 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:30 2004
Subject: ID attribute defaults
References: <3.0.32.19981111091752.00b4fdf0@pop.intergate.bc.ca> <3.0.5.32.19981111204941.009bc9d0@amati.techno.com>
Message-ID: <364B3FD6.42DFE866@locke.ccil.org>

W. Eliot Kimber wrote:

> It's my personal feeling that the inability to provide defaults for ID,
> IDREF(S), and ENTITY(IES) attributes is a bug (in SGML unavoidably
> inherited by XML).

There is no evidence for the presence of such a constraint in XML; the relevant clause (3.3.2) is silent. Clause 3.3.1 does require that ID attributes have a default of either #IMPLIED or #REQUIRED, but no restrictions are placed on IDREF, IDREFS, ENTITY, or ENTITIES, so they may have default values in XML.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From tbray at textuality.com Thu Nov 12 20:16:55 1998
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:06:30 2004
Subject: Open-source licensing (was: XLink library)
Message-ID: <3.0.32.19981112121149.0205ac80@pop.intergate.bc.ca>

At 03:02 PM 11/12/98 -0500, John Cowan wrote:
>I suggest that you avoid the GPL for components, as GPLed components
>can only be used to build GPLed applications. Alternatives are:
...
> the MPL at http://www.mozilla.org/NPL/MPL-1.0.html;
...

The Mozilla guys put an *immense* amount of work into this, and got input from all the open source icons (Linus, Larry, Eric, etc). James seems happy with it too.

-Tim

From cowan at locke.ccil.org Thu Nov 12 20:16:55 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:30 2004
Subject: Full Disclosure: What The XML Processor Must Tell The XML Application
References: <3641FCE4.61D46D15@locke.ccil.org> <364A6634.4E060F06@technologist.com>
Message-ID: <364B429F.2BA10592@locke.ccil.org>

Paul Prescod scripsit:

> I think David agrees with me that XML processors do not have enough
> information to make the decision about whether entity references should be
> expanded or not.

What about an XML processor that (as a part of its design) never expands external entity references? For example, one might want to have an ultrafast XML processor that works only from a string input source and can therefore operate in embedded environments. It cannot have a SAX interface as SAX is currently defined without violating XML 1.0.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From mda at discerning.com Thu Nov 12 20:44:28 1998
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun 7 17:06:30 2004
Subject: using xml/xsl/dom together
Message-ID: <035f01be0e7c$099229a0$0200a8c0@mdaxke.mediacity.com>

> But why wait for MS to implement this into IE5 etc. Simply an application
>to leverage these standards is all we want... let independent solution
>developers do the rest..

I don't want to wait, i want to start programming towards what is going to be available on the client. By "client", i mean one that will have implemented the emergent standards, whether it be IE, mozilla, or opera. I'm *not* interested in something requiring some other ISV's technology on the client-side.

There may be some space for the numerous "intranet-in-a-box" startups that stick java or activeX in the browser to compensate for lacking standards or standards-compliance.
But i want my architecture to have broad reach, and that means that on the client i only want to rely on the capabilities implicit in a fully standard-compliant browser, whenever that happens.

My problem is that i can't see how it is even *supposed* to work, even if there were a standards-compliant browser. The conceptual model in the xsl/xml specs seems to be that there is one xml document and it specifies a style sheet embedded in it. That is pretty far off from what i have in mind.

similarly, i don't quite see how xsl/dom/action-or-behavior-sheets all come together.

-mda

From cowan at locke.ccil.org Thu Nov 12 21:22:06 1998
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:06:30 2004
Subject: Stylesheets considered limiting (was Re: XLink - where are we?)
References: <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.5.32.19981112075504.008d5b10@amati.techno.com>
Message-ID: <364B51E5.E83031DC@locke.ccil.org>

W. Eliot Kimber scripsit:

> [D]ata
> entities is the only feature of SGML not provided by XML that also has
> significant *semantic* utility (as opposed to being a syntactic convenience
> like markup minimization or the LINK feature) [...]

And can you enlighten those of us who don't know SGML about the semantics of data entities? Thanks.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From b.laforge at jxml.com Thu Nov 12 21:44:49 1998
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:06:31 2004
Subject: Back to XObjects was XLink - where a
Message-ID: <002d01be0e86$0fb550a0$ab026982@thing1.camb.opengroup.org>

David and all,

> MyXObject foo = (MyXObject) node;
> MyXObject bar = (MyXObject) node.getUserObject ();

Wrapping a user object with a DOM element is a liberating concept, especially as it lets you use things like Swing components for the base class, the DOM tree becoming, in effect, a bean box.

There are then two alternatives:

1. Exposing the wrapper element to the user object. Handy for some things where the user object needs to navigate the DOM tree, but quite restrictive then on the choice of possible user objects.

2. Providing the data and other user object interconnections. This is where meta data has its real strength--there is simply too much code to write with an api.

Paul Rabin and I have been working on an expansion of the bindings document to encompass the meta data needed to provide both the data mapping and support arbitrary connections (add/North, addTag/"View", addListener/java.awt.ActionListener, etc.).

One of the nice things to fall out is BindingSets. A BindingSet object supports the DomBuilder interface, but has an associated set of Bindings documents to support various types of documents.
There is closure, so that all the Bindings of a set belong only to that one set. (With a minor exception for subsets.)

The "big deal" about binding sets is that you can now determine which set of bindings to use when one document references another--by default, you just look at the bindings that were used with the current document and go with the associated bindings set.

Bill

From Denis_Haskin at iacnet.com Thu Nov 12 22:06:01 1998
From: Denis_Haskin at iacnet.com (Denis_Haskin@iacnet.com)
Date: Mon Jun 7 17:06:31 2004
Subject: XSchema/DCD to DTD?
Message-ID: <852566BA.0078A5E6.00@circus.med.iacnet.com>

(I was sorely tempted to cross-post to XSL-List, but given that I (hope) XML-Dev is a superset of XSL-List, I will refrain for the time being...)

I could have sworn I saw a recent post about an XSchema-to-DTD converter, but I am unable to find any mention of such a thing in either the XML-Dev or XSL-List archives.

Ideally, what I am really looking for is an XSL stylesheet to transform a DCD into a DTD, and XSchema (the latter is optional). I can put one together myself, but I suspect I'm not the first person to want to do this.

(Why? Well, I would prefer to keep my document definition in DCD (XSchema would be okay, too, but I just happened to start with DCD first) but it needs to be consumed by people who are more comfortable with DTDs.)

Thanks for any pointers,

dwh

From Denis_Haskin at iacnet.com Thu Nov 12 22:23:17 1998
From: Denis_Haskin at iacnet.com (Denis_Haskin@iacnet.com)
Date: Mon Jun 7 17:06:31 2004
Subject: XSchema/DCD to DTD?
Message-ID: <852566BA.007A3E6E.00@circus.med.iacnet.com>

Ignore my recent question ("an XSL stylesheet to transform XSchema/DCD to DTD"). Re-reading things a bit closer in the XSL spec, it's not clear to me it can be done with XSL. -sigh-

Thanks,

dwh

From tyler at infinet.com Thu Nov 12 22:34:38 1998
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:06:31 2004
Subject: Oracle XML
References: <364AB030.EFAC2ADE@sophia.inria.fr>
Message-ID: <364B6296.3598F440@infinet.com>

"Philippe Le Hégaret" wrote:

> K. Travis Walsh wrote:
> >
> > Oracle officially anounced today it's XML strategy.
> >
> > For those of you who didn't know there is a java parser built into this
> > new release of Oracle 8i.
>
> Like netscape, They can't write well-formed XML document.
>
> is not a well-formed XML document.
>

Hey give them a break. Forgetting that little question mark is something everyone has done from time to time when handcrafting XML documents. I know, cause I have done this many times.
If XML did not inherit SGML's syntax, I bet we would all have a much cleaner syntax to work with, with fewer user errors as a result.

Tyler

From ralph at fsc.fujitsu.com Thu Nov 12 22:47:51 1998
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:06:31 2004
Subject: Defining Behavior Was: Back to XObjects was XLink
In-Reply-To: <00c201be0e21$af6209a0$1b11e391@mhklaptop.bra01.icl.co.uk>
Message-ID: <3.0.5.32.19981113054340.009f7b80@pophost.fsc.fujitsu.com>

At 09:49 AM 11/12/98 -0000, Mike Kay wrote:
>
>And this reflects a lingering concern I have about XLink,
>which is that it is putting display-time behaviour into the
>XML document rather than into the stylesheet. I'm not
>convinced that XLink is defining relationships at a high
>enough level of abstraction, and I'd prefer to see work on
>interactive behaviour happen in the XSL world.

A long standing issue, still under discussion. In HyBrick, behavior depends on a *combination* of DTD and stylesheet information. This follows from the use of DSSSL-online, which defines "make link" and "make external graphic" procedures. In the original demo version of HyBrick (which was not publicly distributed) the "make link" procedure was used to create the equivalent of HTML A-type elements. In the current release, it has been adapted for the locators in xml:link-type elements, except that the HTTP sub-system isn't currently enabled for link-type elements, so links only work with the local file system. The "make external graphic" procedure was (and is) used to create the equivalent of HTML IMG-type elements.

The element types themselves of course have to be declared in the DTD, and their attributes assigned the proper declared values. Note that the "make link" and "make external graphic" procedures also require that the attribute values be declared. Of course, the attribute declared values in the DTD and stylesheet *should* match. The several people I've talked to about this have all stated that enforcing this match is an "application" level issue.

Another factor to be aware of though is that the "document groups" defined in the current XLink draft affect link "visibility" - which can have a subtle - and confusing - influence on behavior. I would call this another aspect to the "high enough level of abstraction" that you referred to. Current HTML users are used to a "go to" model of linking. Hypertext experts have always known that we can do better. The issue is, pre-Web hypertext systems were built, whatever the ultimate aspirations of their designers, as *closed* systems. Even in a closed system, applying even what's currently defined in the XLink/XPointer specs can get interesting. Applying these concepts to the open environment of the Web is going to get even more interesting. That's why it's important for more people to start experimenting with HyBrick, so they can get more insight into these issues first hand.

>
>In particular, I am really uncomfortable with the one-to-one
>mapping of stored XML documents to "units of display". I
>think the presentation facilities (stylesheets and hyperlink
>browsing) should be independent of the granularity of
>storage.
>
>To put things another way, if I'm going to have to
>pre-process my corpus by splitting it into lots of linked
>page-sized chunks to make it browsable, I might as well
>render those chunks in HTML while I'm about it.

Agreed. In particular, with XPointer you should be able to retrieve "fragments" rather than having to retrieve entire documents. This day will come.

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation

From WalterS at ile.com Fri Nov 13 00:27:39 1998
From: WalterS at ile.com (Walter Smith)
Date: Mon Jun 7 17:06:31 2004
Subject: XML and Internationalization...
Message-ID: <19B8452E57A0D111B04100805F6F312C0198BC73@email.ile.com>

> -----Original Message-----
> From: Deke Smith [mailto:deke@tallent.com]
> Sent: Monday, November 09, 1998 11:45 AM
> To: xml-dev@ic.ac.uk
> Subject: Re: XML and Internationalization...
>
>
>
> Here's my question:
>
> As I understand it, TMX is a format for translation
> "dictionaries" -- or lists of equivalent words, phrases,
> sentences or paragraphs in different languages. TMX also
> allows the preservation of formating within phrases, such as
> boldface, italic, etc.
>
> I always judge tools by what *I* need from them and that is
> what I need from TMX. Is it meant to do more than what I have
> asked it to do? Is this "dictionary" concept something TMX is
> *meant* for?
>
> I am under the impression that TMX can also have embedded
> "macros" within phrases. By "macro", I mean processing
> commands that may be understood only by a specific scripting
> language. Am I right?

We served as technical chair of the group of localization companies that participated in the creation of the TMX format. As Tony said in an earlier email, it was conceived as a Translation Memory Exchange format. In the translation/localization biz, we use these translation memory tools (really nothing more than bi-text databases) to capture prior translation effort and reuse it wherever applicable.

Most of the data/file types that localization companies traditionally encounter are _not_ native SGML/XML. As such, vendors are left to their own devices to decide how to process the plethora of proprietary formats (resource files, DTP files, etc.) to efficiently access the embedded translatable text. Needless to say, everyone has come up with very different ways of doing this. The TMX format simply seeks to provide a pragmatic way of exchanging TM data among disparate environments, and really nothing more.

Another translation-related tag set is the OpenTag format (http://www.opentag.org), which we launched over a year ago and have been collaborating on with others in the localization industry. It specifically seeks to provide a common method to mark up textual data that is extracted from functional/presentational/structural markup for the purposes of language translation or any sort of NLP activity. The OpenTag schema was specifically designed to abstract source differences at the element level, while disambiguating context issues at the attribute level. Again, the data types we're mainly dealing with are primarily everything but MLs. It's flexible enough to handle just about any data you can parse and extract from your source environment, and there are even elements and attributes that can be employed to introduce additional information into the data.

While it wasn't originally conceived to be a tag set for data creation, you may find the flexibility you're looking for. You would then get the added bonus that your data would already be ready for processing by any OpenTag-aware translation tools and environments.

Cheers,
Walter

----------------------------------------------
Walter L. Smith (walters@ile.com)
Emerging Technology Analyst
International Language Engineering Corporation
5700 Flatiron Parkway
Boulder, CO 80301
303-245-7584 (vox)
303-596-7343 (cel)
303-245-7973 (fax)

From eliot at dns.isogen.com Fri Nov 13 05:50:19 1998
From: eliot at dns.isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:06:31 2004
Subject: Stylesheets considered limiting (was Re: XLink - where are we?)
In-Reply-To: <364B51E5.E83031DC@locke.ccil.org>
References: <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.5.32.19981112075504.008d5b10@amati.techno.com>
Message-ID: <3.0.5.32.19981112234611.0098ecd0@amati.techno.com>

At 04:23 PM 11/12/98 -0500, John Cowan wrote:
>W. Eliot Kimber scripsit:
>
>> [D]ata
>> entities is the only feature of SGML not provided by XML that also has
>> significant *semantic* utility (as opposed to being a syntactic convenience
>> like markup minimization or the LINK feature) [...]
>
>And can you enlighten those of us who don't know SGML about
>the semantics of data entities? Thanks.

Sure.

SGML and XML both have the concept of data content notations, declared with the NOTATION declaration:

Notations allow you to give names to specific data types, that is, the rules that govern the interpretation and processing of a particular type of data. XML is an example of a data content notation, GIF is another.

For example, to refer to another XML document as a data entity, I would do this:

This entity "somedoc" is an unparsed entity in XML terms, meaning that it is not parsed in the context of the document that references it (even though it will of course be parsed as an independent document should it ever be processed).

In SGML, you can define attributes for notations just as you can for element types ("data attributes"). These attributes act as parameters to the processors that know how to interpret data governed by the notations. In SGML, you can specify data attributes either as part of a data entity ("unparsed entity") declaration or on an element that is also governed by a notation. With the WebSGML TC to SGML, you can also use data attributes as part of the declaration of attributes in order to define specific data types for attributes.

One use of data attributes is to associate attributes with data entities. The textbook use is parameters associated with graphic data entities, e.g.:

A typical SGML processor, when it encounters a reference to the entity big-graphic, will see if it knows of a processor that can process the notation "gif". It finds one and passes it the information from the entity declaration, including the data attributes, which the processor presumably takes as parameters. The presumption is that the notation has defined, as part of its formal documentation or definition, what the attributes should be. I.e., somewhere in the definition of this fictional gif notation (or the local processor associated with the gif notation), the documentation says something like "processors should accept height and width parameters from which they determine the presentation size of the graphic".

Data attributes can be used with elements that are "governed" by a notation. For example, say I want to define a query notation that I'll use to address things in some repository.
I first define a notation that represents the general query mechanism:

The resource identified by the external identifier should be the authoritative definition of what the notation is about, how to process it, etc. *IT SHOULD NOT BE A PROGRAM*. Programs should be associated with notations by mapping their external identifiers to programs, dlls, tool-provided functions, etc., through some tool-specific mechanism. For example, in PHyLIS, my HyTime engine, my intent is to provide a configuration file that lets you map notations to dynamic libraries or objects (e.g., COM objects, Java classes, Corba whatsits, etc.).

To make the query easy for authors to specify, I want to provide a few parameters that authors fill in to specify the query details. I do this by declaring some notation attributes to serve as the parameters to the query:

I then declare an element type that can be used to specify queries in documents:

Note the NOTATION attribute. This attribute defines the SelectData element as being "governed by" the notation MyQuery. This means that *after* the document is parsed, the processor will process this element type by looking up a processor associated with the notation "http://www.drmacro.com/notations/myquery.xml" (remembering that the external ID is the real name, the local name, MyQuery, is just a local proxy). It finds one, so it passes the whole SelectData element to it (that is, the element node constructed from the SelectData markup). The element's attributes are associated with the notation attributes simply by matching the names [the Data Attributes for Elements (DAFE) facility of the HyTime architecture provides machinery for resolving name conflicts, but you wouldn't normally need this extra stuff because you can just define your elements and notations so as to avoid the problem.]

The processor knows to expect to find the three attributes declared for the notation (because the notation serves the processor and thus reflects what it's defined as needing). It uses normal element processing (e.g., element.Attributes["table"]) to get the value of the attributes, does what it does, and returns the result. The processor takes the result and goes on doing whatever it's doing.

One way to think about data attributes is that they provide the binding between software components in the outside world and data objects in the document, either data entities or elements. Essentially, the processor for some data type defines its interface, in the API sense, and the notation attributes "implement" that interface by providing the necessary attributes.

Of course, it's up to particular processing systems to provide the actual integration APIs and configuration infrastructure so that processors and documents can take advantage of notations and data attributes, but it's pretty basic stuff.

I think that data attributes, especially when used with elements governed by notations, are really really valuable. I pushed very hard for their inclusion in XML but was unable to convince the rest of the WG. I hope that the XML developers will reconsider data attributes when XML is eventually revised. I think they're very valuable.

Cheers,

E.

--
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004
www.isogen.com
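
The part of this that XML 1.0 does keep (notations and unparsed entities, without data attributes) can be tried out with a small sketch. What follows is only an illustration under stated assumptions: it uses Python's xml.sax module rather than any tool named in the message, the declarations in DOC are made up for the example (the example.org identifier and the doc/figure names are hypothetical; only the names "gif" and "big-graphic" come from the message), and the mapping from a notation's system identifier to a processor stands in for the "tool-specific mechanism" described above.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Hypothetical document: these declarations are illustrative only.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ELEMENT doc EMPTY>
  <!ATTLIST doc figure ENTITY #IMPLIED>
  <!NOTATION gif SYSTEM "http://example.org/notations/gif">
  <!ENTITY big-graphic SYSTEM "big.gif" NDATA gif>
]>
<doc figure="big-graphic"/>
"""


class NotationDispatcher(ContentHandler, DTDHandler):
    """Records notation and unparsed-entity declarations, then hands each
    referenced entity to the processor registered for its notation."""

    def __init__(self, processors):
        super().__init__()
        self.processors = processors  # notation system id -> callable
        self.notations = {}           # notation name -> system id
        self.entities = {}            # entity name -> (system id, notation)
        self.results = []

    def notationDecl(self, name, publicId, systemId):
        self.notations[name] = systemId

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.entities[name] = (systemId, ndata)

    def startElement(self, name, attrs):
        # Simplification: a non-validating parser does not report attribute
        # types here, so any value naming a declared unparsed entity is
        # treated as a reference to it.
        for value in attrs.values():
            if value in self.entities:
                entity_sysid, notation = self.entities[value]
                processor = self.processors.get(self.notations.get(notation))
                if processor is not None:
                    self.results.append(processor(entity_sysid))


def describe_gif(system_id):
    # Stand-in for a real GIF processor.
    return "would render " + system_id


def run(doc=DOC):
    handler = NotationDispatcher(
        {"http://example.org/notations/gif": describe_gif})
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setDTDHandler(handler)
    parser.parse(io.BytesIO(doc))
    return handler
```

Keying the processor table on the notation's external identifier rather than on its local name follows the point made in the message that the external ID is the real name and the local name is only a proxy.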

From M.H.Kay at eng.icl.co.uk Fri Nov 13 10:18:08 1998
From: M.H.Kay at eng.icl.co.uk (Michael Kay)
Date: Mon Jun 7 17:06:31 2004
Subject: SAX Future (was Re: Oracle XML)
Message-ID: <004001be0eee$466ce6e0$7008e391@bra01wmhkay.bra01.icl.co.uk>

-----Original Message-----
From: david@megginson.com
To: XML Developers' List
Date: 12 November 1998 15:35
Subject: SAX Future (was Re: Oracle XML)

>Richard Anderson writes:
>
> > For interest, have there been any developments of a 'lower level'
> > SAX to expose DTD info too ?
>
>We've talked about it -- I need to be convinced that there is a big
>enough demand for that (it would be useful only for very specialised
>applications). I think that the following for SAX 1.0 are probably
>higher priorities:
>

My view:

- I think it is time to start talking about SAX 2.0
- I think David's list of requirements captures the main points
- I think we should get DTD info in while we're about it, though it can be an optional feature
- I'd add two more features:
  (a) an option to ask the parser not to break character data except at element boundaries. This is because it's so easy currently to write applications that are wrong because they rely on a parser behaviour that the SAX interface doesn't guarantee.
  (b) a "parser manager" to provide control of which parser is used. (This of course is a bit of software not just an interface! - I'll offer the parser manager in SAXON as a starter; it should be extended to allow a pile of parser filters to be configured)

I think we can manage compatibility as follows:

- We design 2.0 so that an application that conforms to SAX 1.0 also conforms to SAX 2.0
- A standard wrapper round a SAX 1.0 parser should enable it to conform to SAX 2.0, providing null/default implementations of the new features where necessary. (E.g. the response to the question "are you a validating parser?" is "maybe").

Mike Kay

From daveh at persimmon.co.uk Fri Nov 13 10:26:17 1998
From: daveh at persimmon.co.uk (Dave Halls)
Date: Mon Jun 7 17:06:31 2004
Subject: using xml/xsl/dom together
References: <026101be0e61$f2eae710$0200a8c0@mdaxke.mediacity.com>
Message-ID: <364C0931.556A30EB@persimmon.co.uk>

Mark D. Anderson wrote:

> related to the "where are we" thread, i'm wondering if there
> is enough in xml/xsl/dom to have a web page where a user can
> click on their desired appearance, and the page re-arranges, suppresses
> undesired content, and perhaps changes style?
> all on the client side? (i mean in theory based on specs).
>
> furthermore, could the xml have links inside, and based on the
> chosen style, the UA would decide to load and embed (or not)
> the linked-to resources?
>
> this capability roughly happens all the time on web sites
> (user "personalization"), but it isn't interactive on the page,
> and requires a roundtrip to the server every time to go fetch
> some pre-formatted html.

We put a demo up a year ago that does this on the client side using XMLJ, our Standard ML-based XML processing library.
This runs on the JVM in the browser. Uses SAX or DOM-built trees. See http://research.persimmon.co.uk/mlj/demos/stylesheets/index.html and http://research.persimmon.co.uk/mlj/demos/stylesheets/about.html for some blurb. Requires Netscape 4 or MSIE 4 (not Mac). I've recently ported the combinator parser to Java for the non-functional programmers. When I get shot of the short-term things I'm doing (I can't believe it's a year since I did the demo), I'll put something up to read/download. We've switched now to server-side XML processing, I find it more interesting (my opinion only of course) and there's loads to do. Also no hassle with browsers, they really are the most shoddy pieces of software ever written. Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Nov 13 12:01:54 1998 From: david at megginson.com (david@megginson.com) Date: Mon Jun 7 17:06:31 2004 Subject: SAX Future In-Reply-To: <004001be0eee$466ce6e0$7008e391@bra01wmhkay.bra01.icl.co.uk> References: <004001be0eee$466ce6e0$7008e391@bra01wmhkay.bra01.icl.co.uk> Message-ID: <13900.7613.719789.40250@localhost.localdomain> Michael Kay writes: > - I'd add two more features: (a) an option to ask the parser not to > break character data except at element boundaries. This is much easier to handle with a SAX filter that implements both Parser and DocumentHandler and buffers the character data. As a matter of fact, I'd like to formalise the idea of a SAX filter in SAX 1.0.1 (I think I'd like the first two digits to correspond to the XML version). 
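[Editorial sketch] The buffering filter described above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the SAX 1.0 Parser/DocumentHandler filter the thread has in mind: it uses org.xml.sax.helpers.XMLFilterImpl from the later SAX 2 API, and the class names CoalescingFilter and CoalesceDemo are hypothetical. The filter holds character data back and forwards it as a single characters() event when the next element boundary (or processing instruction, or end of document) arrives.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Demo driver (hypothetical): pushes split character data through the filter
// by hand, so no real parser is needed to see the effect.
class CoalesceDemo {
    static List<String> run() throws SAXException {
        List<String> seen = new ArrayList<>();
        CoalescingFilter filter = new CoalescingFilter();
        // The downstream handler records each event it actually receives.
        filter.setContentHandler(new DefaultHandler() {
            @Override public void startElement(String u, String l, String q, Attributes a) {
                seen.add("start:" + q);
            }
            @Override public void characters(char[] ch, int start, int length) {
                seen.add("text:" + new String(ch, start, length));
            }
            @Override public void endElement(String u, String l, String q) {
                seen.add("end:" + q);
            }
        });
        filter.startDocument();
        filter.startElement("", "p", "p", new AttributesImpl());
        filter.characters("Hello, ".toCharArray(), 0, 7); // a parser may split text
        filter.characters("world".toCharArray(), 0, 5);   // at arbitrary points
        filter.endElement("", "p", "p");
        filter.endDocument();
        return seen;
    }

    public static void main(String[] args) throws SAXException {
        System.out.println(run());
    }
}

// Hypothetical filter: joins adjacent characters() events into one.
class CoalescingFilter extends XMLFilterImpl {
    private final StringBuilder pending = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length); // hold the text back for now
    }

    // Forward any buffered text downstream as a single event.
    private void flush() throws SAXException {
        if (pending.length() > 0) {
            char[] text = pending.toString().toCharArray();
            pending.setLength(0);
            super.characters(text, 0, text.length);
        }
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        flush();
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        flush();
        super.endElement(uri, local, qName);
    }

    @Override
    public void processingInstruction(String target, String data) throws SAXException {
        flush();
        super.processingInstruction(target, data);
    }

    @Override
    public void endDocument() throws SAXException {
        flush();
        super.endDocument();
    }
}
```

In this sketch the downstream handler sees one text event between the start and end of the element, even though the text arrived in two pieces. Because the filter sits above the parser, the parser itself needs no new option.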
In fact, a string of filters (Chain of Responsibility in Design-Pattern-speak) can give users an awful lot of control for very little work. Other interesting filters would automatically include XML documents embedded with XLink into the parent document's event stream, resolve namespaces (John Cowan has already done this), normalise whitespace using xml:space, selectively filter out elements, etc. All of this can be done above the parser/driver level, so there's no need to complicate SAX by including it in the core spec itself (except for including a canonical Filter interface). > (b) a "parser manager" to provide control of which parser is > used. (This of course is a bit of software not just an interface! - > I'll offer the parser manager in SAXON as a starter; it should be > extended to allow a pile of parser filters to be configured) Agreed, though the design may have to vary dramatically from language to language. > I think we can manage compatibility as follows: > - We design 2.0 so that an application that conforms to SAX 1.0 also > conforms to SAX 2.0 > - A standard wrapper round a SAX 1.0 parser should enable it to conform to > SAX 2.0, providing null/default implementations of the new features where > necessary. (E.g. the response to the question "are you a validating parser?" > is "maybe"). Yes, but how do we accomplish this? Do we invent a new package name for SAX 1.0.1 to avoid collision? All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From j.j.wester at kpn-telecom.nl Fri Nov 13 13:33:45 1998 From: j.j.wester at kpn-telecom.nl (Wester, JJ (ICT)) Date: Mon Jun 7 17:06:32 2004 Subject: Creation of XML documents Message-ID: <199811131331.OAA19472@sat-relay2.pc.telecom.ptt.nl> I am involved in a similar project: we want to replace the current inter-application communication techniques (RPC's, batch files etc) by XML document exchanges. This because XML makes the interfaces more flexible. We need methods and tools for the programmers who have to adapt old applications, or build new applications that communicate through XML to other applications. The problem is that these programmers are used to RPC's etc. If they have to use DOM, that would probably mean a lot of education and a lot of changes in current applications. Also a lot of programs are written in COBOL, which makes it more difficult to use DOM I think. The question is: what methods and tools can you offer these programmers? What is feasible? DOM? Which implies that the programmer constructs and sends an XML document instead of calling a remote procedure. DCE IDL? The programmer uses RPC's (as he does now), which are translated by underlying middleware to XML documents? But is this a practical solution? What about using UML for your data definitions and then create stubs? Any ideas? ---------------------------------------------------------------------------- -------------------------------- Johan Wester Tel. +31 50 585 16 01 KPN Telecom Fax. +31 50 585 22 40 P.O. 
Box 958 j.j.wester@kpn-telecom.nl 9700 AZ Groningen > -----Original Message----- > From: Kurt Helenelund [SMTP:kurt@simberg.com] > Sent: donderdag 5 november 1998 14:03 > To: xml-dev@ic.ac.uk > Subject: Creation of XML documents > > I am working on a project where we will use XML to exchange information > between > applications in different government agenices. We want to implement both > on-line access > between applications and asynchronous store & forward type of > mechanisms. > > I understand that there are 'lots' of good XML parsers (we have tried > some) out there and that SAX and DOM are > the prefered ways for applications to 'read' XML structures. I would > like to ask if there's anyone > that have the opposite problem i.e. for applications to create XML > documents on-the-fly. Of course > the developer could 'hand code' the XML structures which is error prone > and booring . I am looking > for something (API, lib) so that we could avoid this. > > I would like to have a 'library' to which the application developer > could say 'using this DTD please > instantiate a XML document and help me to fill it in'. > > Any solutions? > > > -- > _______________________________________________________________________ > Kurt Helenelund Mobile: +358 50 555 0192 > Simberg & Partners Home: +358 9 294 0313 > Mielikintie 7B Fax: +358 9 294 0314 > FIN-04230 KERAVA, Finland Email: kurt@simberg.com > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Fri Nov 13 14:27:02 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:32 2004 Subject: Stylesheets considered limiting (was Re: XLink - where are we?) In-Reply-To: <3.0.5.32.19981112234611.0098ecd0@amati.techno.com> References: <364B51E5.E83031DC@locke.ccil.org> <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.5.32.19981112075504.008d5b10@amati.techno.com> Message-ID: <3.0.5.32.19981113082146.009bd5e0@amati.techno.com> At 11:46 PM 11/12/98 -0600, W. Eliot Kimber wrote: >At 04:23 PM 11/12/98 -0500, John Cowan wrote: >>W. Eliot Kimber scripsit: >> >>> [D]ata >>> entities is the only feature of SGML not provided by XML that also has >>> significant *semantic* utility (as opposed to being a syntactic convenience >>> like markup minimization or the LINK feature) [...] >> >>And can you enlighten those of us who don't know SGML about >>the semantics of data entities? Thanks. I meant to write "data attributes is the only feature of SGML..." and didn't notice the mistake even when the question above was asked. I answered the question I thought was being asked (that is, the question that would have been asked had I not mistyped). Of course, XML has data entities, it just calls them "unparsed entities". 
In SGML, data entities may have data attributes; in XML they may not. My apologies for any confusion this may have caused. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
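[Editorial sketch] The point about unparsed entities can be seen from the parser side. The following is a minimal sketch, assuming the JDK's bundled SAX parser and a made-up document: the entity name big-picture, the notation gif, and the example.org URL are placeholders. An XML unparsed entity declaration reaches the application through DTDHandler.unparsedEntityDecl with only a name, public and system identifiers, and a notation name, so there is no place for SGML-style data attributes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class UnparsedEntityDemo {
    // Hypothetical document: all names and the URL are placeholders.
    static final String DOC =
          "<!DOCTYPE doc [\n"
        + "  <!ELEMENT doc EMPTY>\n"
        + "  <!ATTLIST doc pic ENTITY #IMPLIED>\n"
        + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
        + "  <!ENTITY big-picture SYSTEM \"http://example.org/big.gif\" NDATA gif>\n"
        + "]>\n"
        + "<doc pic=\"big-picture\"/>";

    static List<String> run() throws Exception {
        List<String> decls = new ArrayList<>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // DefaultHandler implements DTDHandler; only the entity callback is overridden.
        reader.setDTDHandler(new DefaultHandler() {
            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                // These four arguments are all the declaration carries.
                decls.add(name + " -> notation " + notationName);
            }
        });
        reader.parse(new InputSource(new StringReader(DOC)));
        return decls;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

The parser only reports the declaration; it does not fetch the entity. Anything like a width or height for the picture would have to travel some other way, for example as ordinary attributes on the referencing element.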
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From RJA at arpsolutions.demon.co.uk Fri Nov 13 15:02:00 1998 From: RJA at arpsolutions.demon.co.uk (Richard Anderson) Date: Mon Jun 7 17:06:32 2004 Subject: Creation of XML documents Message-ID: <000601be0f16$a5fbeac0$c5010180@p197> > I would like to have a 'library' to which the application developer > could say 'using this DTD please > instantiate a XML document and help me to fill it in'. I've got a C++ toolkit under development that sort of meets these requirements. It has a SAX+DOM interface. (The COM variant will follow very shortly for use in VB etc.) Using the SAX interface you can build a 'template' DOM and then fill in the missing bits. The toolkit has *no* dependencies on MSIE etc.
Here's some sample code for creating an XML EMAIL:

************ CODE SECTION START ****************
// Create the document and the root <EMAIL> element
pDoc = pDOMAPI->createDocument();
pRootElement = pDoc->createElement( L"EMAIL" );

// <From> element with three attributes and the sender address as text
pFrom = pDoc->createElement( L"From" );
pRootElement->appendChild( pFrom );
pFrom->setAttribute( L"Priority", L"High" );
pFrom->setAttribute( L"DeliveryReceipt", L"Yes" );
pFrom->setAttribute( L"ReturnReceipt", L"Yes" );
pText = pDoc->createTextNode( L"RJA@arpsolutions.demon.co.uk" );
pFrom->appendChild( pText );

// <To> element
pTo = pDoc->createElement( L"To" );
pRootElement->appendChild( pTo );
pText = pDoc->createTextNode( L"enquires@arpsolutions.demon.co.uk" );
pTo->appendChild( pText );

// <Subject> element
pSubject = pDoc->createElement( L"Subject" );
pRootElement->appendChild( pSubject );
pText = pDoc->createTextNode( L"XML/DOM/SAX C++ Toolkit" );
pSubject->appendChild( pText );

// A comment node, then <Body> with two text nodes
pComment = pDoc->createComment(L"Main body of Email follows");
pRootElement->appendChild( pComment );
pBody = pDoc->createElement( L"Body" );
pRootElement->appendChild( pBody );
pText = pDoc->createTextNode( L"Seems OK so far." );
pBody->appendChild( pText );
pText = pDoc->createTextNode( L"I'll have to try harder to break it." );
pBody->appendChild( pText );

// Note: pPI is created here but is not attached to the tree in this sample
pPI = pDoc->createProcessingInstruction(L"PI", L"That's all folks!" );

// <CC> is built last and inserted in front of <To>
pCC = pDoc->createElement( L"CC" );
pText = pDoc->createTextNode( L"xml_toolkit@arpsolutions.demon.co.uk" );
pCC->insertBefore( pText, NULL );
pRootElement->insertBefore( pCC,pTo );

// Serialise the tree starting at the root element
CStdioWideStream stream;
pDOMAPI->writeXML( pRootElement, &stream );
*********** CODE SECTION END ***************

If you're interested I can send you the C++ alpha toolkit. Regards, Richard.

***********************************************
* E-Mail mailto:RJA@arpsolutions.demon.co.uk *
* WEB http://www.arpsolutions.demon.co.uk *
***********************************************
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Nov 13 16:17:26 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:32 2004 Subject: SAX Future References: <004001be0eee$466ce6e0$7008e391@bra01wmhkay.bra01.icl.co.uk> <13900.7613.719789.40250@localhost.localdomain> Message-ID: <364C5BCD.7D16DD34@locke.ccil.org>

david@megginson.com wrote:

> This is much easier to handle with a SAX filter that implements both
> Parser and DocumentHandler and buffers the character data.

Which someone using my ParserFilter class can write in < 1 day.

> In fact, a string of filters (Chain of Responsibility in
> Design-Pattern-speak) can give users an awful lot of control for very
> little work.

You betcha. And because each filter component can be specified by a Java system property, chaining can be done at run time very neatly. The highest level filter is specified as org.xml.sax.parser, and each lower level is specified by <classname>.parser, thus:

    -D org.xml.sax.parser=com.domain.chance
    -D com.domain.chance.parser=com.domain.evers
    -D com.domain.evers.parser=com.domain.tinker
    -D com.domain.tinker.parser=com.domain.SAXDriver

sets up a chain where SAXDriver (the real parser) passes to tinker, which passes to evers, which passes to chance, :-) which passes to the application. Of course all this can be set up programmatically too.

> All of this can be done above the parser/driver level, so
> there's no need to complicate SAX by including it in the core spec
> itself (except for including a canonical Filter interface).

Absolutely.

> Yes, but how do we accomplish this? Do we invent a new package name
> for SAX 1.0.1 to avoid collision?
I think that would be better than new class names. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Nov 13 16:19:22 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:32 2004 Subject: Walking the DOM (was: XML APIs) References: Message-ID: <364C5C6F.971FD06B@locke.ccil.org> Miles Sabin wrote: > OK, I agree that this check is O(1). However, that's only > because the granularity of the check is so coarse: a single > document-level timestamp will cause a lot of unnecessary > invalidation ... Agreed. But this isn't so important if there are few active iterators (= iterators that will ever be resumed), which was my point. > I suspect that it would make the performance > of modifying a document via iterators unacceptably poor. Here I think the JDK 1.2 java.util.Iterator class is useful: it has a "remove" method which removes the last element iterated to in a safe way, or raises an exception if the underlying container is read-only. > Moving to per-node timestamps would reduce the amount of > unnecessary invalidation, and preserve the O(1) check, > but at the cost of making tree modifications O(log n). Just so. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Nov 13 17:18:05 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:32 2004 Subject: SAX Future Message-ID: <3.0.32.19981113090224.00bc2d00@pop.intergate.bc.ca> At 11:18 AM 11/13/98 -0500, John Cowan wrote: >sets up a chain where SAXDriver (the real parser) passes >to tinker, which passes to evers, which passes to chance, :-) which >passes to the application. Note to the other 83% of the readership: this is an erudite historical-baseball allusion. We await with bated breath the Roy Riegels metaphors. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Fri Nov 13 17:26:34 1998 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:06:32 2004 Subject: SAX Future In-Reply-To: <13898.64310.340243.918644@localhost.localdomain> References: <012c01be0e15$91bae7c0$c5010180@p197> <13898.64310.340243.918644@localhost.localdomain> Message-ID: I think the main reasons that SAX has achieved such popularity as it has (even being implemented by Oracle) are that it is very simple to add SAX support to a parser, it arrived early and it has remained stable. Any attempts at making a SAX 2.0 (or 1.0.1) puts two of these reasons in immediate jeopardy, and also creates a potential for endless compatibility problems. 
On reflection I think we should be very careful in developing a new version of SAX and attempt to break as little as possible when/if doing so.

Some of the things that have been suggested can, I think, be safely added without jeopardizing compatibility, since they are additions above the driver level:

 - Parser filters
 - Driver management
 - More library functionality

I support all these and think they would be very useful. Some ideas for how these could be done might be taken from the Python SAX extensions, which are described at:

However, some of the other suggestions are a bit more troublesome:

 - A handler for basic lexical information
 - A handler for DTD information
 - The option for parsers not to break up PCDATA
 - Depending on the implementation, driver querying might also end up on this list

I think the lexical handler would be very useful, if only because it provides information about the DOCTYPE declaration. I'm not so sure about the handler for DTD information, which would most likely be much harder to develop and also rather difficult to use in any sensible way. As for the PCDATA option I agree with David in preferring a ParserFilter for this purpose.

In spite of the potential problems I do support the driver querying and management part. It's been very useful to us in Python, and now that I'm doing some work in Java as well I wish I had it there too. Being able to say saxexts.XMLValParserFactory.make_parser() and knowing that you'll get a validating parser if one is installed or else an exception is really convenient.

Finally, there is the question of backwards compatibility. As far as I can see the extensions I labelled 'safe' can be done with no harm to compatibility. The two new handlers are slightly trickier, but might be incorporated in the same manner that SAX for Python did its extensions: a separate sub-interface of Parser called ExtendedParser.
One could then add an implementation of ExtendedParser that wrapped SAX 1.0 parsers and threw SAXUnsupportedExceptions when attempts were made to call the new methods. This would avoid the need for a new package name entirely. The driver querying methods might get a separate interface or it might be added to the ExtendedParser like in Python. These are my 0.02 NOK, --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From sbagora at inacom.com Fri Nov 13 17:45:06 1998 From: sbagora at inacom.com (Sudhanshu Bagora) Date: Mon Jun 7 17:06:32 2004 Subject: Question about sending and receiving XML via ASP's Message-ID: <862566BB.00617291.00@omaclnmta01.inacom.com> Hi, I am currently working on a project which passes XML back and forth via ASP's. The scenario is as follows: 1) Page1.asp is creating an XMLBlock and passing it on to Page2.asp 2) Page2.asp saves the XMLBlock in a file and loads it, parses it and does some processing. Then Page2.asp forms another new ReturnXMLBlock which needs to be passed back to Page1.asp The method i am using to pass XMLBlock from Page1.asp to Page2.asp is ..... XMLDoc.URL = "http://.... Page2.asp?XMLBlock=" + XMLBlock .... Parse ReturnXMLBlock ...... and am expecting ReturnXMLBlock to be loaded in XMLDoc after this above statement is processed. This above operation is not successful. I have tried to do it in IE4.0 as well as IE5.0 The question i have is: a) Is this valid? Can i expect to get a new ReturnXMLBlock after the XMLDoc.URL statement is processed. b) If it is valid - What am i doing wrong? c) What would the syntax be in Page2.asp where i am returning the ReturnXMLBlock? 
In Script tags or plain XML tags?? Any help would be greatly appreciated. Thanks in advance. -sudhanshu- xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Nov 13 17:53:30 1998 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:06:32 2004 Subject: Infoseek announces XML search engine Message-ID: <3.0.32.19981113095212.00b56100@pop.intergate.bc.ca> http://www.wired.com/news/news/technology/story/16221.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dave at userland.com Fri Nov 13 18:06:23 1998 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:06:32 2004 Subject: Infoseek announces XML search engine In-Reply-To: <3.0.32.19981113095212.00b56100@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981113100459.00f0ab60@scripting.com> Tim, when I spoke with Chris, I asked him to look at this piece I wrote last year, about a pragmatic way search engines could use XML to broaden their reach and catch more recent developments on the web. 
Here's the piece: http://www.scripting.com/davenet/97/12/realWorldXml.html And here's the prototype of the file the search engines could read: http://www.scripting.com/siteChanges.xml We did this stuff before I was on this list, since people often ask for applications, and since otherwise the search engine-XML connection is not well understood, I wanted to add this to the record, I think it's easy and could be an important way for XML to quickly make the web more useful. Dave -------------------------------------- http://www.userland.com/directory.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Nov 13 18:38:01 1998 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:06:32 2004 Subject: Data attributes (was: Stylesheets considered limiting) References: <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.5.32.19981112075504.008d5b10@amati.techno.com> <3.0.5.32.19981112234611.0098ecd0@amati.techno.com> Message-ID: <364C7D0B.BBEA4607@locke.ccil.org> W. Eliot Kimber wrote: > Sure. Thank you. Your writeup is very clear as always. > > width CDATA "100" > height CDATA "100"> > > [ width="640" height="480"]> Okay. Since XML cannot have references to unparsed entities*, the above can be treated as syntactic sugar. 
Specifically, is effectively syntactic sugar for: but bypassing the SGML/XML restriction that the validity or invalidity of attributes cannot depend on the values of other attributes. If "big-picture" were not a "gif" entity, the "gif.width" and "gif.height" attributes would be invalid. It would be very easy to create a ParserFilter on top of a validating XML parser (so that we reliably get attribute types) that processed the above declarations (wrapped in PIs) and generated extra attributes of the type shown above. *(Come to think of it, references to unparsed entities could be seen as syntactic sugar for an anonymous empty element with a single anonymous ENTITY attribute.) > notation NOTATION(MyQuery) MyQuery > table CDATA #REQUIRED > select-on CDATA #REQUIRED > where (name|ssnum|phone) #REQUIRED> In this case, though, the notation attributes seem to blend with the regular attributes, and are redeclared in the ATTLIST declaration for the element. What happens in the case of an element type that can be governed by alternative notations? -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Fri Nov 13 18:43:37 1998 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:06:32 2004 Subject: Back to XObjects was XLink - where a References: > <364B26B1.8A6AC5B@eng.sun.com> Message-ID: <364C7AEE.810E91B3@eng.sun.com>

Responding to Graham's comment about binding elements to classes, I wrote:

> > I've got an API driven approach, which could be driven based on config
> files using an XML syntax like the one I suggested a while back. But
> the API can also be driven in other ways, too. ("Element Factory", an
> interface that can be implemented in many ways. I'll post more later.)

It's "later", after a busy Thursday and a morning cuppa Java ...

    // to customize DOM tree building
    public interface ElementFactory {
        // like DOM createElement, used with namespaces disabled
        ElementEx createElementEx (String tag);

        // ... namespace aware version
        ElementEx createElementEx (String uri, String tag);
    }

    // implemented by elements and attributes for namespace support
    public interface NamespaceScoped extends Node {
        // returns "local part" of node name, no prefix
        String getLocalName ();

        // returns URI of namespace, null for default
        String getNamespace ();
    }

    // extended element functionality
    public interface ElementEx extends Element, NamespaceScoped {
        // namespace-aware
        String getAttribute (String uri, String name);

        // namespace-aware
        Attr getAttributeNode (String uri, String name);

        // basic delegation hook
        Object getUserObject ();

        // basic delegation hook
        void setUserObject (Object obj);

        // has more, for other purposes ...
        ...
    }

That "user object" bit is a delegation hook; it's just stored for use by applications.
Supports many-to-one association of elements to objects in an application model, no backlink. I don't know if it suffices, but I suspect that any more would become specific to some problem domain. For the moment, I'm not listing the DocumentBuilder bit here. Basically, you can feed an ElementFactory into a DocumentBuilder and it'll use it when building the DOM tree; if you don't, it has defaults. And there's a simple table-driven implementation that's easily customized. (Maybe it'll learn a standard "config file" syntax -- in XML of course!) Other implementations are clearly possible, including ones using heuristics like those James Anderson posted. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From avirr at LanMinds.Com Fri Nov 13 19:24:14 1998 From: avirr at LanMinds.Com (Avi Rappoport) Date: Mon Jun 7 17:06:32 2004 Subject: Infoseek announces XML search engine In-Reply-To: <3.0.32.19981113095212.00b56100@pop.intergate.bc.ca> Message-ID: At 9:53 AM -0800 11/13/98, Tim Bray wrote: > http://www.wired.com/news/news/technology/story/16221.html Thanks, Tim, for the pointer. The Infoseek Ultraseek page is at: but there is no info there yet on the version 3.0. Avi ________________________________________________________________ Avi Rappoport, Web Site Search Tools Maven: Guide to Site Indexing and Local Search Engines: xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Fri Nov 13 21:59:30 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:33 2004 Subject: Data attributes (was: Stylesheets considered limiting) In-Reply-To: <364C7D0B.BBEA4607@locke.ccil.org> References: <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.5.32.19981112075504.008d5b10@amati.techno.com> <3.0.5.32.19981112234611.0098ecd0@amati.techno.com> Message-ID: <3.0.5.32.19981113155708.00a017c0@amati.techno.com> At 01:40 PM 11/13/98 -0500, John Cowan wrote: > [ width="640" height="480"]> Okay. Since XML cannot have references to unparsed entities*, the above can be treated as syntactic sugar. Specifically, is effectively syntactic sugar for: Not necessarily. What if I have this situation: ... find something Here, the notation of the element governs the query--the graphic it references is just presentation stuff in this case. So I can't put the entity's attributes on the element as well--they represent two different semantic domains, that of the query (represented by the element) and that of the graphic, represented by the entity. Without data attributes and the ability to specify them as part of an entity declaration, I cannot keep these two domains clearly and unambiguously separate. 
>> > notation NOTATION(MyQuery) MyQuery >> table CDATA #REQUIRED >> select-on CDATA #REQUIRED >> where (name|ssnum|phone) #REQUIRED> > >In this case, though, the notation attributes seem to blend with >the regular attributes, and are redeclared in the ATTLIST declaration >for the element. What happens in the case of an element type that >can be governed by alternative notations? If the attributes are explicitly declared, its attributes have to be the union of the attributes for all the possible governing notations. However, if you only declare the notations and not the attributes (or even the element type), then there's no redundancy of declaration as long as you specify the notation. You can only specify one notation per element instance (which might be a design bug now that I think about it--I can thing of cases where the same element might be reasonably governed by different notations relevant to different processing domains). One of the things that the notation attributes do is make it clear that a particular attribute of an element should be (or can be) relevent to the notation that governs the element. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Sat Nov 14 05:31:53 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:33 2004 Subject: XLink - where are we? [tiny amount of frustration] In-Reply-To: <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> References: <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19981113233212.009b5300@amati.techno.com> At 09:03 PM 11/11/98, Peter Murray-Rust wrote: >I am probably tilting at windmills but I am attempting something like this >in JUMBO - I have about 5 approaches to styles without stylesheets. >(a) redisplay it as raw XML. Not as silly as it sounds for many documents. >(b) pretty-print it and display as XML. Extremely useful for many documents >(c) Reformat start tags as bold and add NL after PCDATA. Works pretty well >for many documents. >(d) map every element onto a Java class. >(e) allow the user to customise some or all elements with styles. I shall >use Swing for the rendering. Then, I suppose I could write the styles out >as XSL if anyone cares. Below is a DSSSL style sheet that implements a simple form of option (b). I have tested this with the latest HyBrick and with Jade. It should be usable by itself (no other catalog configuration needed). Choose it as the style for those documents that don't already have one. 
It's simple enough that you should be able to figure out how to adjust it to your own taste without having the DSSSL spec to hand. Note that HyBrick's Xlink support doesn't require any style work, so documents using this style sheet will still exhibit their Xlink-defined behavior.

Cheers,

E.

--- cut here ---
]>
;------------------------------------------------------------
; "Tag View" DSSSL Spec. Provides a default style for
; SGML or XML documents.
;
; Author: W. Eliot Kimber
;
;
; Change History:
;
; $Header$
;
; $Log$
;
;--------------------------------------------------------------------------
(define debug
  (external-procedure "UNREGISTERED::James Clark//Procedure::debug"))

(define *rgb-color-space*
  (color-space "ISO/IEC 10179:1996//Color-Space Family::Device RGB"))

(define midnight-blue-color
  (color *rgb-color-space* (/ 25 255) (/ 25 255) (/ 112 255)))

(define primary-blue-color
  (color *rgb-color-space* (/ 25 255) (/ 25 255) (/ 255 255)))

(define sea-green-color
  (color *rgb-color-space* (/ 46 255) (/ 139 255) (/ 87 255)))

(define red-color
  (color *rgb-color-space* (/ 255 255) (/ 0 255) (/ 0 255)))

;--------------------------------------------------
; Define general-purpose functions:
;--------------------------------------------------
(define (copy-attributes nd indent)
  (let loop ((atts (named-node-list-names (attributes nd)))
             (resultstr ""))
    (if (null? atts)
        resultstr
        (loop (cdr atts)
              (let* ((name (car atts))
                     (value (attribute-string name nd)))
                (if value
                    (string-append resultstr "&#RE;" indent name "=\"" value "\"")
                    resultstr))))))

(define (ancestors nl)
  (node-list-map
    (lambda (snl)
      (let loop ((cur (parent snl))
                 (result (empty-node-list)))
        (if (node-list-empty? cur)
            result
            (loop (parent cur) (node-list cur result)))))
    nl))

(define (copy-string string count)
  (let loop ((resultstr "")
             (count count))
    (if (equal? count 0)
        resultstr
        (loop (string-append resultstr string) (- count 1)))))

(declare-initial-value font-family-name "iso-monospace")

(root
  (make scroll
    (process-children)))

(default
  (let ((indent (copy-string " " (node-list-length (ancestors (current-node))))))
    (sosofo-append
      (make paragraph
        color: sea-green-color
        (literal indent "<" (gi (current-node))
                 (copy-attributes (current-node) (string-append indent " "))
                 ">"))
      (make paragraph
        lines: 'asis
        (process-children))
      (if (node-property 'must-omit-end-tag? (current-node))
          (empty-sosofo)
          (make paragraph
            color: sea-green-color
            (literal indent "</" (gi (current-node)) ">"))))))
;--- end of spec --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
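
For anyone without a DSSSL engine handy, the same "tag view" idea can be approximated in a few lines of Python. This is only a sketch under stated assumptions: it uses the standard-library xml.dom.minidom parser, the function name tag_view is invented here, and unlike the style sheet above it always writes an end tag, indents text content, and sorts attributes by name. It is not a port of the DSSSL spec.

```python
from xml.dom import minidom


def tag_view(xml_text):
    """Return a plain-text 'tag view': one tag per line, indented by depth.

    Hypothetical helper, not part of any library. Attributes are written
    one per line, one space deeper than the start tag they belong to.
    """
    doc = minidom.parseString(xml_text)
    lines = []

    def walk(node, depth):
        indent = " " * depth
        if node.nodeType == node.ELEMENT_NODE:
            start = indent + "<" + node.tagName
            # sorted() keeps the output independent of attribute order
            for name, value in sorted(node.attributes.items()):
                start += "\n" + indent + " " + name + '="' + value + '"'
            lines.append(start + ">")
            for child in node.childNodes:
                walk(child, depth + 1)
            lines.append(indent + "</" + node.tagName + ">")
        elif node.nodeType == node.TEXT_NODE:
            text = node.data.strip()
            if text:  # skip whitespace-only text between elements
                lines.append(indent + text)

    walk(doc.documentElement, 0)
    return "\n".join(lines)


if __name__ == "__main__":
    print(tag_view('<a x="1"><b>hi</b><c/></a>'))
```

Comments, processing instructions and CDATA sections are silently dropped by this sketch; a real viewer would want to show them too.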
From sblackbu at erols.com Sat Nov 14 11:52:17 1998
From: sblackbu at erols.com (Samuel R. Blackburn)
Date: Mon Jun 7 17:06:33 2004
Subject: Creation of XML documents
Message-ID: <006201be0fc5$42e80b30$01010101@sammy>

If a non-validating C++ library will help, take a look at the XML classes in the freeware Win32 Foundation Classes (WFC) at http://ourworld.compuserve.com/homepages/sam_blackburn/wfc.htm

HTH,
Sam

-----Original Message-----
From: Richard Anderson
To: XMLDEV ; Wester, JJ (ICT)
Date: Friday, November 13, 1998 10:26 AM
Subject: Re: Creation of XML documents

>> I would like to have a 'library' to which the application developer
>> could say 'using this DTD please
>> instantiate a XML document and help me to fill it in'.
>
>I've got a C++ toolkit that is under development that sort of meets these
>requirements. It has a SAX+DOM interface. (The COM variant will follow
>very shortly for use in VB etc.)
>
>Using the SAX interface you can build a 'template' DOM and then fill in the
>missing bits.
>
>The toolkit has *no* dependencies on MSIE etc.
>
>Here's some sample code for creating an XML EMAIL:
>
>************ CODE SECTION START ****************
>
>pDoc = pDOMAPI->createDocument();
>pRootElement = pDoc->createElement( L"EMAIL" );
>pFrom = pDoc->createElement( L"From" );
>pRootElement->appendChild( pFrom );
>pFrom->setAttribute( L"Priority", L"High" );
>pFrom->setAttribute( L"DeliveryReceipt", L"Yes" );
>pFrom->setAttribute( L"ReturnReceipt", L"Yes" );
>
>pText = pDoc->createTextNode( L"RJA@arpsolutions.demon.co.uk" );
>pFrom->appendChild( pText );
>
>pTo = pDoc->createElement( L"To" );
>pRootElement->appendChild( pTo );
>pText = pDoc->createTextNode( L"enquires@arpsolutions.demon.co.uk" );
>pTo->appendChild( pText );
>
>pSubject = pDoc->createElement( L"Subject" );
>pRootElement->appendChild( pSubject );
>pText = pDoc->createTextNode( L"XML/DOM/SAX C++ Toolkit" );
>pSubject->appendChild( pText );
>
>pComment = pDoc->createComment(L"Main body of Email follows");
>pRootElement->appendChild( pComment );
>pBody = pDoc->createElement( L"Body" );
>pRootElement->appendChild( pBody );
>
>pText = pDoc->createTextNode( L"Seems OK so far." );
>pBody->appendChild( pText );
>
>pText = pDoc->createTextNode( L"I'll have to try harder to break it." );
>pBody->appendChild( pText );
>
>pPI = pDoc->createProcessingInstruction(L"PI",
>                                        L"That's all folks!" );
>
>pCC = pDoc->createElement( L"CC" );
>pText = pDoc->createTextNode( L"xml_toolkit@arpsolutions.demon.co.uk" );
>pCC->insertBefore( pText, NULL );
>pRootElement->insertBefore( pCC,pTo );
>
>CStdioWideStream stream;
>pDOMAPI->writeXML( pRootElement, &stream );
>
>*********** CODE SECTION END ***************
>
>If you're interested I can send you the C++ alpha toolkit.
>
>Regards,
>
>Richard.
>
>***********************************************
>* E-Mail mailto:RJA@arpsolutions.demon.co.uk *
>* WEB http://www.arpsolutions.demon.co.uk *
>***********************************************
>
>
>
>xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Sat Nov 14 13:55:25 1998 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:06:33 2004 Subject: XSA: Almost ready to go Message-ID: I'm pleased to announce that one XML application that actually uses XML for exchanging informaation over the network and automatically acting on it is almost ready for use. The XSA system I proposed here earlier this year[1] has now been polished, documented and had software developed for it. The system is now going into beta for a while until I'm satisfied that all bugs have been shaken out of the software and that nobody will come forth and demand changes to the system. This should hopefully only take a couple of weeks. So anyone who is interested is invited to have a look at the XSA web site at Any feedback on the site, the system or the code is most welcome. When I first suggested this there was some talk about coordinating this with Trove, LSM and OSD. These avenues have been explored, and have so far only lead to adding support for OSD to the system, mainly because these systems had other goals than XSA has. --Lars M. [1] xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Sat Nov 14 16:28:37 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:33 2004 Subject: ISO/IEC Standardization Activities: Procedural Background Message-ID: <3.0.5.32.19981114101714.008a33a0@amati.techno.com> [Warning: boring but important standards stuff. Please read to the end if you can.] The ISO group responsible for the SGML family of standards was recently promoted from a working group to an ISO subcommittee (SC). What was ISO/IEC JTC1/WG4 (nee SC18/WG8)is now ISO/IEC JTC1/SC34 Information Technology -- Document Description and Processing Languages (see ). We had our first meeting as an SC this last week. At this meeting we approved the organization of SC34 into three working groups: WG1 covers representation languages (SGML, SDIF, etc.), WG2 covers fonts and styles (DSSSL, etc.), and WG3 covers "relations", that is, hyperlinking and the like (HyTime, ISO HTML, ISMID, Topic Maps). Note that this does *NOT* include media formats like MHEG or JPEG, APIs like CORBA, or networking protocols. I am acting convenor of WG3 (and will likely be accepted as convenor when permanent convenors are selected). As convenor of WG3 I'd like to stress the desire of WG3 that the standards we produce be both compatible with and useful for XML-based data and applications. I'd also like to stress our desire to be as open and inclusive as we possibly can be within the constraints ISO imposes. The constraints on us are entirely procedural--we have no secrecy requirements as far as I know (but I could be wrong--this is a bit of a bureaucratic morass). 
The only real limitation on participation is the ability to attend the formal ISO meetings. However, anyone is free to review and comment on any of the standards under development and published as official SC34 documents. It is up to the editors of each standard to define how they do their development and can involve anyone they feel like. The procedural constraint is on how comments get officially communicated to the editors. Once a standard has reached a state of near completion, the ISO policies are designed to ensure that the editorial process is as visible as possible. It is designed to make last-minute changes visible to the voting member bodies (and observing public) and help ensure that major design decisions are not made at the last minute. The current ISO stages for draft standards are "committee draft", "final committee draft", and "draft international standard" (DIS). Committee drafts are works in progress and can be quite dynamic as the editors do their work. Final committee drafts are supposed to be true final drafts, requiring only minor changes before progressing to draft international standard. However, several final committee drafts can be issued before a standard goes to DIS stage [I know, it's doublespeak--I didn't make this stuff up, I just try to work within the rules I've been given.] The key thing is that once a standard reaches final committee draft stage, the editors are constrained to only making changes that correspond to comments submitted by member bodies. The member body comments are formal SC34 documents that everyone can see. This helps ensure that there are no surprises when the DIS draft is sent for ballot. Unfortunately, it also slows down the editorial process because each new FCD requires a new ballot period of 2-4 months. What does this mean to you, the average XML-Dev techie? 
It means that if you are interested in one of WG3's standards and have some constructive comments on it AND that standard is in final committee draft state, THEN you must forward your comments to a member of one of the national delegations to SC34. If the standard is just a committee draft then you can communicate with the editors directly. If you don't know how to contact the editors, send me mail and I'll hook you up.

If you are not in the U.S. and you do know who your SC34 representative or head of delegation is, then you can just forward your comments to them. They should evaluate your comments and, if they are factually correct and also consistent with the overall national opinions on the standard, integrate them with any others they have and submit them as part of the formal process. If your comments are not consistent with the national opinion, they should work with you to understand the technical issues to make sure your concerns have gotten a fair consideration. They should also give your name to the editors of the standard so that they can communicate with you to better understand your comments. [Note: the time order of these actions is not exposed by the formal process.]

NOTE: Each national body defines its own rules and policies for how it develops its responses to ballots and requests for comment. The above is the suggested policy for WG3 participants. I can't guarantee that any particular member body will be receptive to unsolicited comments. However, I think it's safe to say that the currently active member bodies are all receptive to comments from anyone with legitimate and constructive technical insight.

If you are in the U.S., or you don't know who your SC34 delegates are (and I don't know why you would--we're not exactly international celebrities), you can forward your comments to me in my capacity as a member of the U.S.
delegation to SC34 and I will either incorporate your comments with ours or pass them on to the appropriate national body, if there is one (not all ISO members send delegations to SC34). Our interest in WG3 is to produce standards that are relevant, useful, and technically sound. I want to make the process of developing our standards as open and productive as possible. To that end, I will periodically make announcements here of WG3 activity that I think is relevant or of potential interest to the XML developer community. Thanks, Eliot Acting Convenor, ISO/IEC JTC1/SC34/WG3 --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Sat Nov 14 16:28:41 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:33 2004 Subject: Announce: Topic Map Standard out for Final Committee Draft Ballot Message-ID: <3.0.5.32.19981114102847.008a5e10@amati.techno.com> The Topic Navigation Map (Topic Map) standard, ISO/IEC 13250, is out for ballot to become a final committee draft, the last stage before becoming a full standard. You can find the draft being balloted at . This ballot lasts until the end of February. However, in order to progress the standard as quickly as possible, we'd like to get comments to the editors before the end of the year. The Topic Map standard is similar to RDF in some ways (but has an essentially different focus and intended domain of application). It is also designed to be implementable using Xlink. It defines a relatively simple (but still powerful) approach to representing rich relationships among information objects. If you are working with Xlink, especially extended links, or thinking about how you might use them productively, I urge you to take a look at the standard. I have started putting together some examples, both to test the standard and to use in my Xlink class. I will be posting these as soon as possible, hopefully within the week. NOTE: some of the prose in the standard is currently a bit dense and abstract. the U.S., France, and Norway have developed comments against the current draft that should help to make things much clearer (we hope). If you're trying to read the standard and not getting it, send me mail and I'll see what I can do to help. 
The editors are already working on fixing these problems, but the fixes can't be made officially visible at this time because of process constraints.

We all think that topic maps could be a really interesting application of Xlink. They are, I think, relevant to some of the work that Peter Murray-Rust has been doing on developing online glossaries and the like.

Cheers,

Eliot
--
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dave at userland.com Sat Nov 14 17:04:22 1998 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:06:33 2004 Subject: Announce: Topic Map Standard out for Final Committee Draft Ballot In-Reply-To: <3.0.5.32.19981114102847.008a5e10@amati.techno.com> Message-ID: <3.0.5.32.19981114084806.00ef89b0@scripting.com> What kind of applications would we use the Topic Map structures for? It always helps me to understand this kind of stuff if I can understand a compelling application for it. Dave At 10:28 AM 11/14/98 -0600, you wrote: >The Topic Navigation Map (Topic Map) standard, ISO/IEC 13250, is out for >ballot to become a final committee draft, the last stage before becoming a >full standard. You can find the draft being balloted at >. This ballot lasts until >the end of February. However, in order to progress the standard as quickly >as possible, we'd like to get comments to the editors before the end of the >year. > >The Topic Map standard is similar to RDF in some ways (but has an >essentially different focus and intended domain of application). It is also >designed to be implementable using Xlink. It defines a relatively simple >(but still powerful) approach to representing rich relationships among >information objects. > >If you are working with Xlink, especially extended links, or thinking about >how you might use them productively, I urge you to take a look at the >standard. I have started putting together some examples, both to test the >standard and to use in my Xlink class. I will be posting these as soon as >possible, hopefully within the week. 
> >NOTE: some of the prose in the standard is currently a bit dense and >abstract. the U.S., France, and Norway have developed comments against the >current draft that should help to make things much clearer (we hope). If >you're trying to read the standard and not getting it, send me mail and >I'll see what I can do to help. The editors are already working on fixing >these problems, but they can't be made officially visible at this time >because of process constraints. > >We all think that topic maps could be a really interesting application of >Xlink. They are, I think, relevant to some of the work that Peter >Murry-Rust has been doing on developing online glossaries and the like. > >Cheers, > >Eliot >-- >
>W. Eliot Kimber, Senior Consulting SGML Engineer >ISOGEN International Corp. >2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 >www.isogen.com >
> >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > -------------------------------------- http://www.userland.com/directory.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Sat Nov 14 18:48:27 1998 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:06:33 2004 Subject: Data attributes (was: Stylesheets considered limiting) References: <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.5.32.19981112075504.008d5b10@amati.techno.com> <3.0.5.32.19981112234611.0098ecd0@amati.techno.com> <3.0.5.32.19981113155708.00a017c0@amati.techno.com> Message-ID: <364DD0B0.62F3BCC@mecomnet.de> the situation described below is a case where a namespace prefix is well used to ensure unambiguous naming. if one uses the notation or entity name as the name prefix it awards the same expressive power as the implicit qualification of data attributes. W. Eliot Kimber wrote: > > ... > > Not necessarily. What if I have this situation: > > > > > > ... 
> find something > > Here, the notation of the element governs the query--the graphic it > references is just presentation stuff in this case. So I can't put the > entity's attributes on the element as well--they represent two different > semantic domains, that of the query (represented by the element) and that > of the graphic, represented by the entity. Without data attributes and the > ability to specify them as part of an entity declaration, I cannot keep > these two domains clearly and unambiguously separate. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From creitzel at mediaone.net Sat Nov 14 20:24:08 1998 From: creitzel at mediaone.net (Charles Reitzel) Date: Mon Jun 7 17:06:33 2004 Subject: New SAX Version Message-ID: <199811142023.PAA05204@chmls06.mediaone.net> 1) Agree w/ David Megginson's priorities: >... the following for SAX 1.0 are probably higher priorities: > >1. A proper specification for SAX (JavaDoc just doesn't cut it). >2. A test suite for SAX conformance. >3. Canonical SAX interfaces for Perl (on top of XML:Parser) and C++. I volunteer to work on a C++ implementation. 2) Some additional SAX features (e.g. driver/filter manager, Element/Object factory) are "orthogonal" to the core interfaces. Why not create separate packages for these? A default rule like "Additional SAX packages will depend only on the core package" will keep things manageable. 3) Isn't it too soon to freeze any XML related specs because of the "installed base"? I think the time tested approach is to maintain existing interfaces for backward compatibility for release or three after they are officially deprecated. 
Sometimes this means that new names must be used where you would prefer not to but, hey, that's life in the big city. An update once a year won't thrash anybody's code. So, it's better to get in fixes for those few wee problems with the core API's than wait for the installed base to grow by an order of magnitude and it really will have to freeze. 4) Is it a good idea to hard-wire each SAX version to an XML version? I think ODBC would be a pretty good model to follow here. Each version of the driver manager (ODBC itself) defines how it will interact with drivers written for earlier versions of ODBC. ODBC v2 can handle v2 and v1, etc. ODBC pretty much froze a v2 because a) it was pretty complete and b) the installed base became huge. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lisarein at finetuning.com Sat Nov 14 23:37:14 1998 From: lisarein at finetuning.com (Lisa Rein) Date: Mon Jun 7 17:06:33 2004 Subject: Oracle XML References: <364AB030.EFAC2ADE@sophia.inria.fr> <364B6296.3598F440@infinet.com> Message-ID: <364E030D.FD0F3655@finetuning.com> actually -- all this says to me is that they never even ran it through a parser...(not even once :-) Not good for a high-profile document that many others will attempt to emulate.... I say - bad dog - no biscuit! lisa Tyler Baker wrote: > > "Philippe Le H?garet" wrote: > > > K. Travis Walsh wrote: > > > > > > Oracle officially anounced today it's XML strategy. > > > > > > For those of you who didn't know there is a java parser built into this > > > new release of Oracle 8i. > > > > Like netscape, They can't write well-formed XML document. > > > > is not a well-formed XML document. 
> > > > Hey give them a break. Forgetting that little question mark is something > everyone has done from time to time when handcrafting XML documents. I know, > cause I have done this many times. If XML did not inherit SGML's syntax, I bet > we would all of a much cleaner syntax to work with with less user errors as a > result. > > Tyler > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Sun Nov 15 05:03:08 1998 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:06:33 2004 Subject: Oracle XML In-Reply-To: <364E030D.FD0F3655@finetuning.com> References: <364AB030.EFAC2ADE@sophia.inria.fr> <364B6296.3598F440@infinet.com> Message-ID: <199811150502.AAA05690@hesketh.com> Tyler Baker wrote: > > "Philippe Le H?garet" wrote: > > > K. Travis Walsh wrote: > > > > > > Oracle officially anounced today it's XML strategy. > > > > > > For those of you who didn't know there is a java parser built into this > > > new release of Oracle 8i. > > > > Like netscape, They can't write well-formed XML document. > > > > is not a well-formed XML document. > > > > Hey give them a break. Forgetting that little question mark is something > everyone has done from time to time when handcrafting XML documents. 
I know, > cause I have done this many times. If XML did not inherit SGML's syntax, I bet > we would all of a much cleaner syntax to work with with less user errors as a > result. > > Tyler All right, already. Why is this worthy of a posting to XML-Dev rather than a polite note to their Webmaster? Proving that you know more about XML than Oracle and Netscape, however easy and gratifying that may be, doesn't further the knowledge base of the XML community in any way that I at least consider significant. Could we help people and companies fix problems rather than broadcast problems onto the largest screen possible? Pointing out problems like this only makes XML's oddities seem like more of a barrier to 'proper' development than they really are. Simon St.Laurent Dynamic HTML: A Primer / XML: A Primer Cookies / Sharing Bandwidth (November) Building XML Applications (December) http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dave at userland.com Sun Nov 15 13:05:01 1998 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:06:33 2004 Subject: Definitive test suite? (was Re: Oracle XML) In-Reply-To: <199811150502.AAA05690@hesketh.com> References: <364E030D.FD0F3655@finetuning.com> <364AB030.EFAC2ADE@sophia.inria.fr> <364B6296.3598F440@infinet.com> Message-ID: <3.0.5.32.19981115050420.00f3b990@scripting.com> Simon, it's going to get a lot worse before it gets better. There are a billion parsers being written right now, all will accept different flavors of XML. There still is no definitive test suite of XML files they have to parse. 
If we're serious about having compatibility and offering users real choices, we can probably still fix this, but soon, the cat will be out of the bag, as they say, and it'll be too late. No individual commercial vendor has the credibility or motivation to establish this test suite. But a famous book author or two could do it. ;-> Dave -------------------------------------- http://www.userland.com/directory.html From eliot at dns.isogen.com Sun Nov 15 14:51:09 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:33 2004 Subject: Announce: Topic Map Standard out for Final Committee Draft Ballot In-Reply-To: <3.0.5.32.19981114084806.00ef89b0@scripting.com> References: <3.0.5.32.19981114102847.008a5e10@amati.techno.com> Message-ID: <3.0.5.32.19981115084029.0099ebf0@amati.techno.com> At 08:48 AM 11/14/98 -0800, Dave Winer wrote: >What kind of applications would we use the Topic Map structures for? > >It always helps me to understand this kind of stuff if I can understand a >compelling application for it. A topic map consists, fundamentally, of two kinds of things: topics and associations. A topic is an object that represents a single rhetorical topic or subject. For example, "XML parser" might be one topic in a topic map about XML. A topic serves to associate the abstract idea of the topic with occurrences of that topic: XML Parser Notice that this serves to impose the semantic label "XML Parser" onto the occurrences addressed by the occur element. Thus a topic can assert that a given object is an occurrence of some kind of thing.
This lets you construct a classification or descriptive layer on top of existing data. The topics essentially represent opinions about the data. Different topic map authors might express different opinions about the same data. Because the form the opinions are expressed in is standardized and consistent (topics), they can be reasonably compared to some degree. Because the topics are expressed formally as hyperlinks (here using XLink, but also doable using HyTime), they are naturally navigable using whatever hyperlinking support you have lying about (e.g., HyBrick, PHyLIS, etc.). Associations relate topics to each other. To continue the XML topic map idea, I might have a relation "standard-interface-for" that I use to relate the topic "XML parser" to the topic "SAX": In many ways, this is like RDF: you can impose properties onto data objects and relate data objects together using typed links. It may be that topic maps are one way to express RDF abstractions; I don't know (I don't know enough about RDF). But while RDF seems to be designed primarily to support the addition or representation of metadata about objects, topic maps are designed for the creation of knowledge bases imposed on data of any type, and, in particular. In any case, Topic Maps are not intended to compete with RDF; the two are, in essence, different views of the same abstraction: objects with properties and relations among them. So what would you use topic maps for? I think one compelling use is as an annotative or descriptive layer over things like encyclopedias, dictionaries, databases, and the like. They might be used to enhance management information systems by providing a simple but rich and standardized way to capture analysis applied to existing data, such as market reports, sales numbers, etc. Topic maps can be a way to augment search and retrieval by providing a form of index over a larger, more amorphous body of data.
Many documents can be turned into topic maps simply by labeling the existing components as topics. For example, you can think of a command reference document as a topic map where every command description is a topic. The reason for standardizing this concept is that it lets you build generic topic map engines that understand the specific properties of topics and associations and can therefore manage knowledge of those properties in a crisp and efficient way, making the information available to processing systems. It also allows the meaningful and automatic merging of topic maps because it's clear how the components of each relate to each other as objects. I'm not sure I've answered the question very well, but maybe I've provided enough of a taste for what topic maps do that applications will suggest themselves. I've left out a number of important and interesting details in the discussion above, but I think I've conveyed the flavor. Cheers, Eliot --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
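[Editorial sketch, not part of the original message.] The description above (topics that point at occurrences, typed associations between topics, and automatic merging of maps) can be illustrated with a small in-memory model. This is only a rough sketch under stated assumptions: the class names `Topic`, `Association` and `TopicMap`, the method names, and the occurrence addresses are all hypothetical, chosen for illustration, and are not taken from the Topic Map draft or from any real topic map engine. Merging by topic name is likewise an assumption made to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One subject, plus addresses of objects asserted to be occurrences of it."""
    name: str
    occurrences: list = field(default_factory=list)


@dataclass(frozen=True)
class Association:
    """A typed relation among topics (hypothetical model; roles are not tracked)."""
    relation: str
    members: tuple


class TopicMap:
    """Hypothetical container; merging by topic name is an assumption of this sketch."""

    def __init__(self):
        self.topics = {}
        self.associations = set()

    def add_topic(self, name, occurrences=()):
        # Create the topic if needed, then record any new occurrence addresses.
        topic = self.topics.setdefault(name, Topic(name))
        for address in occurrences:
            if address not in topic.occurrences:
                topic.occurrences.append(address)
        return topic

    def associate(self, relation, first, second):
        # Both ends of an association must themselves be topics.
        self.add_topic(first)
        self.add_topic(second)
        self.associations.add(Association(relation, (first, second)))

    def related(self, name, relation):
        # Names of topics linked to `name` by an association of the given type.
        found = set()
        for assoc in self.associations:
            if assoc.relation == relation and name in assoc.members:
                found.update(m for m in assoc.members if m != name)
        return sorted(found)

    def merge(self, other):
        # Same-named topics pool their occurrences; associations are unioned.
        for topic in other.topics.values():
            self.add_topic(topic.name, topic.occurrences)
        self.associations |= other.associations


if __name__ == "__main__":
    mine = TopicMap()
    mine.add_topic("XML parser", ["doc1.xml#sec2"])  # made-up address
    mine.associate("standard-interface-for", "SAX", "XML parser")
    print(mine.related("XML parser", "standard-interface-for"))
```

Because every map uses the same two building blocks, two authors' maps over the same data can be compared or merged mechanically, which is the point made above about standardizing the form in which the opinions are expressed.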
From eliot at dns.isogen.com Sun Nov 15 14:51:10 1998 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:06:34 2004 Subject: Data attributes (was: Stylesheets considered limiting) In-Reply-To: <364DD0B0.62F3BCC@mecomnet.de> References: <3.0.5.32.19981111210459.009ba510@amati.techno.com> <3.0.1.16.19981111210320.302765a0@pop3.demon.co.uk> <3.0.5.32.19981110221631.008ec900@amati.techno.com> <3.0.1.16.19981111021455.6f2f8962@pop3.demon.co.uk> <3.0.5.32.19981111030746.009f0c10@pophost.fsc.fujitsu.com> <3.0.1.16.19981110185833.09efb210@pop3.demon.co.uk> <3.0.32.19981103153315.00af8b30@pop.intergate.bc.ca> <3.0.5.32.19981112075504.008d5b10@amati.techno.com> <3.0.5.32.19981112234611.0098ecd0@amati.techno.com> <3.0.5.32.19981113155708.00a017c0@amati.techno.com> Message-ID: <3.0.5.32.19981115084926.008d6c60@amati.techno.com> At 07:49 PM 11/14/98 +0100, james anderson wrote: >the situation described below is a case where a namespace prefix is well used >to ensure unambiguous naming. if one uses the notation or entity name as the >name prefix it awards the same expressive power as the implicit qualification >of data attributes. I don't think it works though, because you don't know for sure that a given prefixed attribute is associated with a given entity--besides entity names are not, as far as I know, definable as namespace prefixes (that is, if a prefix happens to be spelled the same as an entity name, there's nothing that defines a necessary relationship between the two).
If entity names were automatically name-space names, it would mean that the name-space name space and the entity name space were irrevocably combined. This would be pretty odd, I think, and would certainly violate the basic principle of distinct name spaces staying distinct. By contrast, with the notation attribute mechanism, the element attribute name space and the notation attribute name space are only combined when the author or DTD designers do so by choice. And in any case, the Data Attributes for Elements facility provides a mechanism for disambiguating names in the two spaces when there are clashes. Cheers, E. --
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
From jtauber at jtauber.com Sun Nov 15 16:54:12 1998 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:06:34 2004 Subject: Definitive test suite? (was Re: Oracle XML) Message-ID: <005501be10b8$1a0c6780$0300000a@othniel.cygnus.uwa.edu.au> -----Original Message----- From: Dave Winer >There still is no definitive test suite of XML files they have to parse. Well, there's James Clark's. >No individual commercial vendor has the credibility or motivation to >establish this test suite. OASIS is just the group to have the credibility AND motivation and I believe they are doing it. James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net
From tyler at infinet.com Sun Nov 15 19:56:06 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:06:34 2004 Subject: Oracle XML References: <364AB030.EFAC2ADE@sophia.inria.fr> <364B6296.3598F440@infinet.com> <364E030D.FD0F3655@finetuning.com> Message-ID: <364F320D.7A35A804@infinet.com> Lisa Rein wrote: > actually -- all this says to me is that they never even ran it through a > parser...(not even once :-) Or worse yet, their parser is non-conformant... Tyler From tyler at infinet.com Sun Nov 15 20:05:31 1998 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:06:34 2004 Subject: Definitive test suite? (was Re: Oracle XML) References: <364E030D.FD0F3655@finetuning.com> <364AB030.EFAC2ADE@sophia.inria.fr> <364B6296.3598F440@infinet.com> <3.0.5.32.19981115050420.00f3b990@scripting.com> Message-ID: <364F3447.BDA9440@infinet.com> Dave Winer wrote: > Simon, it's going to get a lot worse before it gets better. There are a > billion parsers being written right now, all will accept different flavors > of XML. > > There still is no definitive test suite of XML files they have to parse.
If > we're serious about having compatibility and offering users real choices, > we can probably still fix this, but soon, the cat will be out of the bag, > as they say, and it'll be too late. > > No individual commercial vendor has the credibility or motivation to > establish this test suite. But a famous book author or two could do it. ;-> > > Dave Jim Clark's test suite, though not perfect, is a wonderful test suite that has helped me out immensely. If there should be any standard test suite for XML 1.0, I would suggest Mr. Clark's. Tyler From richard at cogsci.ed.ac.uk Sun Nov 15 23:20:57 1998 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:06:34 2004 Subject: Definitive test suite? (was Re: Oracle XML) In-Reply-To: James Tauber's message of Mon, 16 Nov 1998 00:50:48 +0800 Message-ID: <858.199811152320@doyle.cogsci.ed.ac.uk> > >There still is no definitive test suite of XML files they have to parse. > > Well, there's James Clark's. Absolutely. I have found this very useful. It has a particularly good range of not-well-formed cases. The main area where it is lacking is well-formed but invalid documents. -- Richard
From SMUENCH at us.oracle.com Mon Nov 16 00:48:50 1998 From: SMUENCH at us.oracle.com (Steve Muench) Date: Mon Jun 7 17:06:34 2004 Subject: Oracle XML Message-ID: <199811160048.QAA19006@mailsun3> Thanks, Simon. Indeed if the previous critic reads forward to Figure 9, he'll see the properly formatted XML PI in the example. I had fixed those two typos in a final review but they didn't make it into what got pushed out to the public website. David Megginson quietly informed me the URL for SAX in the document needed updating, too, so both corrections should be imminent. Take care. ____________________________________________________________________________ Steve | Consulting PM & XML Technology Evangelist | smuench@oracle.com Muench | Java Business Objects Dev Team | geocities.com/~smuench From Suli.Ding at geis.ge.com Mon Nov 16 01:51:33 1998 From: Suli.Ding at geis.ge.com (Ding, Suli (GEIS)) Date: Mon Jun 7 17:06:34 2004 Subject: Document to XML Convertor Message-ID: Peter, Very good question. I spent part of this weekend making "doc2xml" recognize an XML template file in addition to the one you saw and used before. A new option -xXML_template_file is added. The XML template must have