From jarle.stabell at dokpro.uio.no Mon Feb 1 00:04:20 1999
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun 7 17:08:21 2004
Subject: XML query engines
Message-ID: <01BE4D7F.66BCB0E0.jarle.stabell@dokpro.uio.no>

Glassbox wrote:
>
> >There exists a neat trick which enables simple SQL-Select queries answering
> >for two given nodes, whether one is a subnode of the other, and also how
> >many levels deep, in constant time, assuming you do some simple
> >preprocessing on the structure. (Assigning two integers to each node in the
> >tree).
>
> Can you please explain it precisely ?

I will only do a very short explanation, or else I will be much too tired (and/or late) at work tomorrow! :-) (it's already Monday here)

(But I think you either will understand it directly, or have some fun playing with the details, it's a simple and elegant idea.)

The basic idea is to assign an interval (using a pair of integers) to each node, and assigning them such that Interval(n1) contains Interval(n2) if and only if n2 is a subnode of n1.

Then you only need to do two integer compares in order to check whether a given node is a subnode of another.

You can think of the interval of a parent node p as the union of all the intervals of the subnodes. (or projection)

To assign the integers, you may start with assigning the "left" side of the root node the number 1 (or whatever!), and traverse the tree and increase the number as appropriate. (When you come to a leaf node, you assign both a "leftside" number and "rightside" number)

(I believe different strategies for when to increase the number may give slightly different extra info when comparing the intervals of two nodes, but I don't remember whether the differences are substantial. You may increase by 1 on each "step".)

Cheers,
Jarle Stabell
Digital Logikk AS

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cowan at locke.ccil.org Mon Feb 1 00:35:56 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:21 2004
Subject: XML query engines
In-Reply-To: <01BE4D7F.66BCB0E0.jarle.stabell@dokpro.uio.no> from "Jarle Stabell" at Feb 1, 99 01:08:38 am
Message-ID: <199902010124.UAA14220@locke.ccil.org>

Jarle Stabell scripsit:

> The basic idea is to assign an interval (using a pair of integers) to each
> node, and assigning them such that Interval(n1) contains Interval(n2) if
> and only if n2 is a subnode of n1.

Or, equivalently:

Assign a sequentially increasing number to each *tag* (start-tag or end-tag) in the document, treating an empty tag as a start-tag followed by an end-tag. Then e1 is a descendant of e2 iff e1.start > e2.start and e1.end < e2.end. Also, e1 is a left sibling of e2 (and e2 is a right sibling of e1) iff e1.end + 1 = e2.start; e1 is the leftmost child of e2 iff e1.start = e2.start + 1. Modeling the child/parent relationship is not so easy, and requires iteration.

--
John Cowan cowan@ccil.org
e'osai ko sarji la lojban.
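
The numbering described in the two messages above can be sketched roughly as follows. This is only an illustration under stated assumptions: Python and its xml.etree.ElementTree module are choices made for this sketch (the thread itself talks about SQL), and the names Span, number_tags, is_descendant, is_left_sibling, is_leftmost_child and levels_below are hypothetical. The depth field is an extra counter kept by the sketch to answer the "how many levels deep" part of the original question.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Tag numbers for one element (all names here are illustrative)."""
    name: str
    start: int   # number given to the start-tag
    end: int     # number given to the end-tag
    depth: int   # 0 for the root; an extra counter kept by this sketch


def number_tags(root):
    """Number every start-tag and end-tag sequentially, in document order."""
    spans = []
    counter = 0

    def visit(elem, depth):
        nonlocal counter
        counter += 1                  # the start-tag
        start = counter
        for child in elem:
            visit(child, depth + 1)
        counter += 1                  # the end-tag (an empty tag gets n, n + 1)
        spans.append(Span(elem.tag, start, counter, depth))

    visit(root, 0)
    return spans


def is_descendant(e1, e2):
    """e1 is a descendant of e2: two integer comparisons."""
    return e1.start > e2.start and e1.end < e2.end


def is_left_sibling(e1, e2):
    """e1 is the sibling immediately to the left of e2."""
    return e1.end + 1 == e2.start


def is_leftmost_child(e1, e2):
    """e1 is the first child of e2."""
    return e1.start == e2.start + 1


def levels_below(e1, e2):
    """How many levels e1 lies below e2, or None if it is not a descendant."""
    return e1.depth - e2.depth if is_descendant(e1, e2) else None


# Element names are unique in this sample, so they can serve as dict keys.
doc = ET.fromstring("<doc><a><b/><c/></a><d/></doc>")
spans = {s.name: s for s in number_tags(doc)}
for s in sorted(spans.values(), key=lambda s: s.start):
    print(s)
```

In a relational table the same three integers would simply be columns, and each test above becomes a comparison in a WHERE clause. As noted in the thread, the price is that inserting an element means renumbering everything after it.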

From tbray at textuality.com Mon Feb 1 00:53:33 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:21 2004
Subject: XML query engines
Message-ID: <3.0.32.19990131165229.00b3dc30@pop.intergate.bc.ca>

At 08:24 PM 1/31/99 +73900, John Cowan wrote:
>Assign a sequentially increasing number to each *tag* (start-tag or end-tag)
>in the document, treating an empty tag as a start-tag followed by an
>end-tag. Then e1 is a descendant of e2 iff e1.start > e2.start
>and e1.end < e2.end. Also, e1 is a left sibling of e2 (and e2 is
>a right sibling of e1) iff e1.end + 1 = e2.start; e1 is the leftmost
>child of e2 iff e1.start = e2.start + 1. Modeling the child/parent
>relationship is not so easy, and requires iteration.

This structure has all sorts of advantages; that's how the Open Text SGML-savvy search engine of yore used to run. Fast as hell, equal access to any & all elements without performance penalty. But hard to update. -T.

From b.laforge at jxml.com Mon Feb 1 02:03:48 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:08:21 2004
Subject: XML query engines
Message-ID: <002601be4d86$7b59ed40$c9a8a8c0@thing2>

>Modeling the child/parent
>relationship is not so easy, and requires iteration.

Why not add a third number, depth?
An element's children are those within range and with a depth 1 greater.

(Speaking entirely from ignorance here--never learned SQL.)

Bill

From jjaakkol at cs.Helsinki.FI Mon Feb 1 06:04:25 1999
From: jjaakkol at cs.Helsinki.FI (Jani Jaakkola)
Date: Mon Jun 7 17:08:21 2004
Subject: XML query engines
In-Reply-To: <3.0.32.19990131165229.00b3dc30@pop.intergate.bc.ca>
Message-ID:

On Sun, 31 Jan 1999, Tim Bray wrote:
> This structure has all sorts of advantages; that's how the
> Open Text SGML-savvy search engine of yore used to run. Fast as
> hell, equal access to any & all elements without performance
> penalty.

That is also how sgrep works (the two integers are actually indexes into the SGML/XML-files, but the idea is the same).

> But hard to update.

That too.

- Jani

From Matthew.Sergeant at eml.ericsson.se Mon Feb 1 10:30:07 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:21 2004
Subject: Interesting Monday.
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136ADA@eukbant101.ericsson.se>

Boy, you people sure can write when something stirs you up... It's 10:10am and I've only just got through my backlog of XML-Dev mail...
Well, as the person who introduced the topic "Will XML eat the web?", I feel I should just add some points of note. I thank everyone who has contributed to this topic though.

Firstly, I think there is still an issue with processing power and XML, although I can see that my system is poorly designed. Time for a rethink...

The area where I can foresee potential problems is in e-commerce. Take an e-commerce transaction processing company that's moved to an XML transaction format. They don't have a shop web site, they just process credit card transactions for other sites. I imagine they are going to need to process hundreds of transactions per second. I don't for a second suggest that they store the XML as the primary data format (store it as a backup as suggested here) - it should immediately be put into an RDBMS. But to do that they have to parse each transaction. There's no caching that can go on here. Luckily that's their problem and not mine.

My problem was slightly different. I needed to be ready for the 5.0 browsers (probably IE5, although I'd prefer NS5), and XML seemed ideal because we would be displaying/editing documents that look like data (or data that looks like a document if you like). We really needed an object database, but I needed to get moving quickly (a typical web project: "Can we have it yesterday"). Learning an object database wasn't a possibility. I already knew XML.

So I looked at it like this - we could have it 2 ways:

1) Store XML now, process into HTML now, Transmit XML in the future.
2) Store in RDBMS now, process into HTML now, process into XML in the future.

#1 looked like a nicer solution because it gives performance gains in the future, which #2 doesn't really (except perhaps XML is a lighter weight format to transmit than HTML). However this, it appears, is not the right way to go because RDBMS->*ML is always faster than XML->HTML. That's a lesson learned, and I thank you for it.
Some of the points about caching are great when you're reading 1 XML file multiple times, but we're talking about 400 - 1000 XML files being accessed and constantly changed. A nicer solution would be an OODB. It's probably time to go shopping...

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

From Matthew.Sergeant at eml.ericsson.se Mon Feb 1 10:40:06 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:21 2004
Subject: Interesting Monday.
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136ADB@eukbant101.ericsson.se>

I thought I should add: The topic "Will XML eat the web" was only designed to stir up interest. I certainly don't believe XML is doomed to failure. I've used it with fantastic success so far, aside from my speed issues. For example, it was a godsend in converting a database from MSSql to Postgresql.

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

From paagarwal at hss.hns.com Mon Feb 1 11:06:11 1999
From: paagarwal at hss.hns.com (parul agarwal)
Date: Mon Jun 7 17:08:21 2004
Subject: XSL Draft?
Message-ID: <01BE4E15.32F4B410.paagarwal@hss.hns.com>

Hi,

I am new to XML developers list. I was looking at XSL. Does any one have an idea on its status? Has it still not become a standard? It was proposed more than a year back. Are there existing stylers in the market?

Thanks in anticipation
Parul

From rbourret at ito.tu-darmstadt.de Mon Feb 1 11:39:49 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:08:21 2004
Subject: Compound Documents - necessary for success?
Message-ID: <01BE4DDF.134A6140@grappa.ito.tu-darmstadt.de>

Marcus Carr wrote:

> With all due respect Roger, I think that the problem is that we're both asking
> questions and with few exceptions, nobody's answering. In my own case, I assume
> that this is due to the fact that:
>
> a) creating compound documents with fragments using the same DTD as the parent
> may cause problems, but that there would always be a better way to handle such
> documents,
>
> b) nobody's sure whether this will be a problem once XLink, XPointer, XML
> Fragments and X??
have spun their magic,
>
> c) I've not clearly explained what I think the problem is,
>
> d) I'm missing the point so totally that nobody feels that it even merits a
> reply,
>

I've been following this conversation with interest. I'll hazard two guesses for the lack of answers. First is (b) -- schemas and fragments are likely to answer some, but not all, of these questions. Second is that these questions are on or ahead of the bleeding edge, so it's not surprising that nobody has answers yet.

I think that many of us have a notion of a "compound document" and "reusing schemas" but that, for most of us, these notions don't go much beyond the actual words and a hazy, utopic, AI-intensive dream that XML documents will somehow magically recombine themselves to solve all of our problems.

Let's look at a simple example. Suppose we have a DTD for NBA players:

Now suppose we also have a DTD for heights:

What I think a lot of people would like is to automagically combine these two DTDs so that the following document is valid:

Joe Tall Iowa Talls 3 meters

This does not currently work for two reasons. First, there is no way to express that a document is valid under two different DTDs. Second, the above document is clearly not valid under either of the above DTDs.

To create such a document under the current spec, we need to rewrite players.dtd:

%height;

There are two important things to notice here:

1) We got nothing for free. That is, we had to write a new DTD because we have a new file type, and the new file type (DTD) is different from either of the previous file types. In Roger's case, he needs to generate new DTDs dynamically, as was mentioned in an earlier message.

2) When we wrote the new DTD, a *human* made the decision about where was legal. Anybody figuring out a foolproof way for a machine to do this usefully -- that is, without defining the content model of all elements as ANY -- will probably get a Turing Award for AI.
Without knowing much about fragments, it appears these have more to do with the delivery of pieces of an XML document rather than assembling and validating pieces from multiple documents. In particular, requirement 12 of the XML Fragment Interchange Requirements states that, "Issues involved with the possible "return" of any fragment to its original context and the determination of the possible validity of the "returned" fragment in its original context are beyond the scope of this activity." However, I have no doubt that the fragments project will turn up some interesting ideas about compound documents.

In schema languages, the current state of the problem is to generalize the step:

%height;

That is, to define a general syntax that makes it easy to reuse parts (generally elements and attributes, but possibly any part) of other schemas without bringing in all of the second schema. This may not sound too exciting, but it is very useful.

I personally think that anything more utopian than this is going to require, at the very least, a new definition of validity. One such definition was that proposed in this thread: that each subdocument is validated under its own DTD and the overall document is not validated but merely checked for well-formedness. This obviously is a specific case, but interesting nonetheless, as it suggests a useful application for partial validity.

(As an aside, anybody figuring out an algorithm by which compound documents such as that shown above are "valid" under multiple DTDs and still work with existing tools would significantly advance the field. Personally, I'm not too hopeful.)

So for the moment, don't be disappointed by the lack of answers. You're just ahead of us.

-- Ron Bourret

From rbourret at ito.tu-darmstadt.de Mon Feb 1 14:16:57 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:08:21 2004
Subject: Another errata?
Message-ID: <01BE4DF5.095E8BF0@grappa.ito.tu-darmstadt.de>

Tim Bray wrote:

> I repeat: in the sense the spec uses the word namespace, an unprefixed
> attribute is NOT IN ANY NAMESPACE.

I'm happy to live with this interpretation -- it's just that it comes as a complete surprise to me (and apparently to others as well). In this respect, how anybody can read A.2 and determine that prefixed attributes belong to a namespace and unprefixed attributes do not belong to a namespace is beyond me.

One very important consequence of this interpretation is that namespace-aware applications need to be sure they don't look for namespace-prefixed local attribute names and namespace-aware SAX and DOM implementations need to be careful that the namespace name passed for local attributes is null.

Although it's probably too late, a clarification would be welcome, especially since I can find nothing outside of A.2 that talks about whether attributes (prefixed or unprefixed) belong to a namespace. For example, the first paragraphs of 5.1 and 5.2 state that:

"The namespace declaration is considered to apply to the element where it is specified and to all elements within the content of that element..."

and:

"A default namespace is considered to apply to the element where it is declared (if that element has no namespace prefix), and to all elements with no prefix within the content of that element. ... Note that default namespaces do not apply directly to attributes."
This clarifies that default namespaces do not apply to attributes, but does not tell us anything about what does apply to attributes, prefixed or not. I suggest changing the first paragraph of 5.1 to the following:

"The namespace declaration is considered to apply to the element **and all prefixed attributes** where it is specified and to all elements **and prefixed attributes** within the content of that element, unless overridden by another namespace declaration with the same NSAttName part. **Namespace declarations do not apply to unprefixed attributes.**"

-- Ron Bourret

From rbourret at ito.tu-darmstadt.de Mon Feb 1 14:24:29 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:08:21 2004
Subject: Namespace prefixes in attribute declarations
Message-ID: <01BE4DF6.184018E0@grappa.ito.tu-darmstadt.de>

According to the namespace spec, the following is legal:

What does it mean and why is the prefix legal for the attribute? DTDs have no way to declare global attributes and local attributes are, by definition, unprefixed. Thus, it seems this declares a local attribute that can only be used as a global attribute:

The only thing I can think of is so global attributes can be declared in and validated against XML 1.0 DTDs by declaring them on each element.

-- Ron Bourret
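
As an illustration of the interpretation discussed above, here is how one namespace-aware parser reports prefixed and unprefixed attributes. This is a small sketch under stated assumptions: Python's xml.etree.ElementTree is a choice made for this example and is not mentioned in the thread, and the sample document, its namespace URIs and the helper split_name are all made up for illustration. ElementTree writes a namespaced name as "{uri}local" and leaves a name with no namespace as the bare local part.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a default namespace, one prefixed attribute and
# one unprefixed attribute on the same element.
DOC = (
    '<order xmlns="urn:example:default" xmlns:a="urn:example:a" '
    'a:unit="meters" status="open"/>'
)


def split_name(name):
    """Split ElementTree's '{uri}local' form into (uri or None, local)."""
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
        return uri, local
    return None, name


root = ET.fromstring(DOC)
element_name = split_name(root.tag)
attribute_names = {split_name(key): value for key, value in root.attrib.items()}

print(element_name)      # the default namespace applies to the element
print(attribute_names)   # only the prefixed attribute carries a namespace
```

The element picks up the default namespace, the prefixed attribute gets the namespace bound to its prefix, and the unprefixed attribute is reported with no namespace at all, which matches the consequence for SAX and DOM implementations described in the message above.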

From cowan at locke.ccil.org Mon Feb 1 15:24:44 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:22 2004
Subject: SAX: Next Round (Lexical Event Handler)
References: <199901202122.QAA00962@megginson.com> <36AA93F3.B98FAE05@jclark.com> <36B21948.5F74CEED@eng.sun.com> <14004.51695.816155.975439@localhost.localdomain>
Message-ID: <36B5C88B.C903EF3D@locke.ccil.org>

David Megginson scripsit:

> Do we need the declarations, or just the boundaries -- or, in other
> words, do we need to provide information about declared but unused
> external parsed entities? Sorry I'm too lazy to puzzle this out from
> the spec right now.

The DOM is silent on the matter of completeness in general. In this case we are told that Entity objects model entities, not ENTITY declarations, so presumably a declared but unused entity would not be modeled.

> > * expose values of defaults so that the DOM can ensure
> > that defaulted attributes always have values;
>
> The parser should take care of this.
>
> > * distinguish attributes which were defaulted from those
> > that were explicitly in the document.
>
> Yes, this is necessary, as a few others have also pointed out
> (grumble, grumble).

Unfortunately these items go together for DOM purposes, something I hadn't noted before. If the application removes an attribute that has a default but didn't use it, it is magically reinserted with the default value. So DOM builders need both the default and the actual value. (This also affects you-know-what.)

> Probably -- the problem is that if we extend Parser then we'll have
> both a setDocumentHandler and a setLexicalDocumentHandler event, and
> that causes some funny problems that I'd rather punt.

Not necessarily. setDocumentHandler could simply check if its argument is an instanceof LexicalDocumentHandler, thus making setLexicalDocumentHandler unnecessary.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Mon Feb 1 15:51:21 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:22 2004
Subject: Namespace prefixes in attribute declarations
References: <01BE4DF6.184018E0@grappa.ito.tu-darmstadt.de>
Message-ID: <36B5CED9.DD3AEBBF@locke.ccil.org>

Ronald Bourret wrote:

> a:my_attribute CDATA #IMPLIED>
>
> What does it mean and why is the prefix legal for the attribute? DTDs have
> no way to declare global attributes [...]

Sure they do. There's just no way to declare them once and for all: the declaration must be repeated for every element, as here. Remember that "global attribute" does not mean "universally applicable attribute" but rather "attribute with (by convention) a universal meaning."

> The only thing I can think of is so global attributes can be declared in
> and validated against XML 1.0 DTDs by declaring them on each element.

Just so.
One might have a global attribute such as "iso4217:currency" that is applicable only to a few elements in a particular DTD, those which represent money amounts. But its meaning would be universal: an ISO 4217 currency code.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From Mark.Birbeck at iedigital.net Mon Feb 1 16:57:46 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:22 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
Message-ID:

At 31 January 1999 20:33, W. Eliot Kimber wrote:
> [In response to Mark Birbeck]
> But you've not solved my problem, because the in-memory
> abstraction of the
> *document* is still:
>
> (xml-document
>   (data-instances
>     (element
>       (gi "person")
>       (content
>         (element
>           (gi "name")
>           (content
>             (literal "Eliot")))))))
>
> So, while the result is closer to the abstraction of the
> data, it is still
> not the original abstraction.

True. But I could have said:

if that was a better model of the internal data. Maybe I've missed the subtlety of what you are saying the problem is, but in our system the attributes of an object are exported as described above, and the children of an object are exported as elements within other elements. Seems to me to mirror exactly our object structure - and so far we have been able to re-interpret DTDs back as data definitions. In other words, we *can* generalise the solution.

> And note that even for an early-bound form, there are still
> infinitely many
> ways to construct it

I still don't follow your logic - just because there are many ways to construct it, doesn't mean you can't construct it.

> So no matter how you slice it, there will always be a
> disjoint between the
> abstraction of the serialization form and the abstraction of the data
> objects being serialized, which means that a query onto the
> abstraction of
> the serialization will not be the same as a query onto the
> abstraction of
> the data that has been serialized. The gap might be bigger or
> smaller, but
> there will always be a gap.

Sure. But I still have two issues. First, why would you query the serialisation anyway? Wouldn't you want to query your original database and generate XML pages that reflect the results? Even if you have serialised the data to XML files to speed up the movement of data, you would still want to do searches against the original data. (The nice thing about that - as a little aside - is that you create XML pages that are 'results' pages, ready for the user to drill down through, using whatever super-duper, 3D-helmet, speech-activated interface they have access to.)

But second, and I think the main point, I don't understand why you are distinguishing between the XML representation of an object and its serialised form in the way you do? Why not just serialise and de-serialise between XML and the database? I know you ARE doing that, but the XML you are creating is some sort of 'normalised' representation of the original data. You keep talking of the 'abstract' representation of your data, but actually you are *losing* the abstraction, moving from:

a person who has the name Eliot

to

an object which contains another object which has two properties, one set to name and the other set to Eliot

Of course both are abstractions, but they model completely different things (data and people).
And modelling the data rather than the person means you can no longer interchange your XML with other systems because you have two completely different sets of data, using different DTDs. (And you can't say that your serialisation schema *will* allow this interchange, because although your serialised data may be well-formed, the underlying data it represents may not be, so you need the proper DTD for the object.)

> Which begs the question: if the abstraction of the document is not the
> abstraction of the data, why bother to create and store the
> abstraction of
> the document when you can just as easily create and store the
> abstraction
> of the data?

All I am saying is that the document *itself* could be the abstraction of the data.

Anyway ... if I've missed the plot then I look forward to your clarification, since we are dealing with similar issues here.

Regards,

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From larsga at ifi.uio.no Mon Feb 1 17:12:16 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun 7 17:08:22 2004
Subject: XSL Draft?
In-Reply-To: <01BE4E15.32F4B410.paagarwal@hss.hns.com>
References: <01BE4E15.32F4B410.paagarwal@hss.hns.com>
Message-ID:

* parul agarwal
|
| I am new to XML developers list. I was looking at XSL. Does any one
| have an idea on its status?

It's a Working Draft, and last I heard it was supposed to become a recommendation (that is, be finalized) this summer.

| It was proposed more than a year back.
Sure, but it wasn't expected to become a final recommendation until much later. The public drafts have been put out so that people can comment on them, and developing a standard like that takes its time.

| Are there existing stylers in the market?

Sure:

However, any stylesheets you write may well have to change to reflect updates in the spec.

--Lars M.

From ricko at allette.com.au Mon Feb 1 17:16:42 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:08:22 2004
Subject: What is a good database for very large collections? (was Re: XSL/ECMAscript (was RE: Frontier as a scalable XML repository (was Re: Is XML dead already or what?))
Message-ID: <007a01be4e06$da6cb440$60f96d8c@NT.JELLIFFE.COM.AU>

From: Simon St.Laurent

>This thread(s) has proven more capable of shifting subjects than any I've
>seen in a while ...

Can I try to shift it back to a vital question asked earlier, but not answered?

What is a good database for XML?

The criteria are:
* over 20,000,000 document fragments, each less than 256 characters, each with some flat metadata, able to be incrementally reloaded onto the live system
* about 30 simultaneous users accessing about 10 fragments a minute each, grouped together (along with other dynamic data) and transformed, with a high need for immediate response
* constant data-mining tools using various ad hoc AI and linguistic retrieval software augmenting the metadata in the background.

Rick

From James.Anderson at mecomnet.de Mon Feb 1 17:30:47 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:22 2004
Subject: Another errata?
References: <01BE4DF5.095E8BF0@grappa.ito.tu-darmstadt.de>
Message-ID: <36B5E5DD.6F94FE4E@mecomnet.de>

Up until the remark quoted below appeared, I had taken the namespace spec 'with grain of salt', and simply presumed matters would clear up eventually. If matters continue in this direction, however, there is a

Ronald Bourret wrote:
>
> Tim Bray wrote:
>
> > I repeat: in the sense the spec uses the word namespace, an unprefixed
> > attribute is NOT IN ANY NAMESPACE.
>
> I'm happy to live with this interpretation -- it's just that it comes as a
> complete surprise to me (and apparently to others as well). In this
> respect, how anybody can read A.2 and determine that prefixed attributes
> belong to a namespace and unprefixed attributes do not belong to a
> namespace is beyond me.

While I could live with the assertion, I would, unfortunately, be unable to write useful software which conformed to it. If an "unprefixed attribute name" is really not in any namespace, then it would be impossible for application code to execute an affirmative comparison against the name, and it would be, for similar reasons, impossible to write xsl patterns which addressed the attribute.

Are these consequences really intended? I would be very surprised if they were. An unqualified attribute name may be in a namespace with a unique structure, or in one which has a unique name form, but it should be in some namespace. Otherwise it's not possible to refer to an identifier more than once.
I suggest that one take the spec at its word and propose that unqualified attribute names are in exactly the namespace which the spec describes, that is, a namespace which has a two-part name: the element identifier's URI and the element identifier's local part. This is straightforward. I can even imagine why one might want to do it.

One alternative, that they are in the so-called "null" namespace, would be workable, but it contradicts much of the exposition in the spec (see below for a qualification to this). Another alternative, that they are not in any namespace, means that a name cannot be repeated, which has very limited utility for something intended to be an encoding mechanism.

>
> One very important consequence of this interpretation is that
> namespace-aware applications need to be sure they don't look for
> namespace-prefixed local attribute names and namespace-aware SAX and DOM
> implementations need to be careful that the namespace name passed for local
> attributes is null.

Since we've gotten this far, we should also be clear that a namespace with a null name is not identical to a null namespace. The "grain of salt" referred to above is that I had been presuming that the spec meant the former where the latter appears. Perhaps someone can suggest another interpretation which makes sense.

> ...

From Rudyard.Merriam at COMPAQ.com Mon Feb 1 17:38:31 1999
From: Rudyard.Merriam at COMPAQ.com (Merriam, Rudyard)
Date: Mon Jun 7 17:08:22 2004
Subject: SNMP
Message-ID:

Does anyone know of an XML representation for SNMP MIBs?

Rud Merriam KD5DTV
281-514-3252
rudyard.merriam@compaq.com

From jborden at mediaone.net Mon Feb 1 17:43:03 1999
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:08:22 2004
Subject: What is a good database for very large collections?
In-Reply-To: <007a01be4e06$da6cb440$60f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <001c01be4e09$0955e950$d3228018@jabr.ne.mediaone.net>

>
> Can I try to shift it back to a vital question asked earlier, but not
> answered?
>
> What is a good database for XML?
>
> The criteria are:
> * over 20,000,000 document fragments, each less than 256
> characters, each with some flat metadata, able to be incrementally
> reloaded onto the live system
> * about 30 simultaneous users accessing about 10 fragments a minute
> each, grouped together (along with other dynamic data) and transformed,
> with a high need for immediate response

How are the fragments selected? By query? If you can easily represent the 20M fragments in tabular form, and if you can easily represent the queries in SQL, then a relational DB is the way to go. This is not a particularly large, nor high-volume, application for an RDBMS.

Ought you store the 20M fragments each in its own file? Probably not (a big waste). Ought you employ an ODBMS? Not unless SQL wouldn't work well (you could always load it into, say, Oracle/SQL Server/DB2 etc. vs. ODI/Poet etc. and test it out). My expectation would be that if you need to run queries, the RDB will win.

> * constant data-mining tools using various ad hoc AI and linguistic
> retrieval software augmenting the metadata in the background.
>
> Rick

Jonathan Borden
http://jabr.ne.mediaone.net

From simonstl at simonstl.com Mon Feb 1 17:43:18 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:22 2004
Subject: What is a good database for very large collections? (was ...)
In-Reply-To: <007a01be4e06$da6cb440$60f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <199902011739.MAA10416@hesketh.net>

(I'm responding in part to reduce the length of our crazy subject header)

At 04:18 AM 2/2/99 +1100, Rick Jelliffe wrote:
>Can I try to shift it back to a vital question asked earlier, but not
>answered?
>
>What is a good database for XML?
>
>The criteria are:
> * over 20,000,000 document fragments, each less than 256
>characters, each with some flat metadata, able to be incrementally
>reloaded onto the live system
> * about 30 simultaneous users accessing about 10 fragments a minute
>each, grouped together (along with other dynamic data) and transformed,
>with a high need for immediate response
> * constant data-mining tools using various ad hoc AI and linguistic
>retrieval software augmenting the metadata in the background.

Wow! That's quite a set of criteria, and looks almost nothing at all like my criteria, which are more like:

 * over 20,000 document fragments, ranging in length from 1 to 100,000 characters, all with some metadata, which will remain on the system in mostly stable form.
 * about 5 simultaneous authors, up to maybe a thousand people reading the information.
 * indexing and searching moving around in the background.

Given these wildly different criteria (and I'm sure others out there have different ideas as well), the concept of a database for XML seems pretty weird. Maybe we should focus on tools for getting information into and out of a repository, and let vendors create different back ends to match our widely differing needs. That way we can still share tools, and read each other's material, but aren't locked into a particular vendor whose approach won't work for everyone.

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From James.Anderson at mecomnet.de Mon Feb 1 17:55:01 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:22 2004
Subject: Compound Documents - necessary for success?
References: <01BE4DDF.134A6140@grappa.ito.tu-darmstadt.de>
Message-ID: <36B5EB8E.972F351D@mecomnet.de>

Ronald Bourret wrote:
> ...
>
> 2) When we wrote the new DTD, a *human* made the decision about where
> was legal. Anybody figuring out a foolproof way for a machine to
> do this usefully -- that is, without defining the content model of all
> elements as ANY -- will probably get a Turing Award for AI.
>

Hmm, I've not been following this discussion, but if one were to first treat the model as ANY, in order to be able to represent the domain, and then to examine the asserted elements, couldn't this be modeled as a straightforward learning problem?

> ...
> I personally think that anything more utopian than this is going to
> require, at the very least, a new definition of validity.
> One such definition was that proposed in this thread: that each
> subdocument is validated under its own DTD and the overall document is
> not validated but merely checked for well-formedness.

Which would require nothing more complicated in the encoding than an attribute to enable/disable validation on an element basis:

validation="none"
validation="content attributes", "content", "attributes"
validation="element"

From tyler at infinet.com Mon Feb 1 18:02:08 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:22 2004
Subject: Interesting Monday.
References: <5F052F2A01FBD11184F00008C7A4A80001136ADA@eukbant101.ericsson.se>
Message-ID: <36B5EAE7.54EC8E70@infinet.com>

"Matthew Sergeant (EML)" wrote:

> Boy, you people sure can write when something stirs you up... It's 10:10am
> and I've only just got through my backlog of XML-Dev mail...
>
> Well, as the person who introduced the topic "Will XML eat the web?", I feel
> I should just add some points of note. I thank everyone who has contributed
> to this topic though.
>
> Firstly, I think there is still an issue with processing power and XML,
> although I can see that my system is poorly designed. Time for a rethink...
> The area where I can foresee potential problems is in e-commerce. Take an
> e-commerce transaction processing company that's moved to an XML transaction
> format. They don't have a shop web site, they just process credit card
> transactions for other sites. I imagine they are going to need to process
> hundreds of transactions per second.
> I don't for a second suggest that they
> store the XML as the primary data format (store it as a backup as suggested
> here) - it should immediately be put into an RDBMS. But to do that they have
> to parse each transaction. There's no caching that can go on here.

There seems to be a major misconception here, I think, in terms of what software needs to do for businesses. The issue for XML, I think, should be scalability, not just raw speed. What use is the XML/XSL architecture if it costs mucho dinero in development dollars for fly-by-night consultants, overpriced databases, and application servers? Maybe some giant bank has money to burn, but the average web-shop, if it is even profitable, is running its business on very thin margins. Raw performance of using XML is not what should be the focus here; simplicity should be the focus.

Using XML for backup or log files is not a bad use of XML. In fact it is a very simple use of XML to do some very simple things. What major benefit is XML, other than through some document API like the DOM, if it is stored in the DBMS? Are you going to have the DBMS construct an XML stream on the fly which then needs to be reparsed into some in-memory data structure like the DOM to do anything useful with it at the server level?

> Luckily that's their problem and not mine .
>
> My problem was slightly different. I needed to be ready for the 5.0 browsers
> (probably IE5, although I'd prefer NS5), and XML seemed ideal because we
> would be displaying/editing documents that look like data (or data that
> looks like a document if you like). We really needed an object database, but
> I needed to get moving quickly (a typical web project: "Can we have it
> yesterday"). Learning an object database wasn't a possibility. I already
> knew XML. So I looked at it like this - we could have it 2 ways:
>
> 1) Store XML now, process into HTML now, Transmit XML in the future.

Basically, to do this you will need in effect a flat-file database.
I think a lot of people who will be using XSL will be doing things this way. Just write some XML file by hand that contains your data, apply a stylesheet, and voila! you have HTML. Transmitting XML does no real good unless it is in some document format the browser understands. Even then you will probably want to use XSL to spit content out into this presentation format. This can be done at the server level or else in the web browser.

> 2) Store in RDBMS now, process into HTML now, process into XML in the
> future.
>
> #1 looked like a nicer solution because it gives performance gains in the
> future, which #2 doesn't really (except perhaps XML is a lighter weight
> format to transmit than HTML). However this, it appears, is not the right
> way to go because RDBMS->*ML is always faster than XML->HTML. That's a
> lesson learned, and I thank you for it.

Of course this is under the assumption that the browser has XSL capabilities and that they are compliant (MS XSL in IE5, for instance, is not exactly what you would call compliant XSL at the moment).

> Some of the points about caching are great when you're reading 1 XML file
> multiple times, but we're talking about 400 - 1000 XML files being accessed
> and constantly changed. A nicer solution would be an OODB. It's probably
> time to go shopping...

Of course. My previous comments were under the assumption that the XML content was pretty much static. Nevertheless, using an OODB will not necessarily give any benefits over an RDBMS. It all depends on your particular business problem. I will leave the fight over "whose database is better" to the DBMS vendors.

Tyler

From tbray at textuality.com Mon Feb 1 18:04:50 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:22 2004
Subject: What is a good database for very large collections? (was Re: XSL/ECMAscript (was RE: Frontier as a scalable XML repository (was Re: Is XML dead already or what?))
Message-ID: <3.0.32.19990201094706.00bab450@pop.intergate.bc.ca>

At 04:18 AM 2/2/99 +1100, Rick Jelliffe wrote:
>What is a good database for XML?
>
>The criteria are:
> * over 20,000,000 document fragments, each less than 256
>characters, each with some flat metadata, able to be incrementally
>reloaded onto the live system
> * about 30 simultaneous users accessing about 10 fragments a minute
>each, grouped together (along with other dynamic data) and transformed,
>with a high need for immediate response
> * constant data-mining tools using various ad hoc AI and linguistic
>retrieval software augmenting the metadata in the background.

For little fragments like this, you can't possibly (it seems to me) have all that much internal structure, because there's nowhere to put it. Given this, my intuition would be to stuff these puppies into whichever of Oracle/Sybase/Informix was the best fit for my existing installation.

-T.

From tbray at textuality.com Mon Feb 1 18:05:15 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:23 2004
Subject: Another errata?
Message-ID: <3.0.32.19990201100346.00b5d4a0@pop.intergate.bc.ca>

At 06:35 PM 2/1/99 +0100, james anderson wrote:
>While I could live with the assertion, I would, unfortunately, be unable to
>write useful software which conformed to it. If an "unprefixed attribute name"
>is really not in any namespace, then it would be impossible for application
>code to execute an affirmative comparison against the name, and it would be,
>for similar reasons, impossible to write xsl patterns which addressed the
>attribute. Are these consequences really intended?

The namespace spec gets more over-interpretation than any document this side of the Old Testament. It seems simple to me.

1. There are abstract things called "namespaces", each identified by some URI
2. There is a syntactic mechanism, involving reserved attributes, placeholder tokens, and colon-delimited prefixes
3. The mechanism can be used to map namespaces directly to element types
4. The mechanism can be used to map namespaces directly to attribute names
5. There is a defaulting mechanism to map namespaces to unprefixed element types, purely syntactic sugar

The spec says *nothing* normative about the relationship between namespaces, as defined in #1, and an unprefixed attribute. Appendix A provides a scheme which can be used to create a unique identifier for each element type & attr name in a document, using in part the namespace information.

If it's any consolation, this very issue was more or less what tied up the WG for months and months.
What really crystallizes it is the question, given this: what can you say about the namespace of the href= attribute?

The most common and reasonable-sounding answer is, "it's in the namespace of the html:a element". OK, but what does that mean, formally? It turns out to be hard to write down. You could decree that it's in the HTML namespace. A consequence of this is that is identical to which it might be in some circumstances; but always, for all element types and all namespaces? Tough to buy into.

Anyhow, the final consensus was that all you can say about the href= above is that it's attached to an element that's in the html namespace. For the purposes of many applications, that might be equivalent to being in the html namespace; but that interpretation isn't compulsory. For places where you want it to be compulsory, go on and prefix the attributes.

In my first few implementations of namespace-savvy software, I've had no trouble finding the attributes I needed. What's a scenario that causes problems? -Tim

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cowan at locke.ccil.org Mon Feb 1 18:20:47 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:23 2004
Subject: Another errata?
References: <3.0.32.19990201100346.00b5d4a0@pop.intergate.bc.ca>
Message-ID: <36B5F1D2.13073B73@locke.ccil.org>

Tim Bray scripsit:

> A consequence of this is that
>
> is identical to
>
>
> which it might be in some circumstances; but always, for all element
> types and all namespaces? Tough to buy into.

Why is it so tough? If the committee decreed it, fine; but I'd like to see the argument that says the above is implausible.
I find it eminently plausible.

> In my first few implementations of namespace-savvy software, I've had
> no trouble finding the attributes I needed. What's a scenario that
> causes problems? -Tim

The trouble is, I think, that people would like to "intern" (uniquify) every name in an XML document while parsing it. Element names can be converted to URI^Qname, where ^ is some separator, and interned as such; global attribute names likewise. But it won't do to identify all foo attributes (unqualified) across the entire document, as they may be utterly unrelated.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Mon Feb 1 18:53:54 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:23 2004
Subject: NamespaceFilter and ParserFilter updates
Message-ID: <36B5F999.FA852119@locke.ccil.org>

NamespaceFilter is now updated to handle unprefixed attributes in accordance with the xml-names REC rather than by the old WD formulation. ParserFilter also has a few bugs fixed.

The MDSAX team should particularly note these changes.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From DuCharmR at moodys.com Mon Feb 1 19:36:35 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun 7 17:08:23 2004
Subject: Real world DTD
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120C69@MDYNYCMSX1>

>Or one step up in size, DocBook 3.1 in XML:

Discussions on the Davenport list about the possibility of a "DocBook Lite" have ended up concluding that different subsets of it work best for different people, so there's no single "DocBook Lite" out there. Apparently, making one's own customized subset of DocBook is pretty common, so keep that in mind if the full DTD is more than you need. Its modular design makes this kind of subsetting easier.

I've used my own "DocBook Very Lite" for examples in writing about SGML, but at 24 element declarations, it's a bit too light for a lot of applications. I use it for taking notes and things; I wrote a Perl script that converts Emacs outline format to it so that I can use Norm W's DSSSL style sheets to make nice RTF out of my Emacs outlines. If anyone's interested, see "ol2dbvl" on http://www.snee.com/bob/sgml.html.

Bob DuCharme www.snee.com/bob
see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall.

From andrewl at microsoft.com Mon Feb 1 19:41:32 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:08:23 2004
Subject: Another errata?
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAEED6@RED-MSG-08>

The example is identical to used in earlier mails to this list is less than ideal for the purpose of explaining namespaces on attributes. Let's back up.

The A element in HTML is defined in http://www.w3.org/TR/REC-html40/ as

It happens that the href attribute in HTML is also defined for a few other element types, for example LINK and BASE, and for these elements it is also CDATA with essentially the same rules and meaning. This might lead one to think that the HREF attribute of an A tag is the same attribute type as the HREF attribute of the LINK tag. One might even go so far as to imagine that both HREF attributes are defined in some global scope, at the same level as elements, in other words, in the "HTML" namespace. The English text of the specification clearly implies that these attributes are to be seen as essentially the same.

But the DTD does not say this. The English text and our experience using HTML give us information not written in the DTD. DTDs today do not have any facility to express that two attributes with the same name, used within different element types, are the same in type or meaning. Within the expressive power of DTD, all we can observe is that the A tag has an attribute named HREF with certain properties.

That is, there is an "href" attribute used within the "A" tag. There is also an "href" attribute used within the "LINK" tag. There is also an "href" attribute used within the "BASE" tag.
And so forth. They happen to have the same properties. We know that they have the same meaning. But a DTD could equally well have defined an "href" attribute for a "TABLE" element that was a NUMBER and had no relation to the "href" of an "A" element. The DTD can only go so far as to say that an "href" used on an "A" element has certain properties, and that there is also an "href" on a "TABLE" element that has certain other properties.

In other words, the definition of an attribute, within the expressive powers of DTD, is relative to its containing element. An unqualified attribute is identified by the triple consisting of the namespace of the element, the name of the element, and the name of the attribute. (See appendix A.3 of the namespaces spec.)

This is a long build-up to the conclusion that an unqualified attribute cannot be presumed to come from the same namespace as its element, without qualification. Rather, the element provides a local namespace for its attributes.

So is it true that is identical to ? There is nothing in a DTD that would allow us to decide that they are identical. They are no more the same than if the HTML DTD defined another element called "ANCHOR" and the English spec said that it has exactly the same meaning as the "A" element. We might, reading the English spec, know that they are the same. But qua DTD, they are different.

The namespaces specification is designed to support the current practice of DTDs, and so defines unqualified attributes in this way, while leaving the door open to future forms of schema that express richer and more-global attribute scopes.

From b.laforge at jxml.com Mon Feb 1 21:05:22 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:08:23 2004
Subject: NamespaceFilter and ParserFilter updates
Message-ID: <002101be4e25$f2695b60$c9a8a8c0@thing2>

>The MDSAX team should particularly note these changes.

Thanks John. We have your namespace and inheritance filters working with MDSAX now, and will be sure to include your updates in the next release.

FYI, here's the configuration document we are using to test inheritance.

Bill

From b.laforge at jxml.com Mon Feb 1 21:54:10 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:08:23 2004
Subject: NamespaceFilter and ParserFilter updates
Message-ID: <000a01be4e2c$ce497240$c9a8a8c0@thing2>

John,

Your new namespace filter looks good, though I must admit I'm still finding the output a bit strange.

Thanks to Paul for integrating namespace and MDSAX.

Bill

Before:

After:

From deke at tallent.com Mon Feb 1 22:10:49 1999
From: deke at tallent.com (Deke Smith)
Date: Mon Jun 7 17:08:23 2004
Subject: Frontier as a scalable XML repository (was Re: Is XML dead already or what?)
Message-ID: <1294218445-103751506@server2.tallent.com>

As someone who uses Frontier with XML I will try to make a brief outline of my experiences (I can get a little verbose sometimes).

Frontier is a good example of how some people use XML in a PRODUCTION ENVIRONMENT. All the users of the current version use XML whether they know it or not. As part of a subscription to Frontier, the user is provided constant updates for bug fixes and new features. Those updates are provided via HTTP using XML. The documentation can be downloaded and referenced from within the program and that, too, is updated the same way. Both of these uses of XML are invisible to the end user...

I think many people who are currently using Frontier as a processor for XML would agree with this:

Tyler Baker, tyler@infinet.com said on 2/1/99 11:56 AM:

>There seems to be a major misconception here I think in terms of what
>software
>needs to do for businesses. The issue for XML I think should be
>scalability, not
>just raw speed. What use is the XML/XSL architecture if it costs mucho
>dinero
>in development dollars for fly-by-night consultants, overpriced databases,
>and
>application servers. Maybe some giant bank has money to burn, but the
>average
>web-shop if they are even profitable is running their business on very thin
>margins. Raw performance of using XML is not what should be the focus here,
>simplicity should be the focus.

In a production environment where expediency is important as well as flexibility and reusability, *TODAY* XSL is not the answer. If I am not mistaken, XSL is still in working draft. I have based work I have done on a working draft before and had to redo it later to be reusable. There is a need, and tools like the Xmltr Suite have filled the vacuum in the Frontier environment.

I have tried to see how much XML Frontier can munch on and used the religion XML files as a stress test. Last time I tried (three months ago) Frontier could not parse that quantity of text at one time. It coughed a hair ball about a quarter of the way through it.

As far as being a repository for terabytes of information... that is an interesting question and one that sounds intriguing enough to try. Parsing huge files may be a problem, as I mentioned before. Although most of Frontier's information is stored in the main database of the program called the 'root', 'guest' databases can be created as needed. Unfortunately I don't have a terabyte disk drive, nor do I think I can scrounge up a terabyte worth of information on my own to try.

I search for XML data once it is inside of Frontier by creating indices. Right now, indices are pretty much a roll-your-own proposition.

I have a database of approximately 1,000 contacts from around the world in an XML file. That way, it can be edited by volunteers who have no computer experience on Mac, Windows, Unix, BeOS, etc. I can bring this data into Frontier and create a "Yahoo-style" directory from the information with about a hundred pages. I also can replicate the directory in Spanish and German using XML-based dictionaries to translate geographic names and interface elements. After creating my scripts, it takes me about thirty minutes to bring the revised listing into Frontier and get it started. A couple of hours later it is done and the Website online has been updated via FTP.

Deke

-----------------------------------------------------------------
Deke Smith
Tallent Communications Group, Brentwood TN
deke@tallent.com, 615-661-9878

"Cats are smarter than dogs. You can't get eight cats to pull a sled through snow." - Jeff Valdez
-----------------------------------------------------------------

From jamesr at steptwo.com.au Mon Feb 1 23:17:05 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun 7 17:08:23 2004
Subject: Interesting Monday.
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A80001136ADA@eukbant101.ericsson.se>
Message-ID: <4.1.19990202100959.00c80910@steptwo.com.au>

At 20:29 1/02/1999 , Matthew Sergeant (EML) wrote:

| My problem was slightly different. I needed to be ready for the 5.0 browsers
| (probably IE5, although I'd prefer NS5), and XML seemed ideal because we
| would be displaying/editing documents that look like data (or data that
| looks like a document if you like). We really needed an object database, but
| I needed to get moving quickly (a typical web project: "Can we have it
| yesterday"). Learning an object database wasn't a possibility. I already
| knew XML. So I looked at it like this - we could have it 2 ways:
|
| 1) Store XML now, process into HTML now, Transmit XML in the future.
|
| 2) Store in RDBMS now, process into HTML now, process into XML in the
| future.

I would personally recommend a third option:

3) Store in RDBMS now, process into XML, process this into HTML now. Process the XML into whatever you want in the future.

I have been using this in an electronic publishing system, and while it seems like overkill, it isn't.
It both makes the generation of HTML easier, and inserts a very nice level of abstraction into the whole system.

For example, you can change the structure of the RDBMS without having to worry about the HTML, etc, as long as the XML DTD is still valid.

And if you want to generate paper, online help, etc it is much easier from the XML.

Cheers,
J

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
ACN 081 019 623

From Mark.Birbeck at iedigital.net Mon Feb 1 23:56:12 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:23 2004
Subject: Another errata?
Message-ID:

There seems to be a lot of confusion here. A few interesting points have been made, but I think the defence of the namespaces spec has not been made strongly enough.

To recap: some people are querying as to why the namespaces spec does not allow attributes that do not have a namespace specified to belong to a default namespace. Elements with no namespace can belong to a default one, they argue, so why can't attributes? Others in the discussion have said, well, even if we accept this, why doesn't the attribute at least belong to the same namespace as its element?

There are two confusions here:

- the first relates to default namespaces, and why their application to
  attributes would be a problem, not a help
- the second relates to a misunderstanding as to what role namespaces
  play anyway

Let's look at the default namespace question first.
As Ronald said on this thread - quoting from the namespaces spec - default namespaces do NOT apply to attributes (or not *directly* anyway). To see why this is necessary, imagine for a moment that they did. What would the following document give us (picking up on an example from the namespaces spec)?: Cheaper by the Dozen In this document 'isbn' would be part of the HTML 4.0 namespace - not what is intended at all! Now, everyone out there that's moaning - do you really want to have to prefix every attribute with the namespace of its element, to ensure it is not confused with a default namespace? OK then, some have argued, at least shouldn't 'isbn' automatically be part of the 'bk' namespace? Still no, I'm afraid. Every member of a namespace is meant to be unique. If we do this 'auto-joining' we cannot guarantee uniqueness. Let's extend the above example (having now hopefully accepted that 'isbn' is *not* part of HTML 4.0) to include another object which is probably not part of the 'urn:loc.gov:books' namespace, but bear with me: Cheaper by the Dozen A list of loads of books In this case, 'auto-joining' an attribute to its element's namespace would make 'isbn' into a *global* attribute. Handy, if you wanted to process all bk:isbn numbers in a document - but wrong!! In this document we do NOT have two instances of bk:isbn, we have one of bk:book:isbn and one of bk:catalogue:isbn. So after all that, what namespace is it in? Taking on board the points in section A.2, 'isbn' actually occurs twice, once in the per-element type partition for bk:book, and again for bk:catalogue. But now, unlike before, each is guaranteed to be unique, which is what namespaces are for! On to the second issue. I think some of the confusions have arisen by mixing up namespaces with validation. All namespaces do is give you a syntax for ensuring that elements and attributes are unique. 
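To make the points so far concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The document in it is made up for illustration (it only loosely follows the spec's book example, and the element names and isbn value are assumptions, not the examples from this message). It shows how a namespace-aware parser names each item: the prefixed element lands in the 'bk' namespace, the unprefixed child element picks up the default namespace, and the unprefixed attribute ends up in no namespace at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the default namespace is HTML 4.0, and 'bk' is bound
# to the books namespace. The 'isbn' attribute carries no prefix.
DOC = """\
<bk:book xmlns:bk="urn:loc.gov:books"
         xmlns="http://www.w3.org/TR/REC-html40"
         isbn="0-395-36341-6">
  <p>Cheaper by the Dozen</p>
</bk:book>
"""

root = ET.fromstring(DOC)

# ElementTree reports expanded names as "{namespace-uri}local-name".
print(root.tag)           # the prefixed element: books namespace
print(root[0].tag)        # the unprefixed child: default (HTML) namespace
print(list(root.attrib))  # the unprefixed attribute: plain 'isbn', no namespace
```

An application that wants to treat 'isbn' differently per element can still do so, because the attribute is always reached through the element that carries it.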
In our examples above, we don't want to ensure that 'isbn' is unique within the entire document, we just want it to be unique within 'book' and 'catalogue'. But XML already ensures that, so we don't actually need an explicit namespace. A parser already has all the information it needs about the attribute from its context within an element. If an application processing the nodes in the document had to act differently when processing 'isbn' fields that come from inside the company, versus processing the public ones; and further, if 'catalogue' is defined inside our company, but 'book' is public; then, in the following document we already have ALL the information we need: Cheaper by the Dozen A list of loads of books The parser will already give us all the information we need to process the two 'isbn' values differently, even though neither are prefixed with a namespace! We don't need one! So hopefully now you can see why this is unnecessary: y this is OK: y and this is not good practice at all (see Andrew Layman's comments on this, too): y Regards, Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Tue Feb 2 00:07:31 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:08:23 2004 Subject: SAX: Next Round (Lexical Event Handler) References: <199901202122.QAA00962@megginson.com> <36AA93F3.B98FAE05@jclark.com> <36B21948.5F74CEED@eng.sun.com> <14004.51695.816155.975439@localhost.localdomain> Message-ID: <36B64057.205FCFB0@eng.sun.com> David Megginson wrote: > > David Brownell writes: > > > > > I haven't checked, but I think that this gives us everything we need > > > > for DOM level one. > > > > Doesn't quite ... there's some more DTD information needed to: > > > > * ensure that PIs within the DTD (e.g. internal subset) > > don't show up anywhere in the DOM tree (ugh); > > You can determine this using the start/end DTD events and start/end > entity events, I think. Seemed like the start/end DTD event was for the external subset though. Sun's interface works that way, so this can be done given a real API description ... :-) > > * see declarations of external general entities; > > Do we need the declarations, or just the boundaries -- or, in other > words, do we need to provide information about declared but unused > external parsed entities? Sorry I'm too lazy to puzzle this out from > the spec right now. Actually I misspoke: It's not DOM that needs to see the declarations, it's the namespace spec which places a constraint that entity names be colon-free. (As noted, I was assuming namespaces should be layerable.) > > * expose values of defaults so that the DOM can ensure > > that defaulted attributes always have values; > > The parser should take care of this. 
Only if DOM and the parser are joined at the hips -- since when you remove a defaultable attribute from a DOM element, the DOM must then restore that value to its default value. (John Cowan also noted this.) > > * distinguish attributes which were defaulted from those > > that were explicitly in the document. > > Yes, this is necessary, as a few others have also pointed out > (grumble, grumble). > > > (In addition the above, if XML namespaces are to be layerable over > > a normal XML 1.0 parser, declarations of all other entities need to > > be exposed so they can be examined for conformance: they must not > > contain colons!) > > This is probably overkill for SAX -- if someone wants to layer > namespaces on top of SAX, they'll have to miss this one. ... or add new interfaces! > > > I wonder whether LexicalHandler ought to extend DocumentHandler. The > > > events it reports are synchronous with the events reported by > > > DocumentHandler. It seems to me that applications are always going to > > > want to implement either DocumentHandler or both DocumentHandler and > > > LexicalHandler. > > Probably -- the problem is that if we extend Parser then we'll have > both a setDocumentHandler and a setLexicalDocumentHandler event, and > that causes some funny problems that I'd rather punt. What I did is effectively group the two: if you set one, you set the other. One can always argue purity of essence, but that seemed to be the most useful choice for applications. - Dave > All the best, > > David > > -- > David Megginson david@megginson.com > http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tbray at textuality.com Tue Feb 2 00:47:38 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:23 2004
Subject: Since we're talking about databases...
Message-ID: <3.0.32.19990201164621.00a5e360@pop.intergate.bc.ca>

One reason that I've been particularly interested in these threads is that I've been doing a bunch of work for a German company named Software AG. Those who have some grey hairs will remember a database package named "ADABAS" that used to be hot stuff in pre-relational days (still is pretty hot stuff, but marginalized due to being non-relational).

Anyhow, they've been working on a REALLY BIG REALLY FAST xml data store, and have actually started talking a little bit about this in public: check out

http://www.softwareag.com/corporat/press/jan99/e-cebit.htm

Pretty thin information as yet, but people who are interested in wrangling XML en masse should keep an eye on these guys. -Tim

From harvey at eccnet.eccnet.com Tue Feb 2 01:14:28 1999
From: harvey at eccnet.eccnet.com (Betty L. Harvey)
Date: Mon Jun 7 17:08:23 2004
Subject: Since we're talking about databases...
In-Reply-To: <3.0.32.19990201164621.00a5e360@pop.intergate.bc.ca> Message-ID: On Mon, 1 Feb 1999, Tim Bray wrote: > One reason that I've been particularly interested in these threads is that > I've been doing a bunch of work for a German company named Software AG. > Those who have some grey hairs will remember a database package named > "ADABAS" that used to be hot stuff in pre-relational days (still is > pretty hot stuff, but marginalized due to being non-relational). I don't remember ADABAS but I remember a database called S2K (which was also pre-relational). S2K was a hierarchical database. I worked with S2K a few years ago (many of the individuals on this list were probably still in diapers). I have often thought that a database like S2K with XML would be pretty powerful We ran a Personnel and Accounting Control System (PACS) that managed all the H.R. and Accounting information for 10,000 employees. Back then (in the dark ages) XML wasn't invented yet. I haven't heard of S2K applications in years but I have often thought that the hierarchical database model (one that isn't dependent on COBOL |-) and XML would be a pretty powerful team and the right tool for the job. I look forward to hearing more about ADABAS. Betty /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/ Betty Harvey | Phone: 301-540-8251 FAX: 4268 Electronic Commerce Connection, Inc. | 13017 Wisteria Drive, P.O. Box 333 | Germantown, Md. 20874 | harvey@eccnet.eccnet.com | Washington,DC SGML/XML Users Grp URL: http://www.eccnet.com | http://www.eccnet.com/sgmlug/ /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From clark.evans at manhattanproject.com Tue Feb 2 01:33:27 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:23 2004
Subject: Since we're talking about databases...
References: <3.0.32.19990201164621.00a5e360@pop.intergate.bc.ca>
Message-ID: <36B65517.8FAE4EC5@manhattanproject.com>

I'm rather new to the list. I was wondering if there is any current (open source?) work for storing XML in relational databases. I was thinking of using PostgreSQL.

Thoughts: (I've read almost everything on Robin Cover's site)

a) I've been reading everything I can on XML and see a big opportunity to have a standard based upon formal systems. The forest automata papers are especially exciting, now I regret goofing-off during computational complexity... Anyway, I'd love to see/learn more on this aspect of XML before I jump head first into coding. Any more pointers?

b) The XML Architectures approach seems to be the right way to go from my understanding of the readings, thus any database would have to support this. Is this the consensus or are there varied opinions?

c) GROVES look pretty cool. Is there anyone working on an xml property set? Would it be smart to implement an xml database upon the groves abstraction, or would a direct (DOM) based implementation be better?

d) How hard would it be to develop a transition/rewrite engine (say by modifying JADE) to work using a database store instead of memory? Thoughts? Any work in this area?

e) Are there any thoughts on an XML transport layer (instead of using CORBA), based upon SMTP and POP3 or equivalent?
I was thinking more along the lines of LISTSERV, where message logistics are handled in a smart way.

f) Has anyone proposed a standard SQL 92 RDBMS mapping?

:) Clark Evans

From jes at kuantech.com Tue Feb 2 01:50:24 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:08:23 2004
Subject: Since we're talking about databases...
In-Reply-To: <36B65517.8FAE4EC5@manhattanproject.com>
Message-ID: <000601be4e4e$003a7800$5118a8c0@kuantech1.quokka.com>

There is work on XML-based transport using HTTP: the WebBroker NOTE from W3C.

As far as XML and databases are concerned, I believe Oracle has integrated an XML parser into their RDBMS product. I don't know any more details than that.

Jeff Sussna

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Clark Evans
Sent: Monday, February 01, 1999 5:30 PM
To: xml-dev@ic.ac.uk
Subject: Re: Since we're talking about databases...

I'm rather new to the list. I was wondering if there is any current (open source?) work for storing XML in relational databases. I was thinking of using PostgreSQL.

Thoughts: (I've read almost everything on Robin Cover's site)

a) I've been reading everything I can on XML and see a big opportunity to have a standard based upon formal systems. The forest automata papers are especially exciting, now I regret goofing-off during computational complexity... Anyway, I'd love to see/learn more on this aspect of XML before I jump head first into coding. Any more pointers?
b) The XML Architectures approach seems to be the right way to go from my understanding of the readings, thus any database would have to support this. Is this the consensus or are there varied opinions?

c) GROVES look pretty cool. Is there anyone working on an xml property set? Would it be smart to implement an xml database upon the groves abstraction, or would a direct (DOM) based implementation be better?

d) How hard would it be to develop a transition/rewrite engine (say by modifying JADE) to work using a database store instead of memory? Thoughts? Any work in this area?

e) Are there any thoughts on an XML transport layer (instead of using CORBA), based upon SMTP and POP3 or equivalent? I was thinking more along the lines of LISTSERV, where message logistics are handled in a smart way.

f) Has anyone proposed a standard SQL 92 RDBMS mapping?

:) Clark Evans

From eliot at dns.isogen.com Tue Feb 2 04:30:55 1999
From: eliot at dns.isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:08:23 2004
Subject: Since we're talking about databases...
In-Reply-To: <36B65517.8FAE4EC5@manhattanproject.com>
References: <3.0.32.19990201164621.00a5e360@pop.intergate.bc.ca>
Message-ID: <3.0.5.32.19990201214853.009c9b10@amati.techno.com>

At 01:29 AM 2/2/99 +0000, Clark Evans wrote:
>I'm rather new to the list. I was wondering if there is any
>current (open source?) work for storing XML in relational
>databases. I was thinking of using PostgreSQL.

>c) GROVES look pretty cool. Is there anyone working
>on a xml property set? Would it be smart to implement
>an xml database upon the groves abstraction, or would
>a direct (DOM) based implementation be better?

David Megginson's Work Group is charged with developing the official abstract data model for XML--but I doubt it will be defined as a property set using the PSDR approach. Defining a grove plan over the SGML property set that reflects what XML gives you is pretty trivial and adding some properties for XML-specific stuff (dare I say it, namespaces?) wouldn't be too hard.

TechnoTeacher (www.techno.com) is pursuing large-scale grove-based systems. We've certainly talked about relational vs object approaches. They'll probably have to do both eventually.

I'm in the process of rewriting my PHyLIS grove manager/HyTime engine in Python. As there's an RDBMS package for Python, I'm thinking about trying to put it under PHyLIS as a persistent grove store. Not sure how it will work, but it will be easy to experiment with.

>d) How hard would it be to develop a transition/rewrite
>engine (say by modifying JADE) to work using a database
>store instead of memory? Thoughts? Any work in
>this area?

Shouldn't be too hard. One way to do it is to simply put a grove API over a database, so that the grove is really just a bunch of indirections to the real data. This is the way I've designed my GroveNode class for PHyLIS, to enable this sort of grove binding. There's probably a more efficient way to do this from a CS standpoint, but it seemed pretty easy to implement.
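The "bunch of indirections" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PHyLIS GroveNode class: the LazyNode name, the single-table 'nodes' schema and the use of the standard sqlite3 module are all made up for the example. Each node object holds nothing but a row id and asks the database for its properties every time they are read.

```python
import sqlite3


class LazyNode:
    """A tree node that stores only a row id; every property is fetched
    from the database on access (hypothetical, not the PHyLIS GroveNode)."""

    def __init__(self, conn, node_id):
        self._conn = conn
        self.node_id = node_id

    def _field(self, column):
        # 'column' is always one of the fixed names used by the properties
        # below, never user input, so formatting it into the SQL is safe here.
        row = self._conn.execute(
            f"SELECT {column} FROM nodes WHERE id = ?", (self.node_id,)
        ).fetchone()
        return row[0]

    @property
    def name(self):
        return self._field("name")

    @property
    def text(self):
        return self._field("text")

    @property
    def children(self):
        rows = self._conn.execute(
            "SELECT id FROM nodes WHERE parent = ? ORDER BY pos",
            (self.node_id,),
        )
        return [LazyNode(self._conn, r[0]) for r in rows]


def build_demo_store():
    """In-memory store for <USER><NAME>Ann</NAME><CITY>Oslo</CITY></USER>."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent INTEGER,"
        " pos INTEGER, name TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?, ?, ?)",
        [
            (1, None, 0, "USER", None),
            (2, 1, 0, "NAME", "Ann"),
            (3, 1, 1, "CITY", "Oslo"),
        ],
    )
    return conn


conn = build_demo_store()
root = LazyNode(conn, 1)
summary = [(child.name, child.text) for child in root.children]
print(root.name, summary)
```

Because nothing is cached, a change made directly in the table shows up the next time a node is read. That keeps the sketch simple at the cost of one query per property access, which is the kind of inefficiency the paragraph above alludes to.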
I wouldn't be surprised if Alex Milowski, of Copernican Solutions and Veo Systems, has already done some of this stuff; he put a lot of effort into building a completely grove-based server system.

Cheers,

E.
--
W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com
From CBerry at works.com Tue Feb 2 05:14:23 1999
From: CBerry at works.com (Chris Berry)
Date: Mon Jun 7 17:08:24 2004
Subject: No subject
Message-ID: <958E41703996D21197A200A0C9D4C65672B1@AUS-SERVER4>

Greetings,

I am a relative newbie to XML, so please bear w/ me if this is a poor question...

It looks like everything in the DOM must be dealt w/ in the context of a particular document. It seems you cannot drop one Document into another, or even one DocumentFragment into another Document context.

I have tried to do the following... (using Microsoft's implementation of the DOM)

    IXMLDOMDocument doc = (IXMLDOMDocument) new DOMDocument();
    IDOMDocumentFragment docFrag = doc.createDocumentFragment();
    IDOMElement userElem = doc.createElement("USER");
    userElem.setAttribute("TIMESTAMP", new Variant("123456") );
    docFrag.appendChild( userElem );
    doc.appendChild( docFrag );
    System.out.println( doc.getXml() );
    // >>>> Yields the correct results

    IXMLDOMDocument doc2 = (IXMLDOMDocument) new DOMDocument();
    doc2.appendChild( docFrag );
    System.out.println( doc2.getXml() );
    // >>>> Yields an empty string

It seems like this should work to me. Shouldn't we be able to cut and paste between documents?? Or more important, shouldn't we be able to build up documents compositionally?? I.e. compose Address 1 and Address 2 as DocumentFragments (or their own Documents -- with their own DTDs) and then drop these into, say, User (providing that the User DTD agrees).

I must be missing something here.

Thanks in advance,
Cheers,
-- Chris

Chris Berry cberry@works.com 512-231-1341 works.com 6850 Austin Center Blvd.
Austin, TX 78731

From da at cp.net Tue Feb 2 05:19:13 1999
From: da at cp.net (da)
Date: Mon Jun 7 17:08:24 2004
Subject: How XML Just Might Eat the Web Someday (was Re: Interesting Monday.)
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A80001136ADA@eukbant101.ericsson.se>
Message-ID:

Hi there,

Well, I've been off the list for a while in the middle of switching jobs. But, oddly enough, some opportunities to use XML have come up at my new gig as well. Whether this is more because I've got a hammer and everything looks like a nail or a reflection of the actual usefulness of XML remains to be seen, but I have a feeling it's the latter.

One note regarding that: I've downloaded gecko and it's great, it looks like it's going to do exactly what I want to do with it. Very cool.

I know the focus of this list isn't really interface stuff, but I can't pass a thread with a title like that up without dropping in my $2E-2.

If you go to http://www.vizbang.com/ and click on the one link there that isn't a mailto, you'll find a whitepaper that I wrote about a year ago on 3D representations of XML and published under the aegis of the VRML consortium. I'm working with a designer right now on doing up an entire personal site using this technology, so there will be a good demo soon, I hope. (It's been slowed down a lot by having a new job that I like and living in the most amazing city on earth...sigh.)

But the basic idea, admittedly wacky and either ahead of its time or unimplementable in a scalable way, is to render links: to use 3D to show the relationships between chunks of information.
You could see how big a site is. You could see where you've been and where you can go. You could make your own links. I'm excited about the navigational improvements it could make to the web in the short term, but the longer term application that I'm even more excited about is for collaborative document authoring.

The whitepaper isn't exactly a classic of technical literature, but it does have most of the stuff I've been thinking about, and soon there will be a demo.

I've got some reasons to think it might happen, eventually. A number of very knowledgeable folks from the web design, XML and VRML communities have said they think it's pretty hot stuff, and at least one person who is active on this list has told me that he tried to do what I'm trying to do and it's impossible. The former response has given me hope, but the latter response has admittedly been more useful in keeping me motivated (which may very well have been why that person wrote it, for all I know).

/da

At 11:29 AM 2/1/99 +0100, Matthew Sergeant (EML) wrote:
>Boy, you people sure can write when something stirs you up... It's 10:10am
>and I've only just got through my backlog of XML-Dev mail...
>
>Well, as the person who introduced the topic "Will XML eat the web?", I feel
>I should just add some points of note. I thank everyone who has contributed
>to this topic though.

From donpark at quake.net Tue Feb 2 05:56:59 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:08:24 2004
Subject: Another errata?
Message-ID: <005101be4e70$ab37d820$2ee044c6@arcot-main>

>The namespace spec gets more over-interpretation than any document this
>side of the Old Testament. It seems simple to me.

I beg to differ. The namespace spec is short and hard to read, difficult to understand, and leaves the reader with many unanswered questions. Answers might have been in the spec in the form of some carefully chosen sequence of words but they were not apparent. It is also lumpy in terms of information density, which has a similar effect to being questioned by Columbo: easy sentences mixed with hard logic ambushes. Considering that namespaces affect most XML developers, this is truly unfortunate.

If the spec seems simple to you, that is because you are one of the authors. My evidence for its failure to satisfy the readers is in the very confusions and over-interpretations you have noted.

Like the majority of the XML developers, I have tremendous respect for the work you have done, especially the XML spec. I do hope you will consider what can be done to remedy the situation rather than trying to defend the namespace spec. Perhaps the Annotated Namespaces in XML will do the job if it fully discusses the implications and applications of namespaces. Perhaps not.

We, the slaves of the standard specs which seemingly breed like rabbits, have very little control over these matters without getting pissed off as a whole. We are at your mercy. Help us if you can.

Best,

Don Park
Docuverse

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From Matthew.Sergeant at eml.ericsson.se Tue Feb 2 09:05:46 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:24 2004
Subject: Interesting Monday.
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136ADE@eukbant101.ericsson.se>

> -----Original Message-----
> From: Tyler Baker [SMTP:tyler@infinet.com]
> Sent: Monday, February 01, 1999 5:57 PM
> To: Matthew Sergeant (EML)
> Cc: 'xml-dev@ic.ac.uk'
> Subject: Re: Interesting Monday.
>
> "Matthew Sergeant (EML)" wrote:
>
> > Boy, you people sure can write when something stirs you up... It's
> > 10:10am and I've only just got through my backlog of XML-Dev mail...
> >
> > Well, as the person who introduced the topic "Will XML eat the web?",
> > I feel I should just add some points of note. I thank everyone who has
> > contributed to this topic though.
> >
> > Firstly, I think there is still an issue with processing power and XML,
> > although I can see that my system is poorly designed. Time for a
> > rethink... The area where I can foresee potential problems is in
> > e-commerce. Take an e-commerce transaction processing company that's
> > moved to an XML transaction format. They don't have a shop web site,
> > they just process credit card transactions for other sites. I imagine
> > they are going to need to process hundreds of transactions per second.
> > I don't for a second suggest that they store the XML as the primary
> > data format (store it as a backup as suggested here) - it should
> > immediately be put into an RDBMS. But to do that they have to parse
> > each transaction. There's no caching that can go on here.
>
> There seems to be a major misconception here I think in terms of what
> software needs to do for businesses. The issue for XML I think should be
> scalability, not just raw speed. What use is the XML/XSL architecture if
> it costs mucho dinero in development dollars for fly-by-night
> consultants, overpriced databases, and application servers. Maybe some
> giant bank has money to burn, but the average web-shop if they are even
> profitable is running their business on very thin margins. Raw
> performance of using XML is not what should be the focus here,
> simplicity should be the focus. Using XML for backup or log files is not
> a bad use of XML. In fact it is a very simple use of XML to do some very
> simple things. What major benefit is XML other than through some
> document API like the DOM if it is stored in the DBMS. Are you going to
> have the DBMS construct an XML stream on the fly which then needs to be
> reparsed into some in-memory data structure like the DOM to do anything
> useful with it at the server level.
>

I think you've missed my point altogether here (correct me if I'm wrong). Much e-commerce in the future is going to be done using XML. For e-commerce to work you have order-generators (the web-shop) and transaction processors (the credit card company, or someone who processes the order). Generally these aren't the same company. The transaction processing companies will have their work cut out for them performing hundreds of transactions a second using XML. That's all I'm saying. How they get around that is their business. (it might even be my business in the near future, just not now).

Matt.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Tue Feb 2 09:12:00 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:24 2004 Subject: Interesting Monday. Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136ADF@eukbant101.ericsson.se> > -----Original Message----- > From: James Robertson [SMTP:jamesr@steptwo.com.au] > > At 20:29 1/02/1999 , Matthew Sergeant (EML) wrote: > > | My problem was slightly different. I needed to be ready for the 5.0 > browsers > | (probably IE5, although I'd prefer NS5), and XML seemed ideal because > we > | would be displaying/editing documents that look like data (or data > that > | looks like a document if you like). We really needed an object > database, but > | I needed to get moving quickly (a typical web project: "Can we have it > | yesterday"). Learning an object database wasn't a possibility. I > already > | knew XML. So I looked at it like this - we could have it 2 ways: > | > | 1) Store XML now, process into HTML now, Transmit XML in the future. > | > | 2) Store in RDBMS now, process into HTML now, process into XML in the > | future. > > I would personally recommend a third option: > > 3) Store in RDBMS now, process into XML, process this into HTML now. > Process the XML into whatever you want in the future. > > Nonononono. :) This generates probably 5% more overhead than I have already (the RDBMS). XML doesn't parse quickly (well, OK, it parses quickly, but not compared to reading data from an RDBMS). When you are processing tens of XML files per second this becomes a huge problem. Matt. xml-dev: A list for W3C XML Developers. 
From Rschoening at unforgettable.com Tue Feb 2 09:26:52 1999
From: Rschoening at unforgettable.com (Rob Schoening)
Date: Mon Jun 7 17:08:24 2004
Subject: Interesting Monday.
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A80001136ADE@eukbant101.ericsson.se>
References: <5F052F2A01FBD11184F00008C7A4A80001136ADE@eukbant101.ericsson.se>
Message-ID: <000342de26f5e966_mailit@mail.ptld.uswest.net>

>The transaction processing
>companies will have their work cut out for them performing hundreds of
>transactions a second using XML. That's all I'm saying. How they get around
>that is their business. (it might even be my business in the near future,
>just not now).

I don't think this is such a big deal. It seems like it would be trivial to
load-balance the XML component of the server. There would be a slight
increase in latency to account for XML parsing, but it wouldn't drag the
database. It's not as if the XML processing must be done in lock-step with
the transaction processing.

Rob

From david at megginson.com Tue Feb 2 11:59:35 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:24 2004
Subject: SAX: Next Round (Lexical Event Handler)
In-Reply-To: <36B64057.205FCFB0@eng.sun.com>
References: <199901202122.QAA00962@megginson.com> <36AA93F3.B98FAE05@jclark.com> <36B21948.5F74CEED@eng.sun.com> <14004.51695.816155.975439@localhost.localdomain> <36B64057.205FCFB0@eng.sun.com>
Message-ID: <14006.59356.214061.86210@localhost.localdomain>

David Brownell writes:

 > Seemed like the start/end DTD event was for the external subset
 > though. Sun's interface works that way, so this can be done
 > given a real API description ... :-)

I had intended it for the whole DOCTYPE rather than just the external
DTD subset -- the external subset would be delimited by a start/end
entity event with the pseudo-entity '[dtd]'.

 > Only if DOM and the parser are joined at the hips -- since when you
 > remove a defaultable attribute from a DOM element, the DOM must then
 > restore that value to its default value. (John Cowan also noted this.)

Yes, I understand the problem now.

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

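The whole-DOCTYPE bracketing described in the message above can be observed with a small sketch. It is written against Python's xml.sax, not the Java SAX draft being discussed, so treat it as an illustration under that assumption. The class names DoctypeLogger and ElementLogger and the sample document are hypothetical. The reader's lexical-handler property reports startDTD and endDTD around the DOCTYPE declaration, here one that has only an internal subset:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class DoctypeLogger:
    """Hypothetical lexical handler: records where the DOCTYPE starts and ends."""

    def __init__(self):
        self.events = []

    def startDTD(self, name, public_id, system_id):
        self.events.append(("startDTD", name))

    def endDTD(self):
        self.events.append(("endDTD",))

    # The expat-based reader looks these up too, so they must exist.
    def comment(self, text):
        pass

    def startCDATA(self):
        pass

    def endCDATA(self):
        pass


class ElementLogger(ContentHandler):
    """Hypothetical content handler: appends element starts to the same list."""

    def __init__(self, events):
        super().__init__()
        self.events = events

    def startElement(self, name, attrs):
        self.events.append(("startElement", name))


# Sample document with an internal DTD subset only (no external subset).
source = b"<!DOCTYPE doc [<!ELEMENT doc (#PCDATA)>]><doc>text</doc>"

lexical = DoctypeLogger()
parser = xml.sax.make_parser()
parser.setContentHandler(ElementLogger(lexical.events))
parser.setProperty(property_lexical_handler, lexical)
parser.parse(io.BytesIO(source))

for event in lexical.events:
    print(event)
```

The DTD events arrive before the first element event, which is what lets a consumer tell declarations inside the DOCTYPE apart from document content.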
From david at megginson.com Tue Feb 2 12:03:54 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:24 2004
Subject: Since we're talking about databases...
In-Reply-To: <36B65517.8FAE4EC5@manhattanproject.com>
References: <3.0.32.19990201164621.00a5e360@pop.intergate.bc.ca> <36B65517.8FAE4EC5@manhattanproject.com>
Message-ID: <14006.59502.362962.284178@localhost.localdomain>

Clark Evans writes:

 > I'm rather new to the list. I was wondering if there is any
 > current (open source?) work for storing XML in relational
 > databases. I was thinking of using PostgreSQL.

If you want to start this yourself, you probably want to code in Java
to vanilla JDBC, for two reasons:

1. Most decent XML tools are in Java -- it's a nice turn-around to see
   people asking "is this tool available in C++" rather than "is this
   tool available in Java"? Of course, Perl is catching up fast... (I
   use Perl for quick batch processing scripts and Java for larger
   applications).

2. JDBC will let people put exactly as much power as they need on the
   back end, from MSQL or PostgreSQL (or, if they're masochists,
   Access) for light-weight work, to Oracle, Sybase, DBII, etc. for
   heavy-grade stuff.

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

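As a rough idea of what storing XML in a relational database could look like, here is a minimal sketch. It is not the Java/JDBC route recommended above: it assumes Python's DB-API with an in-memory SQLite database standing in for PostgreSQL, and it borrows the tag-numbering scheme from the "XML query engines" thread earlier in this archive. The table name, column names and sample document are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def number_tags(root):
    """Return (name, start_no, end_no) rows, numbering every start-tag and
    end-tag in document order (an empty tag counts as start plus end)."""
    counter = 0
    rows = []

    def visit(elem):
        nonlocal counter
        counter += 1
        start = counter
        for child in elem:
            visit(child)
        counter += 1
        rows.append((elem.tag, start, counter))

    visit(root)
    return rows


doc = ET.fromstring("<order><item><sku/><qty/></item><note/></order>")
rows = number_tags(doc)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE element (name TEXT, start_no INTEGER, end_no INTEGER)")
conn.executemany("INSERT INTO element VALUES (?, ?, ?)", rows)

# d is a descendant of a iff d.start_no > a.start_no and d.end_no < a.end_no.
descendants = [
    row[0]
    for row in conn.execute(
        "SELECT d.name FROM element a JOIN element d "
        "ON d.start_no > a.start_no AND d.end_no < a.end_no "
        "WHERE a.name = ? ORDER BY d.start_no",
        ("item",),
    )
]
print(descendants)
```

The same table definition and query should carry over to any SQL back end, which is the portability argument made for JDBC in the message above.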
From jjc at jclark.com Tue Feb 2 12:59:00 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:08:24 2004
Subject: Namespaces
Message-ID: <36B6F365.24E4E19@jclark.com>

I have been disturbed by the amount of confusion surrounding the XML
Namespaces Recommendation. So I have written a document

  http://www.jclark.com/xml/xmlns.htm

that tries to explain the mechanism specified by the XML Namespaces
Recommendation. It explains things in a somewhat different way which I
hope at least some people may find less confusing than the explanation
in the Recommendation.

Constructive suggestions for improvement are welcome.

James

From Matthew.Sergeant at eml.ericsson.se Tue Feb 2 13:10:03 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:24 2004
Subject: Namespaces without a domain name
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AE1@eukbant101.ericsson.se>

Can you do namespaces if you don't have a domain name?

(I do, but I was thinking others might not, e.g. individuals rather than
companies).

This might be answered by James Clark's document which I just started
reading, if it is, then I apologise.

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

From simonstl at simonstl.com Tue Feb 2 13:43:42 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:25 2004
Subject: Namespaces without a domain name
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A80001136AE1@eukbant101.ericsson.se>
Message-ID: <199902021343.IAA29829@hesketh.net>

At 02:08 PM 2/2/99 +0100, Matthew Sergeant (EML) wrote:
>Can you do namespaces if you don't have a domain name?
>
>(I do, but I was thinking others might not, e.g. individuals rather than
>companies).
>
>This might be answered by James Clark's document which I just started
>reading, if it is, then I apologise.

You can always get a PURL (see http://purl.oclc.org), or use some URL that
you have control over, even if it's not a complete domain. DDML uses PURLs.

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From Matthew.Sergeant at eml.ericsson.se Tue Feb 2 14:24:28 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:25 2004
Subject: Namespaces without a domain name
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AE4@eukbant101.ericsson.se>

> -----Original Message-----
> From: Simon St.Laurent [SMTP:simonstl@simonstl.com]
>
> At 02:08 PM 2/2/99 +0100, Matthew Sergeant (EML) wrote:
> >Can you do namespaces if you don't have a domain name?
> >
> >(I do, but I was thinking others might not, e.g. individuals rather than
> >companies).
> >
> >This might be answered by James Clark's document which I just started
> >reading, if it is, then I apologise.
>
> You can always get a PURL (see http://purl.oclc.org), or use some URL that
> you have control over, even if its not a complete domain. DDML uses
> PURLs.
>

Actually, now I think about it, you could use a geocities, or surf.to or any
other type of free account.

Matt.

From cowan at locke.ccil.org Tue Feb 2 14:48:42 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:25 2004
Subject: Another errata?
References:
Message-ID: <36B70FE4.B38B0E13@locke.ccil.org>

Mark Birbeck wrote:

> OK then, some have argued, at least shouldn't 'isbn' automatically be
> part of the 'bk' namespace? Still no, I'm afraid. Every member of a
> namespace is meant to be unique.

Well, not quite. An element can have the same name as a global
attribute without problem. There is a contradiction, AFAI can see,
between clause 5.3 and Appendix A, and obviously 5.3 is normative.

--
John Cowan      http://www.ccil.org/~cowan      cowan@ccil.org
        You tollerday donsk? N. You tolkatiff scowegian? Nn.
        You spigotty anglease? Nnn. You phonio saxo? Nnnn.
        Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From ricko at allette.com.au Tue Feb 2 15:15:11 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:08:25 2004
Subject: Namespaces without a domain name
Message-ID: <005601be4ebf$12186020$2ff96d8c@NT.JELLIFFE.COM.AU>

From: Matthew Sergeant (EML)

> Can you do namespaces if you don't have a domain name?

The namespace can be any URI. You could use Microsoft's admirable UUID
notation.

  "uri:uuid:put the UUID here"

I don't know if URIs currently allow SGML Formal Public Identifiers.
If they do, you can go

  "uri:fpi:-//your name here//ELEMENTS your description here//EN"

or if you want to publish them, and you can get an ISBN number:

  "uri:fpi:+//ISBN your number here//ELEMENTS your description here//EN"

Rick Jelliffe

From richard at cogsci.ed.ac.uk Tue Feb 2 15:27:15 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:08:25 2004
Subject: Is this invalid?
Message-ID: <199902021526.PAA13192@stevenson.cogsci.ed.ac.uk>

Consider:

]>

The second attribute declaration is ignored. But is it still a
validity error that its default value is not an NMTOKEN?

(This occurred to me after seeing an article by Chris Maden in
comp.text.xml, but in that case the answer was clear.)

-- Richard

From cvonsee at onramp.net Tue Feb 2 15:44:04 1999
From: cvonsee at onramp.net (Chris von See)
Date: Mon Jun 7 17:08:25 2004
Subject: Compound Documents - necessary for success?
In-Reply-To: <01BE4DDF.134A6140@grappa.ito.tu-darmstadt.de>
Message-ID: <199902021543.JAA10430@mailhost.onramp.net>

At 12:33 PM 2/1/99 +0100, Ron Bourret wrote:

>I think that many of us have a notion of a "compound document" and "reusing
>schemas" but that, for most of us, these notions don't go much beyond the
>actual words and a hazy, utopic, AI-intensive dream that XML documents will
>somehow magically recombine themselves to solve all of our problems.
>
>What I think a lot of people would like is to automagically combine these
>two DTDs so that the following document is valid:
>
>
>
>
>
>
> Joe Tall
> Iowa Talls
>
> 3
> meters
>
>
>
>
>This does not currently work for two reasons. First, there is no way to
>express that a document is valid under two different DTDs. Second, the
>above document is clearly not valid under either of the above DTDs. To
>create such a document under the current spec, we need to rewrite
>players.dtd:
>
>
>
>
>
>
>%height;
>

I may be showing my gross ignorance of both XML and namespaces here, but
isn't this at least part of the problem that namespaces were meant to
address? Granted, as Ron pointed out, a human still has to make the
decision as to whether "" is relevant to "", but once there is agreement
on what "height.dtd" represents authors should be able to re-use that DTD
wherever it makes sense.

I see the "height.dtd" as being something established by a world standards
organization that allows common representation of "vertical height"
information across all applications that use such things. With that in
mind, it should be straightforward to use it in this context.

Chris

------------------------------------
"Beware of all enterprises that require new clothes." -- Thoreau

From rbourret at ito.tu-darmstadt.de Tue Feb 2 15:57:58 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:08:25 2004
Subject: Compound Documents - necessary for success?
Message-ID: <01BE4ECC.5177E760@grappa.ito.tu-darmstadt.de>

Chris von See wrote:

> I may be showing my gross ignorance of both XML and namespaces here, but
> isn't this at least part of the problem that namespaces were meant to
> address? Granted, as Ron pointed out, a human still has to make the
> decision as to whether "" is relevant to "", but once
> there is agreement on what "height.dtd" represents authors should be able
> to re-use that DTD wherever it makes sense.

Nope -- no ignorance at all. This is very definitely the problem
namespaces were meant to address. It's just that the original reaction of
many people to the namespaces spec is that it somehow solves the "automagic
combination" and "'valid' against multiple DTDs" problems as well. (Put
another way, you can get lunch, but it's not free.)

-- Ron Bourret

From david at megginson.com Tue Feb 2 16:00:47 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:25 2004
Subject: Namespaces without a domain name
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A80001136AE1@eukbant101.ericsson.se>
References: <5F052F2A01FBD11184F00008C7A4A80001136AE1@eukbant101.ericsson.se>
Message-ID: <14007.8203.803886.346132@localhost.localdomain>

Matthew Sergeant (EML) writes:

 > Can you do namespaces if you don't have a domain name?
 >
 > (I do, but I was thinking others might not, e.g. individuals rather than
 > companies).
 >
 > This might be answered by James Clark's document which I just started
 > reading, if it is, then I apologise.

That's the nice thing about using URIs rather than domain names for
namespaces: anyone who owns *any* URI can construct a namespace. For
example, my personal (and badly out-of-date) home page is at

  http://home.sprynet.com/sprynet/dmeggins/

I don't own the sprynet.com domain name, but since I do have control
over this URL, I can base namespaces on it:

Of course, this isn't very stable, since I might switch ISPs at any
time, but at least it's possible.

All the best,

David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/

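To make the point above concrete, here is a small sketch in Python (an assumption made for illustration; the thread itself names no language). The namespace URI below is hypothetical and merely sits under a URL its author is presumed to control, and the element names are invented. The parser treats the URI as an opaque string and never fetches it, and the prefix chosen in a given document does not affect the resulting names:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name under a URL the author controls (not a real page).
NS = "http://home.example.net/~someone/ns/recipes"

# Same vocabulary, once with a prefix and once as the default namespace.
doc_a = f'<r:recipe xmlns:r="{NS}"><r:title>Soup</r:title></r:recipe>'
doc_b = f'<recipe xmlns="{NS}"><title>Soup</title></recipe>'

tags_a = [elem.tag for elem in ET.fromstring(doc_a).iter()]
tags_b = [elem.tag for elem in ET.fromstring(doc_b).iter()]

print(tags_a)
print(tags_a == tags_b)
```

If the author later moves to another ISP, documents already written against the old URI keep working, since nothing is ever retrieved from it; the instability mentioned above concerns who controls the name, not whether it resolves.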
From richard at goon.stg.brown.edu Tue Feb 2 16:42:42 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun 7 17:08:25 2004
Subject: Is this invalid?
References: <199902021526.PAA13192@stevenson.cogsci.ed.ac.uk>
Message-ID: <36B72A7A.10032783@goon.stg.brown.edu>

Richard Tobin wrote:
>
> Consider:
>
>
>
>
> ]>
>

This is valid. Duplicate ATTLIST decls are okay. If a particular
attribute such as "a" is declared more than once, the first
declaration holds.

Note that elicits a warning from some validators, because it
uses the special empty-tag syntax, without having been explicitly
declared EMPTY (one of those "for interoperability" constraints).

If you have questions about such things, often you can just run
some sample text through our (STG's) validator,
http://www.stg.brown.edu/service/xmlvalid/. Its output is pretty
much self-explanatory. Here's what it says when I paste in the
above document instance:

================================================================================================
Validation Results for [user-supplied text]

A list of warning messages follows:

line 4, [user-supplied text]:
  warning (652): element has more than one attlist declaration: foo
line 4, [user-supplied text]:
  warning (581): discarding duplicate attribute definition: a
line 6, [user-supplied text]:
  warning (1106): empty-tag syntax used for element not declared with EMPTY content model: foo

================================================================================================
Document validates OK.
================================================================================================

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu

From Mathieu.Mangeot at xrce.xerox.com Tue Feb 2 16:45:26 1999
From: Mathieu.Mangeot at xrce.xerox.com (Mathieu Mangeot Lerebours)
Date: Mon Jun 7 17:08:26 2004
Subject: how to browse xml files with xsl style sheet LOCALLY ?
Message-ID: <36B72B0C.E6D21BC4@xrce.xerox.com>

Hello,

I'm trying to write an XSL style sheet for my XML documents.
As I'm not an expert in XSL, I wanted to start from an example.
I'm working with IE5 beta 2.
I found an example at this address:
http://www.silab.dsi.unimi.it/~sz475745/etl/rivista/Sommario.xml
I can browse it perfectly.

Then I decided to copy this file to my local disk. I also
copied the DTD and the XSL files to my local disk.
But when I browse the local copy, MSXML generates an error:

===================================================
The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error
and then click the Refresh button, or try again later.

--------------------------------------------------------------------------------

MSXML error detected: 80070005
Line 4, Position 1

^
=================================================

I tried with another example from www.microsoft.com and encountered the
same problem.
Next, there was the same example but with CSS. This time, I could browse
my local copy without any problem.

So: what can I do to browse XML files with XSL locally?
thank you for your answers

Mathieu

--
Mathieu MANGEOT-LEREBOURS    | Phone : +33 4 76 61 51 32
Xerox Research Centre Europe | Fax   : +33 4 76 61 50 99
6 chemin de Maupertuis       | E-mail: Mathieu.Mangeot@imag.fr
F-38240 Meylan FRANCE        | http://www.xrce.xerox.com

From jmcdonou at library.berkeley.edu Tue Feb 2 17:01:30 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun 7 17:08:26 2004
Subject: Since we're talking about databases...
In-Reply-To: <14006.59502.362962.284178@localhost.localdomain>
References: <36B65517.8FAE4EC5@manhattanproject.com> <3.0.32.19990201164621.00a5e360@pop.intergate.bc.ca> <36B65517.8FAE4EC5@manhattanproject.com>
Message-ID: <3.0.5.32.19990202084714.00be3400@library.berkeley.edu>

At 07:02 AM 2/2/1999 -0500, David Megginson wrote:
>
>2. JDBC will let people put exactly as much power as they need on the
>   back end, from MSQL or PostgreSQL (or, if they're masochists,
>   Access) for light-weight work, to Oracle, Sybase, DBII, etc. for
>   heavy-grade stuff.
>
>

I agree with David completely on the JDBC issue, but a word of warning
from someone who's been there. *Don't* think that you can do this with
Microsoft Access (not sure about other Microsoft products). Having spent
a month trying to persuade both Intersolv's and Microsoft's JDBC/ODBC
bridges to not blow up in my face after three queries into Access, I think
I can vouch for that not being a stable solution. I tried the same Java
code and same relational structure under Sybase and it worked just fine,
however. I'd stick with databases that support JDBC directly.

Jerome McDonough -- jmcdonou@library.Berkeley.EDU  |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ *  * /
Berkeley, CA 94720-6000      (510) 642-5168        |  \  <>  /
"Well, it looks easy enough...."                   |   \ -- /
SGNORMPF!!! -- From the Famous Last Words file     |    ||||

From Mark.Birbeck at iedigital.net Tue Feb 2 18:01:51 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:26 2004
Subject: Compound Documents - necessary for success?
Message-ID:

Chris von See wrote:
> [snip on Ron Bourret's comments on how to use two DTDs in the same XML
> document]
> I may be showing my gross ignorance of both XML and namespaces here, but
> isn't this at least part of the problem that namespaces were meant to
> address?

In response Ronald Bourret wrote:
> Nope -- no ignorance at all. This is very definitely the problem
> namespaces were meant to address.

Well, sort of. But note that you can use namespaces even without using a
DTD. A standalone document that is well-formed might still want to have
the following, in order to provide clues to any application that is
processing it:

  Fred
  blonde

In this example, if the application was going to process 'type' in a
certain way (say an XSL processor), then it needs the namespaces to help
it work out which 'type' is which. Also, if the namespace wasn't there,
then you wouldn't be able to use both 'type' attributes in the same
element. But note that no schemas need be involved.

Conversely, if you devise a DTD that uses another DTD, you don't
necessarily need to use namespaces.
In the height/player example given before there is no ambiguity, so why
would you introduce a namespace?

So, back to compound documents. I think as a stop-gap you need dynamic
DTDs. Just as many features of XML are best implemented using dynamic
documents generated from databases, why not generate a top-level DTD
that contains whatever lower level DTDs it needs to define the relevant
compound XML document? The top DTD would include some basic stuff for
containing a list of documents, and then include whatever other DTDs it
needs for each document in turn.

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From CBerry at works.com Tue Feb 2 18:13:05 1999
From: CBerry at works.com (Chris Berry)
Date: Mon Jun 7 17:08:26 2004
Subject: Compositional Documents
Message-ID: <958E41703996D21197A200A0C9D4C65672B7@AUS-SERVER4>

Greetings,

Sorry for the wasted bandwidth, but I posted this last night (without a
Subject) and did not receive any response. I'm hoping this time around I
might be luckier...

I am a relative newbie to XML, so please bear w/ me if this is a poor
question...

It looks like everything in the DOM must be dealt w/ in the context of a
particular document. It seems you cannot drop one Document into another,
or even one DocumentFragment into another Document context.

I have tried to do the following...
(using Microsoft's implementation of the DOM)

    IXMLDOMDocument doc = (IXMLDOMDocument) new DOMDocument();
    IDOMDocumentFragment docFrag = doc.createDocumentFragment();
    IDOMElement userElem = doc.createElement("USER");
    userElem.setAttribute("TIMESTAMP", new Variant("123456") );
    docFrag.appendChild( userElem );
    doc.appendChild( docFrag );
    System.out.println( doc.getXml() );
    // >>>> Yields the correct results

    IXMLDOMDocument doc2 = (IXMLDOMDocument) new DOMDocument();
    doc2.appendChild( docFrag );
    System.out.println( doc2.getXml() );
    // >>>> Yields an empty string

It seems like this should work to me. Shouldn't we be able to cut and
paste between documents?? Or more important, shouldn't we be able to
build up documents compositionally?? I.e. compose Address 1 and Address 2
as DocumentFragments (or as their own Documents -- with their own DTDs)
and then drop these into, say, User (providing that the User DTD agrees).

I must be missing something here.

Thanks in advance,
Cheers,
-- Chris

Chris Berry
cberry@works.com
512-231-1341
works.com
6850 Austin Center Blvd.
Austin, TX 78731

From danf at yipinet.com Tue Feb 2 19:34:50 1999
From: danf at yipinet.com (Dan Finkelstein)
Date: Mon Jun 7 17:08:26 2004
Subject: Stripping leading and trailing spaces???
Message-ID: <000001be4ee3$060e5d60$a39b56d1@danf>

It seems that leading and trailing spaces are being removed from PCDATA
before it is returned to me. For example,

  Hello there!

Message-ID: <36B75B1B.E2EBCCC1@mecomnet.de>

Ronald Bourret wrote:
>
> Nope -- no ignorance at all. This is very definitely the problem
> namespaces were meant to address.
> It's just that the original reaction of
> many people to the namespaces spec is that it somehow solves the "automagic
> combination" and "'valid' against multiple DTDs" problems as well. (Put
> another way, you can get lunch, but it's not free.)

how about this for a synopsis: it guarantees you your own seat at the bar,
but you're still going to have to be old enough to drink.

From James.Anderson at mecomnet.de Tue Feb 2 21:06:09 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:26 2004
Subject: 5.3 and Appendix A contradiction [Re: Another errata?]
References: <36B70FE4.B38B0E13@locke.ccil.org>
Message-ID: <36B769E4.DC41937B@mecomnet.de>

John Cowan wrote:
>
> Mark Birbeck wrote:
>
> > OK then, some have argued, at least shouldn't 'isbn' automatically be
> > part of the 'bk' namespace? Still no, I'm afraid. Every member of a
> > namespace is meant to be unique.
>
> Well, not quite. An element can have the same name as a global
> attribute without problem.

In a similar vein, one unqualified attribute defined for one element can
have the same name as another unqualified attribute of another element
without sharing the same declaration. More than one kind of "namespace" is
necessary to interpret an xml document which conforms to the spec. One kind
of space maps QNames and Names to identifiers. Another kind maps
identifiers to declarations.
The namespace spec itself does not do justice to this, and, in fact, introduces - in relation to the notion of universal attribute names - the impression that XML requires a Name -> Identifier -> Attribute-Declaration mapping/namespace, when it does not. While one could accept that the illusion of such a namespace is helpful for things like XSL patterns, this kind of namespace is not entailed by the XML spec itself. It specifies a mapping of the form Name -> Identifier -> Identifier -> Attribute-Declaration. That is, it requires an element identifier in addition to an attribute identifier in order to identify a declaration.

> ... There is a contradiction, AFAI can see,
> between clause 5.3 and Appendix A, and obviously 5.3 is normative.
>

?

While I recognize a contradiction between Bray's interpretation of the body of the spec and its Appendix A, I don't see one between the spec itself and Appendix A. Following A.3, example 1, line 5, the form from 5.3 would have the equivalent (extended) Clark encoding (http://www.jclark.com/xml/xmlns.htm)

<{http://www.w3.org}x xmlns:n1="http://www.w3.org" xmlns="http://www.w3.org" >
<{http://www.w3.org}good {{http://www.w3.org}good}a="1" {{http://www.w3.org}good}b="2" />
<{http://www.w3.org}good {{http://www.w3.org}good}a="1" {http://www.w3.org}a="2" >

which would appear conformant. Where is the contradiction?

From Mark.Birbeck at iedigital.net Tue Feb 2 21:12:44 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:26 2004
Subject: Another errata?
Message-ID: John Cowan wrote: > Mark Birbeck wrote: > > > OK then, some have argued, at least shouldn't 'isbn' > automatically be > > part of the 'bk' namespace? Still no, I'm afraid. Every member of a > > namespace is meant to be unique. > > Well, not quite. How not quite? A namespace is a set of unique entries *by definition*. That's what they are - a set of unique entries. You can't have a 'set of unique entries' that contains a duplicate. > An element can have the same name as a global > attribute without problem. True. But they are not in the same namespace. According to A.2 the element would be in the 'all element types' partition, and the global attribute would be in the 'global attribute' partition. > There is a contradiction, AFAI can see, > between clause 5.3 and Appendix A, and obviously 5.3 is normative. Well why not spell it out then, since I can't see one? I can only guess what you are driving at, so I'll go through 5.3 (anyone interested in this might want to open http://www.w3.org/TR/REC-xml-names/ whilst reading): In XML documents conforming to this specification, no tag may contain two attributes which: 1. have identical names, or 2. have qualified names with the same local part and with prefixes which have been bound to namespace names that are identical. Well, number 1 is already given by XML 1.0. You cannot have: so I assume that is not where your contradiction lies. To look at the second point; the following *would* be OK in XML 1.0 (colons are fine, as per http://www.w3.org/TR/PR-xml-971208#NT-Nmtoken): However, the spec is saying that it becomes illegal - if you want your XML document to conform to the namespace spec - to use the above attribute names if you have bound the prefixes to namespaces as follows: because first:name and sur:name would end up being the same thing (uri:mycodes:name). (Note that it still does NOT break the XML 1.0 spec.) 
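The second rule of 5.3 can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not how any particular parser implements the check: the class and method names are hypothetical, the prefix bindings are passed in as a plain map, and the {uri}local spelling follows James Clark's notation as used elsewhere in this thread. The names first:name, sur:name and uri:mycodes come from the message above; uri:othercodes is made up for contrast.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: detects attributes on one tag that collide after prefix expansion.
class ExpandedNameCheck {

    /** Expands prefix:local to {uri}local; unprefixed attribute names are left as they are. */
    static String expand(String qname, Map<String, String> bindings) {
        int colon = qname.indexOf(':');
        if (colon < 0) {
            return qname; // the default namespace does not apply to attribute names
        }
        String uri = bindings.get(qname.substring(0, colon));
        if (uri == null) {
            throw new IllegalArgumentException("unbound prefix in " + qname);
        }
        return "{" + uri + "}" + qname.substring(colon + 1);
    }

    /** Returns true if two attribute names on the same tag expand to the same name. */
    static boolean hasDuplicate(List<String> attributeNames, Map<String, String> bindings) {
        Set<String> seen = new HashSet<>();
        for (String name : attributeNames) {
            if (!seen.add(expand(name, bindings))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> attrs = Arrays.asList("first:name", "sur:name");

        // Both prefixes bound to the same namespace name: the rule is violated.
        Map<String, String> sameUri = new HashMap<>();
        sameUri.put("first", "uri:mycodes");
        sameUri.put("sur", "uri:mycodes");
        System.out.println(hasDuplicate(attrs, sameUri)); // true

        // Prefixes bound to different namespace names: no collision.
        Map<String, String> differentUris = new HashMap<>();
        differentUris.put("first", "uri:mycodes");
        differentUris.put("sur", "uri:othercodes");
        System.out.println(hasDuplicate(attrs, differentUris)); // false
    }
}
```

The comparison is made on the expanded names only, which is why the same pair of attributes is fine under plain XML 1.0 yet rejected once both prefixes resolve to one namespace name.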
A.4 suggests how this might be implemented using 'expanded attribute names', saying that every EAN must be unique. (EANs include the URI that a namespace prefix points to as an attribute, not the prefix itself.)

Of course, this might not be the source of your contradiction either. Given there's only one bit of 5.3 that I haven't discussed here (it's not that big), I guess you think the contradiction is in the 'good' example:

However, each of the following is legal, the second because the default namespace does not apply to attribute names:

I can only assume that you mean that this breaks the uniqueness rule, but it doesn't. If we look at the attributes from the perspective of either partitions or expanded attribute names we will see that they are completely unique.

PARTITIONS

'a' and 'n1:a' are in different namespace partitions, as described above. 'n1:a' is actually a global attribute and so appears in the 'global attribute' partition, whilst 'a' would be in the 'per-element-type' partition for the element 'good'. If we use '^' to separate partitions, namespaces and local names, you have two completely different attributes. 'n1:a' becomes:

GA^http://www.w3.org^a

where 'n1' has been mapped to its full URI, and 'a' becomes:

PET^good^a

i.e., 'a' is in the 'good' namespace within the 'per-element-type' partition (because it is unqualified).

EXPANDED ATTRIBUTE NAMES

'a' and 'n1:a' also have very different expanded attribute names. 'a' has:

whilst 'n1:a' has:

As you can see - completely different. One actually has a namespace attribute; the other doesn't, and instead gets attributes for the type and namespace of the element that owns it.

So where is the contradiction?

I don't know why these standards guys are getting such a hard time - particularly with all these poorly worked through criticisms. IMHO if you want to be at the leading edge of technology then you have to put the work in - read, read, and re-read.
It is not the job of standards developers to make sure we understand everything they write, although there is no problem with asking for clarification. But if instead of the phrase 'I don't understand' you choose 'there is a contradiction' then at least show some respect for their work and take the trouble to spell out what you think it is - and don't be surprised if a rant comes back ;-)

Rant over ...

Regards,

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From fasthand at bigfoot.com Tue Feb 2 21:55:47 1999
From: fasthand at bigfoot.com (fasthand@bigfoot.com)
Date: Mon Jun 7 17:08:26 2004
Subject: ANN: ezDTD v1.5 -- DTD Editor/Generator/Formatter
In-Reply-To: <3.0.1.16.19990114192641.4b97656e@pop3.demon.co.uk>
References: <5F052F2A01FBD11184F00008C7A4A80001136A14@eukbant101.ericsson.se>
Message-ID: <199902022200.QAA52233@cotton.vislab.olemiss.edu>

ezDTD v1.5 DTD Editor/Generator/Formatter
----------------------------------------------------------------

FOA, please forgive me if you receive this mail more than once.

ezDTD is freeware for DTD authoring (WIN95+ or NT).
- ezDTD can be used as an alternative to commercial DTD editors and text editors, and
- ezDTD can convert a DTD to HTML with internal links.

ezDTD is a text-based DTD editor. Unlike some commercial DTD tools which present the DTD in graphic mode, ezDTD renders the DTD in HTML format, so the author can navigate the document via hyperlinks.

The latest ezDTD is v1.5.
You can find it at http://www.geocities.com/SiliconValley/Haven/2638/ezDTD.htm

o Why create ezDTD?

ezDTD is a handy tool that can help with:
1. Quickly jumping from one element to another.
2. Completing the typing by filling in keywords such as ANY, EMPTY, #IMPLIED, etc.
3. Exporting an HTML-format DTD file which has internal links among elements.

Since this version of ezDTD can import an existing DTD, you can use it to create an HTML-format document for an existing DTD as well.

o What's new?

Version 1.5 (1999-01-20)
- Simplified interface.
- View/Browse the DTD in HTML format on the fly.

Version 1.1 (1998-02-12)
- Modified some of the interface.
- You can import a DTD file, as long as it does not have too complex a comment structure.
- Support for Start Tag and End Tag definition.
- Export the DTD in either SGML or XML fashion (with or without the minimization).
- Corrected the included example file appraisal.edz, which did not explain itself clearly enough.

o Download

Please check out http://www.geocities.com/SiliconValley/Haven/2638/ezDTD.htm

Thanks for your time

Duncan Chen
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Duncan Chen fasthand@bigfoot.com
FNC, Inc. 601-232-1218

From andrewl at microsoft.com Tue Feb 2 22:03:34 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:08:26 2004
Subject: Namespaces
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAEEFA@RED-MSG-08>

James has done an outstanding job of explaining what namespaces are and are not. I especially recommend his document because it accurately stops short of implying that there is more to the namespaces specification than is there.
--Andrew Layman co-editor of the Namespaces spec xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Tue Feb 2 22:15:55 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:08:26 2004 Subject: Another errata? References: Message-ID: <36B778F8.B3FCCFD7@locke.ccil.org> Mark Birbeck wrote: > How not quite? A namespace is a set of unique entries *by definition*. > That's what they are - a set of unique entries. You can't have a 'set of > unique entries' that contains a duplicate. That is not the definition of "namespace" used by REC-xml-names. Clause 1 specifically denies that XML namespaces are sets. > > An element can have the same name as a global > > attribute without problem. > > True. But they are not in the same namespace. According to A.2 the > element would be in the 'all element types' partition, and the global > attribute would be in the 'global attribute' partition. They are in separate partitions of the same namespace. > Of course, this might not be the source of your contradiction either. Appendix A says that unprefixed attributes are assigned to one of the per-element-type partitions of the namespace. It also says that unprefixed attributes are assigned to "associated namespaces". Clause 5.3 is not involved and I shouldn't have dragged it in. But none of this matters much because Appendix A is not normative. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
From msf at mds.rmit.edu.au Tue Feb 2 22:21:35 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun 7 17:08:26 2004
Subject: What is a good database for very large collections? (was Re: XSL/ECMAscript (was RE: Frontier as a scalable XML repository (was Re: Is XML dead already or what?))
In-Reply-To: ; from Rick Jelliffe on Tue, Feb 02, 1999 at 04:18:11AM +1100
Message-ID: <19990203092046.A11167@io.mds.rmit.edu.au>

On Tue, Feb 02, 1999, Rick Jelliffe asked:
> What is a good database for XML?

[Warning: product-related discussion follows.]

Why not look at systems that are built around an SGML data model, rather than built over the top of an OODBMS or RDBMS model?

SIM, the "Structured Information Manager", is a commercial XML/SGML system produced here at RMIT (I am not personally involved in its development but work with people who are; they are the source for some of the following; see http://www.simdb.com/ for more information).

For me, the key issue is that SIM stores and parses XML documents natively. SIM is optimised to efficiently store, index, and retrieve XML/SGML documents from multi-gigabyte collections under heavy concurrent user loads. XML/SGML documents can also be formatted or manipulated using an SGML scripting language (free from http://ace.mds.rmit.edu.au/). XML/SGML document structure can be directly queried.

Although SIM does not directly support AI and linguistic software, it does have free text querying, ranking, and lots of text searching functions; it also has the ability to define standing queries so that new records that match a standing query are brought to the attention of users (normally via nightly email).
On the scalability side of things, SIM supports incremental updates on live systems. SIM has been used by customers with 800 'concurrent' users over the web; by customers with databases of >6gb and >3,000,00 records; and to run in-house test collections of around 50gb.

Michael Fuller
____________________________________________________________________________
Multimedia Databases Systems, Phone: +61 3 9925 4148
RMIT University Fax: +61 3 9925 4098
Level 3, 110 Victoria St, msf@mds.rmit.edu.au
Carlton 3053, Australia. http://www.mds.rmit.edu.au/~msf/

From derekdb at microsoft.com Tue Feb 2 22:29:59 1999
From: derekdb at microsoft.com (Derek Denny-Brown)
Date: Mon Jun 7 17:08:26 2004
Subject: Compositional Documents
Message-ID: <8B57882C41A0D1118F7100805F9F68B506F1BD55@RED-MSG-45>

The problem with the code below is that the first:

doc.appendChild( docFrag );

moves the contents of docFrag into doc. It does not copy/clone or otherwise duplicate, but rather removes the contents of the document fragment and appends them to the list of children of doc. Before executing that line, the following would be a true assertion:

docFrag.getChildNodes().getLength() == 1

After executing that line, the following is true:

docFrag.getChildNodes().getLength() == 0

Thus executing the following tidbit after the above doc.appendChild(docFrag) is an obfuscated no-op.
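To make the move semantics concrete before returning to that call, here is a minimal sketch. It assumes the standard W3C DOM bindings that ship with Java (JAXP, org.w3c.dom) rather than Microsoft's implementation used in the original code, and it uses Document.importNode from DOM Level 2 as one possible way to take a copy for the second document; the class name FragmentMoveDemo is hypothetical.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical demo class; not part of the original posting.
class FragmentMoveDemo {

    /** Returns {fragment children before append, fragment children after append, children of doc2}. */
    static int[] run() {
        DocumentBuilder builder;
        try {
            builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Document doc = builder.newDocument();
        Document doc2 = builder.newDocument();

        DocumentFragment docFrag = doc.createDocumentFragment();
        Element userElem = doc.createElement("USER");
        userElem.setAttribute("TIMESTAMP", "123456");
        docFrag.appendChild(userElem);
        int before = docFrag.getChildNodes().getLength();

        // Take a deep copy owned by doc2 while the fragment still has content.
        Node copyForDoc2 = doc2.importNode(docFrag, true);

        // Appending a fragment moves its children into doc and leaves the fragment empty.
        doc.appendChild(docFrag);
        int after = docFrag.getChildNodes().getLength();

        // The imported copy is a separate fragment, so doc2 receives its own USER element.
        doc2.appendChild(copyForDoc2);
        int doc2Children = doc2.getChildNodes().getLength();

        return new int[] { before, after, doc2Children };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "1 0 1"
    }
}
```

The copy has to be taken before the fragment is appended, or rebuilt afterwards, because once the fragment is empty there is nothing left to copy.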
doc2.appendChild( docFrag ); -derek -----Original Message----- From: Chris Berry [mailto:CBerry@works.com] Sent: Tuesday, February 02, 1999 10:11 AM To: 'xml-dev@ic.ac.uk' Subject: Compositional Documents Greetings, Sorry for the wasted bandwidth, but I posted this last night (without a Subject) and did not receive any response. I'm hoping this time around I might be luckier... I am a relative newbie to XML, so please bear w/ me if this is a poor question... It looks like everything in the DOM must be dealt w/ in the context of a particular document. It seems you cannot drop one Document into another, or even one DocumentFragment into another Document context. I have tried to do the following... (using Microsoft's implementation of the DOM) IXMLDOMDocument doc = (IXMLDOMDocument) new DOMDocument(); IDOMDocumentFragment docFrag = doc.createDocumentFragment(); IDOMElement userElem = doc.createElement("USER"); userElem.setAttribute("TIMESTAMP", new Variant("123456") ); docFrag.appendChild( userElem ); doc.appendChild( docFrag ); System.out.println( doc.getXml() ); // >>>> Yields the correct results IXMLDOMDocument doc2 = (IXMLDOMDocument) new DOMDocument(); doc2.appendChild( docFrag ); System.out.println( doc2.getXml() ); // >>>> Yields an empty string It seems like this should work to me. Shouldn't we be able to cut and paste between documents?? Or more important, shouldn't we be able to build up documents compositionally?? I.e. compose Address 1 and Address 2 as DocumentFragments (or as their own Documents -- with their own DTDs) and then drop these into, say, User (providing that the User DTD agrees). I must be missing something here. Thanks in advance, Cheers, -- Chris Chris Berry cberry@works.com 512-231-1341 works.com 6850 Austin Center Blvd. Austin, TX 78731 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cvonsee at onramp.net Tue Feb 2 22:52:15 1999 From: cvonsee at onramp.net (Chris von See) Date: Mon Jun 7 17:08:27 2004 Subject: Compound Documents - necessary for success? In-Reply-To: Message-ID: <199902022251.QAA09621@mailhost.onramp.net> At 05:52 PM 2/2/99 +0000, Mark Birbeck wrote: > >Well, sort of. But note that you can use namespaces even without using a >DTD. A standalone document that is well-formed might still want to have >the following, in order to provide clues to any application that is >processing it: > > > > Fred > > blonde > > > >In this example, if the application was going to process 'type' in a >certain way (say an XSL processor), then it needs the namespaces to help >it work out which 'type' is which. Also, if the namespace wasn't there, >then you wouldn't be able to use both 'type' attributes in the same >element. But note that no schemas need be involved. Without a DTD or schema on which to "hang your hat", so to speak, you're vesting the application with the knowledge of what the various namespace-qualified constructs mean. This strikes me as a Very Bad Thing, because it leaves individual applications to interpret (potentially, interpret very differently) what a particular attribute or element means. 
Generating a specific meaning and rules for usage for a particular DTD or schema isn't "automagic", but at least it forces people to think about what they're doing and lends some semblance of order. For small applications you can get away with not having such an anchor point, but for anything larger than a few documents you're asking for trouble. > >Conversely, if you devise a DTD that uses another DTD, you don't >necessarily need to use namespaces. In the height/player example given >before there is no ambiguity, so why would you introduce a namespace? Granted, as long as there are no conflicts in naming between the DTDs. However, as soon as you change a DTD you get into problems with DTD versioning, and as soon as you clone it to add the new DTD reference the proliferation of DTDs introduces a potential management problem. Yuk. > >So, back to compound documents. I think as a stop-gap you need dynamic >DTDs. Just as many features of XML are best implemented using dynamic >documents generated from databases, why not generate a top-level DTD >that contains whatever lower level DTDs it needs to define the relevant >compound XML document? The top DTD would include some basic stuff for >containing a list of documents, and then include whatever other DTDs it >needs for each document in turn. In order to do what you're proposing, you have to have knowledge *somewhere* of what is and is not acceptable... just as RDBMS catalogs govern the creation of dynamic DTDs for database data, so you would have to provide some sort of intelligence in your DTD generation mechanism to prevent unwanted, ambiguous usage. Chris ------------------------------------ "Beware of all enterprises that require new clothes." -- Thoreau xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Tue Feb 2 22:56:00 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:27 2004 Subject: 5.3 and Appendix A contradiction [Re: Another errata?] Message-ID: james anderson wrote: > John Cowan wrote: > > > > Mark Birbeck wrote: > > > > > OK then, some have argued, at least shouldn't 'isbn' > automatically be > > > part of the 'bk' namespace? Still no, I'm afraid. Every > member of a > > > namespace is meant to be unique. > > > > Well, not quite. An element can have the same name as a global > > attribute without problem. [I've replied to this in the 'Another errata?' thread] > In a similar vein, one unqualified attribute defined for one > element can have > the same name as another unqualified attribute of another > element without > sharing the same declaration. A very odd response! Follow the three comments through: - in response to demands for more errata I say that an unqualified attribute should *not* automatically join the namespace of its element - as part of my justification I say that all members of a namespace must be unique, and then go on - John disagrees - although why I don't know, because that's what a namespace is - and backs up his argument by saying that an element and a global attribute can have the same name. But that doesn't prove that namespaces can have duplicates, because they are actually in different namespaces - and then James comes to his assistance by adding that an unqualified attribute for one element can have the same name as an unqualified attribute on another element BUT WHY?! Well ... 
because an unqualified attribute does not automatically join the namespace of its element!! If it did then you would have duplicates in your namespace, because they are different attributes. Sound familiar? > More than one kind of "namespace" is necessary to interpret > an xml document > which conforms to the spec. One kind of space maps QName's > and Name's to > identifiers. If we're going to pursue this to the bitter end ... you can't have a 'Name' in a conforming doc, they must all be 'QNames' (no colons except after a prefix). > The namespace spec itself does not do justice to this, and, in fact, > introduces - in relation to the notion of universal attribute > names - the > impression that XML requires a Name -> Identifier -> > Attribute-Declaration > mapping/namespace, when it does not. While one could accept > that the illusion > of such a namespace is helpful for things like XSL patterns, > this kind of > namespace is not entailed by the xml spec itself. It > specifies a mapping of > the form Name -> Identifier -> Identifier -> > Attribute-Declaration. That is, > it requires an element identifier in addition to an attribute > identifier in > order to identify a declaration. This may be true - that XML does NOT require this - but I think the spec rightly draws attention to its need in many applications. Take for example the difficulty of dealing with global attributes. Say that I want every single node in a magazine article to have a unique identifier so that if it gets edited I can put it back into the database easily. I might define:
My Article An interesting thingy. More of the same.
Without the namespace spec, 'id' is in four different namespaces. Now you're right that each has context: article->id article->title->id and so on, but I disagree with you saying that this contextual information 'is enough'. What if I wanted to process all database 'id' values? Say my application needs to work out if any nodes have been deleted before returning the data to my database. Well OK, you could say, in XSL-style syntax: */attribute(id) or whatever it is. That is - give me all elements that contain an 'id' attribute. But what if we were to add an additional 'id' attribute, say for use by some other site to allow paragraphs to be quoted. And say also, that it was not a requirement of each node to have a database 'id' value. Then without namespaces we get:
My Article An interesting thingy. More of the same.
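As a hedged illustration of where this argument is heading, the sketch below assumes a hypothetical version of the article in which the database identifiers are written as db:id, with db bound to uri:mydatabasespec, while the quotation identifiers on para stay as plain id. The id values, the db prefix, and the class name are all illustrative assumptions; the code uses the standard JAXP and org.w3c.dom APIs and simply walks the tree collecting the namespaced attribute.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example: selects id attributes by namespace rather than by bare name.
class NamespacedIdQuery {
    static final String DB_NS = "uri:mydatabasespec";

    // Assumed document: the element names follow the thread, the id values are made up.
    static final String SAMPLE =
        "<article xmlns:db='uri:mydatabasespec' db:id='1'>"
        + "<title db:id='2'>My Article</title>"
        + "<paras db:id='3'>"
        + "<para id='p1'>An interesting thingy.</para>"
        + "<para id='p2'>More of the same.</para>"
        + "</paras>"
        + "</article>";

    /** Returns, in document order, every id attribute value in the given namespace. */
    static List<String> idsInNamespace(String xml, String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS methods to see namespaces
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> ids = new ArrayList<>();
            collect(doc.getDocumentElement(), namespaceUri, ids);
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    private static void collect(Element element, String namespaceUri, List<String> ids) {
        if (element.hasAttributeNS(namespaceUri, "id")) {
            ids.add(element.getAttributeNS(namespaceUri, "id"));
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                collect((Element) child, namespaceUri, ids);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(idsInNamespace(SAMPLE, DB_NS)); // prints [1, 2, 3]
    }
}
```

The plain id attributes on para are never returned, because an unprefixed attribute is in no namespace at all.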
Now my poor old database application cannot tell the difference between the database version of 'id' on 'article', 'title' and 'paras' and the reference version of 'id' on 'para' - unless of course we process every single node that comes back from the '*/attribute(id)' query, or we query for each of the possible types. But with namespaces I can say 'give me every id value in the uri:mydatabasespec namespace'. Of course, you could argue that I should just name my attributes something else - 'db:id' perhaps - and then I can query them uniquely. But what if I want to use two DTDs, and someone else has used 'db:id'? I'm back to square one again. Whichever way you turn, if you want to assist applications built on an XML processor - and I believe 'assist' is all that is being said in the spec - namespaces are very, very handy. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From nikita.ogievetsky at csfb.com Tue Feb 2 23:19:41 1999 From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita) Date: Mon Jun 7 17:08:27 2004 Subject: Another errata? Message-ID: <9C998CDFE027D211B61300A0C9CF9AB44246E3@SNYC11309> Mark Birbeck wrote: > xmlns:bk='urn:loc.gov:books'> > > Cheaper by the Dozen > > > A list of loads of books > > >In this case, 'auto-joining' an attribute to its element's namespace >would make 'isbn' into a *global* attribute. Handy, if you wanted to >process all bk:isbn numbers in a document - but wrong!! 
In this document
>we do NOT have two instances of bk:isbn, we have one of bk:book:isbn and
>one of bk:catalogue:isbn.
..............................
>this is OK:
> y
>
>and this is not good practice at all (see Andrew Layman's comments on
>this, too):
> y
>

It seems that we should think about namespace scope here. Imagine that the schema designer was not creative and, instead of isbn, used id. And to identify authors he also used an id attribute, say ssn (being picky, one can invent a datatype for ssn: xxx-xx-xxxx):

Cheaper by the Dozen

Now how do we know whether the book's id belongs to the default namespace or to bk? Or, if a child inherits its namespace from its parents, should not the above be equal to:

Cheaper by the Dozen

It should, because I should be free to convert this XML to another form where title is an attribute.

equal to

so either both are valid or neither is.

Now, what if inside the book we want to use the default id (ssn) - for an editor? Or can't we? (Truly, I cannot imagine a reasonable situation in which to do so...) There is no prefix for it.

Looks like a mess... Any thoughts?

Nikita Ogievetsky
Cogitech Inc
http://www.cogx.com

From cbullard at hiwaay.net Tue Feb 2 23:32:39 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:08:27 2004
Subject: Since we're talking about databases...
References: <3.0.32.19990201164621.00a5e360@pop.intergate.bc.ca> <36B65517.8FAE4EC5@manhattanproject.com>
Message-ID: <36B78A5B.2FAF@hiwaay.net>

Take a look at the papers of Dr.
Michael Stonebraker (www.informix.com) on the design of relational databases with object support designed into the kernel which enable user-defined extensions, support for complex types, etc. His claim is that this approach is the new generation of db. He posits arguments and figures for the performance of this approach over middleware that might be interesting. I find myself wondering about how this approach would work if XML support were built into the kernel. Stonebraker posits it would be far superior for 3D. Since we have some experience with geodata and the performance problems of keeping this in the relational db (not the way to go), I wonder if architectural issues with XML systems that rely on middleware will be similar. len bullard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Feb 2 23:43:25 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:27 2004 Subject: Compound Documents - necessary for success? In-Reply-To: <199902022251.QAA09621@mailhost.onramp.net> References: <199902022251.QAA09621@mailhost.onramp.net> Message-ID: <14007.35658.443324.182548@localhost.localdomain> Chris von See writes: > Without a DTD or schema on which to "hang your hat", so to speak, > you're vesting the application with the knowledge of what the > various namespace-qualified constructs mean. This strikes me as a > Very Bad Thing, because it leaves individual applications to > interpret (potentially, interpret very differently) what a > particular attribute or element means. The situation that Chris describes exists with or without a DTD. 
Imagine that I have a DTD containing the following:

I know where <p> is allowed to appear, and I know what it is allowed to contain, but I know *nothing* more about what it means -- that information is still hard-coded in an application somewhere.

I am a big fan of DTDs and have even written a book on them, but I don't buy the 'discipline-of-writing-a-DTD-makes-you-think' model any more than I buy the 'discipline-of-learning-Latin-makes-you-think' model (and I enjoy reading Latin).

Namespaces actually help the problem a bit: they still do not tell me what an element means (and I will be stunned if the result of the XML Schema WG's work does that either), but at least they provide a global point of reference. I do not know if <p> in document A and <p>
in document B are meant to have anything in common, but I do know that (using James Clark's notation) <{http://www.megginson.com/ns/doc/}p> in document A and <{http://www.megginson.com/ns/doc/}p> in document B are meant to have something in common. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Tue Feb 2 23:44:01 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:27 2004 Subject: Another errata? Message-ID: John Cowan wrote: > Mark Birbeck wrote: > > How not quite? A namespace is a set of unique entries *by > definition*. > > That's what they are - a set of unique entries. You can't > have a 'set of > > unique entries' that contains a duplicate. > > That is not the definition of "namespace" used by REC-xml-names. > Clause 1 specifically denies that XML namespaces are sets. Clause 1 refers to the definition of 'XML namespace', which, as you rightly say, is not a set. But as I understand it an 'XML namespace' comprises numerous 'namespaces' (in the sense in which I used it) which each DO have unique entries, because they *are* sets. My understanding of the motivation for this is that if a single 'namespace' (i.e. set) was used then we would lose contextual information about a name. We therefore partition them in order to preserve this information. An 'XML namespace' is therefore a 'namespace container' that is not itself a 'namespace'. Clause 1 seems to put the emphasis on the presence of structure, rather than the lack of uniqueness. 
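The reading given here, an 'XML namespace' as a container of set-structured partitions, can be sketched as a map from partition label to a set of names. This is only an illustration of that reading, not anything defined by the spec: the class name and the partition labels are hypothetical, loosely following the partition names used in this thread.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model: one XML namespace holding several traditional (set-structured) namespaces.
class PartitionedNamespace {
    // partition label -> names in that partition; each partition is an ordinary set
    private final Map<String, Set<String>> partitions = new HashMap<>();

    /** Adds a name to a partition; returns false if that partition already contains it. */
    boolean add(String partition, String name) {
        return partitions.computeIfAbsent(partition, key -> new HashSet<>()).add(name);
    }

    public static void main(String[] args) {
        PartitionedNamespace ns = new PartitionedNamespace();
        // The same local name may appear once in each partition...
        System.out.println(ns.add("all-element-types", "a"));     // true
        System.out.println(ns.add("global-attributes", "a"));     // true
        System.out.println(ns.add("per-element-type:good", "a")); // true
        // ...but never twice within a single partition.
        System.out.println(ns.add("per-element-type:good", "a")); // false
    }
}
```

Uniqueness is enforced per partition, so the container as a whole may hold the same local name several times without any one set containing a duplicate.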
> > > An element can have the same name as a global > > > attribute without problem. > > > > True. But they are not in the same namespace. According to A.2 the > > element would be in the 'all element types' partition, and > the global > > attribute would be in the 'global attribute' partition. > > They are in separate partitions of the same namespace. No again - they are in separate partitions of the same 'XML namespace'. Each partition is itself a namespace. A.2 has it that: ".. we identify the names appearing in an XML namespace as belonging to one of several disjoint traditional (i.e. set-structured) namespaces ..." > Appendix A says that unprefixed attributes are assigned to one of the > per-element-type partitions of the namespace. It also says that > unprefixed attributes are assigned to "associated namespaces". Nothing wrong with that. Each element appears by name in the 'all elements type' partition. However, each element also has its own partition - a 'per-element-type' partition - into which are collected all the unqualified attributes for that element. This PET partition is a true 'namespace' (in the sense I used it) because from XML 1.0 we know that we cannot have duplicate attributes on an element. And that is why the spec refers to it as an 'associated namespace' - associated to the element. > Clause 5.3 is not involved and I shouldn't have dragged it in. I see. > But none of this matters much because Appendix A is not normative. I suppose so. But it does help to understand what it is that is making a particular element or attribute name unique with the entire document. I particularly think that the expanded attribute stuff is very useful. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Tue Feb 2 23:46:11 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:08:27 2004 Subject: Since we're talking about databases... In-Reply-To: <36B78A5B.2FAF@hiwaay.net> Message-ID: <000701be4f05$b9282800$5118a8c0@kuantech1.quokka.com> Please please please let this list not degenerate into a philosophical debate about the merits of various approaches to database design. I have been following the relational vs. object vs. object-relational debate for some ten years now, and the mere mention of Dr. Stonebraker's name still makes me shiver. :-) Jeff -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of len bullard Sent: Tuesday, February 02, 1999 3:30 PM To: xml-dev@ic.ac.uk Subject: Re: Since we're talking about databases... Take a look at the papers of Dr. Michael Stonebraker (www.informix.com) on the design of relational databases with object support designed into the kernel which enable user-defined extensions, support for complex types, etc. His claim is that this approach is the new generation of db. He posits arguments and figures for the performance of this approach over middleware that might be interesting. I find myself wondering about how this approach would work if XML support were built into the kernel. Stonebraker posits it would be far superior for 3D. Since we have some experience with geodata and the performance problems of keeping this in the relational db (not the way to go), I wonder if architectural issues with XML systems that rely on middleware will be similar. len bullard xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 3 00:04:49 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:27 2004 Subject: Compound Documents - necessary for success? Message-ID: Chris von See wrote: > Without a DTD or schema on which to "hang your hat", so to > speak, you're > vesting the application with the knowledge of what the various > namespace-qualified constructs mean. This strikes me as a > Very Bad Thing, > because it leaves individual applications to interpret (potentially, > interpret very differently) what a particular attribute or > element means. [and] > Generating a specific meaning and rules for usage for a > particular DTD or > schema isn't "automagic", but at least it forces people to > think about what > they're doing and lends some semblance of order. All a DTD can do is say something about the structure of a document. If someone wants to write an application that uses your CD collection data as a basis for their long-term investment plans on, you - and your DTD - can't stop them, other than by keeping the data to yourself. Mark Birbeck Managing Director Intra Extra Digital Ltd.
39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 3 00:42:17 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:27 2004 Subject: Stripping leading and trailing spaces??? In-Reply-To: <000001be4ee3$060e5d60$a39b56d1@danf> References: <000001be4ee3$060e5d60$a39b56d1@danf> Message-ID: <14007.39575.773121.658045@localhost.localdomain> Dan Finkelstein writes: > It seems that leading and trailing spaces are being removed from > PCDATA before it is returned to me. For example, > > Hello there! > comes back as "Hello there!" instead of " Hello there! ". I'm > using Sun's early release parser, and I'm wondering if this is > standard XML behavior? Is there a way to preserve the spaces? Which interface are you using? If you're using SAX, it could be that the spaces are being delivered in separate characters() events (parsers are allowed to chunk up character data any way they want). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jamesr at steptwo.com.au Wed Feb 3 00:59:49 1999 From: jamesr at steptwo.com.au (James Robertson) Date: Mon Jun 7 17:08:27 2004 Subject: Interesting Monday. In-Reply-To: <5F052F2A01FBD11184F00008C7A4A80001136ADF@eukbant101.ericsson.se> Message-ID: <4.1.19990203115244.00bc0d40@steptwo.com.au> At 19:11 2/02/1999, Matthew Sergeant (EML) wrote: | > I would personally recommend a third option: | > | > 3) Store in RDBMS now, process into XML, process this into HTML now. | > Process the XML into whatever you want in the future. | > | > | Nonononono. :) | | This generates probably 5% more overhead than I have already (the | RDBMS). XML doesn't parse quickly (well, OK, it parses quickly, but not | compared to reading data from an RDBMS). When you are processing tens of XML | files per second this becomes a huge problem. Well, I guess you have to balance elegance & expandability vs raw performance. Not an uncommon trade-off ... But, that being said ... Creating XML from an RDBMS is very quick, particularly when you do it using straightforward non-XML code. True, XML->HTML is not as quick as would be liked, but it again depends on the nature of the work. If your HTML needs a lot of complex cross-linking, tables of contents, navigation bars, etc, then doing this straight from the RDBMS can be a real bitch. Also, the speed of processing XML will depend on the tool. Have you considered using something like Omnimark, instead of DOM, etc?
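[A minimal sketch, not part of the original post, of what "creating XML from an RDBMS using straightforward non-XML code" might look like. It assumes Python with an in-memory SQLite database standing in for the RDBMS. The cd table, its columns and rows_to_xml() are hypothetical names chosen for illustration. No XML parser or DOM is involved; the only XML-specific step is escaping the text.]

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr


def rows_to_xml(conn):
    """Serialize the hypothetical 'cd' table as XML with plain string code."""
    parts = ["<collection>"]
    query = "SELECT id, artist, title FROM cd ORDER BY id"
    for cd_id, artist, title in conn.execute(query):
        # quoteattr() adds the surrounding quotes; escape() handles &, < and >.
        parts.append("  <cd id=%s>" % quoteattr(str(cd_id)))
        parts.append("    <artist>%s</artist>" % escape(artist))
        parts.append("    <title>%s</title>" % escape(title))
        parts.append("  </cd>")
    parts.append("</collection>")
    return "\n".join(parts)


# Stand-in database with one made-up row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cd (id INTEGER PRIMARY KEY, artist TEXT, title TEXT)")
conn.execute("INSERT INTO cd (artist, title) VALUES (?, ?)",
             ("Simon & Garfunkel", "Bookends"))
print(rows_to_xml(conn))
```

The cost on this side is one pass over the result set plus string concatenation, which is why generation tends to be cheap. The parsing cost discussed in this thread only appears when something downstream reads the XML back in.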
Just some more food for thought, J ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Wed Feb 3 02:59:56 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:08:27 2004 Subject: Namespaces Message-ID: <014801be4f21$17207230$2ee044c6@arcot-main> >James has done an outstanding job of explaining what namespaces are and are >not. I especially recommend his document because it accurately stops short >of implying that there is more to the namespaces specification than is >there. I agree wholeheartedly. I would like to recommend that we use James' document as the start of the "XML Namespaces FAQ" and build on it. The question is not "can we understand it?" but "can my customers understand it?". Cheers, Don Park Docuverse xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Wed Feb 3 04:54:59 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:27 2004 Subject: Interesting Monday. References: <5F052F2A01FBD11184F00008C7A4A80001136ADF@eukbant101.ericsson.se> Message-ID: <36B7D4B1.4ADAB18F@prescod.net> "Matthew Sergeant (EML)" wrote: > > Nonononono.
:) > > This generates probably 5% more overhead than I have already (the > RDBMS). XML doesn't parse quickly (well, OK, it parses quickly, but not > compared to reading data from an RDBMS). When you are processing tens of XML > files per second this becomes a huge problem. You could build a parse tree in a "servlet" and hand it right to an XSL stylesheet. The benefit of this is that you have the modularization of having style application be independent of your database structure but you don't actually create a file, start another process, parse the file and continue. Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Wed Feb 3 04:57:03 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:27 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: Message-ID: <36B7D362.1FF821D4@prescod.net> Mark Birbeck wrote: > > if that was a better model of the internal data. Maybe I've missed the > subtlety of what you are saying the problem is, but in our system the > attributes of an object are exported as described above, and the > children of an object are exported as elements within other elements. > Seems to me to mirror exactly our object structure - and so far we have > been able to re-interpret DTDs back as data definitions. In other words, > we *can* generalise the solution. If I understand correctly: * what you are saying is that you can define (have defined!) 
a convention such that you achieve lossless serialization of XML data through XML. What Eliot is saying is: * XML isn't doing the "heavy lifting" here, your *convention* is doing the heavy lifting. If someone handed you data that meant the same thing but did not use your convention you would be as lost as if the data were EBCDIC ISAM files. XML is doing the "light lifting" of allowing you to avoid writing a parser for EBCDIC ISAM files (or, more likely, LISP S-expressions). YEAH FOR XML! Helping with light lifting is important. Saving money is important. It just doesn't solve the hard problems. Non-ambiguous serialization has not been a hard problem since the invention of S-expressions. > Sure. But I still have two issues. First, why would you query the > serialisation anyway? Wouldn't you want to query your original database > and generate XML pages that reflect the results? You certainly would if you use XML as *only* a serialization. The thrust of this thread was that some people want to encode everything in XML so that they can "query it." But XML is a lousy query representation for anything other than human-authored documents. (and debatably the best thing for queries against those!) > You keep talking of the 'abstract' representation > of your data, but actually you are *losing* the abstraction, moving > from: > > a person who has the name Eliot > > to > > an object which contains another object which has two > properties, one set to name and the other set to Eliot > > Of course both are abstractions, but they model completely different > things (data and people). I think that this is Eliot's point. Consider someone who says that if you put a "DOM interface" on all objects everywhere in the system then they all become managable because they have a "single API." Smart (but usually inexperienced) people say this. These people are talking about an API to the serialization structure instead of an API to their original data. 
They've gained some uniformity but lost some abstraction. That is very seldom a useful trade-off. To make their processing useful again their very next step will be to add in some abstraction on top of the DOM. Then they're back to where they started. This is one of the most pervasive misunderstandings in XML-world. > And modelling the data rather than the person > means you can no longer interchange your XML with other systems because > you have two completely different sets of data, using different DTDs. I don't follow that. > (And you can't say that your serialisation schema *will* allow this > interchange, because although your serialised data may be well-formed, > the underlying data it represents may not be, so you need the proper DTD > for the object.) Well-formedness has very little to do with DTDs so I don't follow this either. > All I am saying is that the document *itself* could be the abstraction > of the data. This is something else I don't follow. XML documents are always encodings of abstractions. They are concrete, tangible, interchangable, printable and can be given global names. Concrete, not abstract. The objects they represent are logical, usually inaccessible outside of an "address space" (i.e. your brain, your relational database) and are thus termed abstract. The reason we need XSL is because the abstractions cannot "stand alone". I can't transmit a book from my head to your head. I need to serialize it on paper or online. I also can't transmit a "book object" without serializing it somehow (i.e. XML). Before serialization it is an abstraction. Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Wed Feb 3 05:38:22 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:27 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <36B7D362.1FF821D4@prescod.net> Message-ID: <000e01be4f36$c7b29370$d3228018@jabr.ne.mediaone.net> Paul Prescod wrote: ... > > You certainly would if you use XML as *only* a serialization. The thrust > of this thread was that some people want to encode everything in XML so > that they can "query it." But XML is a lousy query representation for > anything other than human-authored documents. (and debatably the best > thing for queries against those!) > and later... > > I think that this is Eliot's point. Consider someone who says that if you > put a "DOM interface" on all objects everywhere in the system then they > all become managable because they have a "single API." Smart (but usually > inexperienced) people say this. These people are talking about an API to > the serialization structure instead of an API to their original data. > They've gained some uniformity but lost some abstraction. That is very > seldom a useful trade-off. To make their processing useful again their > very next step will be to add in some abstraction on top of the DOM. Then > they're back to where they started. > > This is one of the most pervasive misunderstandings in XML-world. You have missed the point here. If I put a DOM interface onto a SQL Server or Oracle or ODI or Poet database, I am hardly using an API to the serialization structure. When people say this, they mean that the DOM API/interface is used against the native datastore. 
The utility of this would demonstrate itself in a distributed environment where something like XQL was used as a query language. If we are in the relational db world, ODBC/SQL 92 provides an interface onto disparate databases. Not all information is stored on relational dbs. The DOM interface aims to provide the same database and vendor neutrality and interoperability that ODBC or JDBC provides for tabular data. If I am using a DOM interface, it frankly doesn't matter what the serialization format is, I am interacting directly with data through an interface. I wouldn't suggest that the DOM replace ODBC, yet I'm quite sure that those experienced using a variety of systems with disparate data types and data usages will appreciate that certain types of data are best expressed in tree format. Such data scenarios might best be interfaced with via the DOM. XSL transforms can be applied directly to DOM representations, rather than serialized XML documents. This yields the possibility that serial transforms be applied within 'DOM space' (assuming the XSL transform output is a DOM structure rather than a serialized string). The act, thus, of web page generation from a database can be automated via XSL rather than, say ASP or perl scripts. Is this useful? Sometimes it is. Are the DOM interfaces the best for all situations, clearly not. However if a significant percentage of people can agree to use them a significant percentage of the time, this is a big win. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Wed Feb 3 08:58:12 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:08:28 2004 Subject: Fw: Namespaces Message-ID: <008e01be4f52$a23b2410$5402a8c0@oren.capella.co.il> Don Park wrote: >I would like to recommend that we use James' document as the start of the >"XML Namespaces FAQ" and build on it. The question is not "can we >understand it?" but "can my customers understand it?". Excellent notion. While we are at it, here's a question which seems to be at the heart of the issue and which, alas, James didn't address: global attributes. Andrew Layman has made a pretty good case that they simply don't exist. However, most of the wrangling about namespaces seems to be about how they are handled. If Andrew is wrong, I'd like to know why. If he's right, what is the debate about? I note that this contradicts a sample which was not challenged by anyone: <x xmlns:n1="http://www.w3.org" xmlns="http://www.w3.org" > <good a="1" b="2" /> <good a="1" n1:a="2" /> </x> would have the equivalent (extended) Clark encoding (http://www.jclark.com/xml/xmlns.htm) <{http://www.w3.org}x xmlns:n1="http://www.w3.org" xmlns="http://www.w3.org" > <{http://www.w3.org}good {{http://www.w3.org}good}a="1" {{http://www.w3.org}good}b="2" /> <{http://www.w3.org}good {{http://www.w3.org}good}a="1" {http://www.w3.org}a="2" > From reading Andrew's post I presumed it would be converted to: <{http://www.w3.org}x xmlns:n1="http://www.w3.org" xmlns="http://www.w3.org" > <{http://www.w3.org}good {{http://www.w3.org}good}a="1" {{http://www.w3.org}good}b="2" /> <{http://www.w3.org}good {{http://www.w3.org}good}a="1" {{{http://www.w3.org}good}http://www.w3.org}a="2" > On what is the previous interpretation based?
If the second one is correct, wouldn't it solve all the issues? Take Mark's example of multiple use of 'id'. It would become immediately clear that the XSL pattern: */attribute(id) Would match: */attribute({}id) While the pattern: */attribute(db:id) Would match */attribute({{} Jerome McDonough wrote: > *Don't* think that you can do this with > Microsoft Access (not sure about other Microsoft products). Having spent > a month trying to persuade both Intersolv's and Microsoft's JDBC/ODBC bridges > to not blow up in my face after three queries into Access, I think I can > vouch for that not being a stable solution. The problem might be the bridge, not the Access driver or Access. I've been using Sun's JDBC/ODBC bridge with Microsoft's Access driver for quite a while without any problems. (Sun JDK 1.1.7, Windows NT 4.0 (service pack 3), MS Access driver 3.50.360200). -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Wed Feb 3 09:41:56 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:28 2004 Subject: Interesting Monday. Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AE8@eukbant101.ericsson.se> > -----Original Message----- > From: Paul Prescod [SMTP:paul@prescod.net] > > "Matthew Sergeant (EML)" wrote: > > > > Nonononono. :) > > > > This generates probably 5% more overhead than I have already > (the > > RDBMS). XML doesn't parse quickly (well, OK, it parses quickly, but not > > compared to reading data from an RDBMS). When you are processing tens of > XML > > files per second this becomes a huge problem. 
> > You could build a parse tree in a "servlet" and hand it right to an XSL > We're looking at several possibilities now to increase performance. I'm not terribly worried about my particular application. Although it will be a parse tree stored in mod_perl, not a servlet . The overall issue is now a concern for XML e-commerce. You can't cache or store parse trees for XML data that isn't static in any way, and that you only use once and then throw away. Matt. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Wed Feb 3 09:48:40 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:28 2004 Subject: Interesting Monday. Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AE9@eukbant101.ericsson.se> > -----Original Message----- > From: James Robertson [SMTP:jamesr@steptwo.com.au] > > At 19:11 2/02/1999 , Matthew Sergeant (EML) wrote: > > | This generates probably 5% more overhead than I have already (the > | RDBMS). XML doesn't parse quickly (well, OK, it parses quickly, but > not > | compared to reading data from an RDBMS). When you are processing tens > of XML > | files per second this becomes a huge problem. > > Well, I guess you have to balance elegance & expandability vs raw > performance. Not an uncommon trade-off ... > Indeed. > But, that being said ... > > Creating XML from an RDBMS is very quick, particularly when you > do it using straightforward non-XML code. > > True, XML->HTML is not as quick as would be liked, but it > again depends on the nature of the work. 
If your HTML needs > a lot of complex cross-linking, tables of contents, navigation > bars, etc, then doing this straight from the RDBMS can be > a real bitch. > Actually, the thing I really like about the XML solution - and I think others will agree - is that it's a whole lot easier to update your XML structure than it is to update your database. Changing code that accesses a database is a lot harder than changing code that formats XML. > Also, the speed of processing XML will depend on the tool. > Have you considered using something like Omnimark, instead > of DOM, etc? > We don't use DOM. We use expat (using perl's XML::Parser and my CGI::XMLForm module). Expat is very quick. E.g. we can raw-parse (doing nothing in the parse phase) 100 of our files in 0.25 seconds, but it's 30 seconds to do the same with XML::DOM (another perl module - although that module has large overheads because it's a pure-perl solution). Of course that's not a very fair benchmark, because it's a lot easier to manipulate DOM than it is using expat, but the performance/flexibility tradeoff is worth it. Matt. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Wed Feb 3 09:57:32 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:28 2004 Subject: Interesting Monday. References: <5F052F2A01FBD11184F00008C7A4A80001136AE8@eukbant101.ericsson.se> Message-ID: <36B81B22.865CE9D1@prescod.net> "Matthew Sergeant (EML)" wrote: > > The overall issue is > now a concern for XML e-commerce.
You can't cache or store parse trees for > XML data that isn't static in any way, and that you only use once and then > throw away. Well, e-commerce is actually engaged in interchange so I grudgingly admit that XML is probably useful. ( :) ) Consider, however, that an optimized subset of XML with an optimized parser and optimized API would go faster. And if that fails, well, we can always go back to the last universal format: S-expressions! -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Wed Feb 3 10:04:05 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:28 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: <000e01be4f36$c7b29370$d3228018@jabr.ne.mediaone.net> Message-ID: <36B819D3.B65C07F3@prescod.net> "Borden, Jonathan" wrote: > > You have missed the point here. If I put a DOM interface onto a SQL Server > or Oracle or ODI or Poet database, I am hardly using an API to the > serialization structure. When people say this, they mean that the DOM > API/interface is used against the native datastore. The utility of this > would demonstrate itself in a distributed environment where something like > XQL was used as a query language. If we are in the relational db world, > ODBC/SQL 92 provides an interface onto disparate databases. Not all > information is stored on relational dbs. 
The DOM interface aims to provide > the same database and vendor neutrality and interoperability that ODBC or > JDBC provides for tabular data. If I am using a DOM interface, it frankly > doesn't matter what the serialization format is, I am interacting directly > with data through an interface. You are interacting with data through an interface that was designed to provide access to the abstract data model of a *serialization*. In other words you are treating your database as if it were the result of parsing an XML document. You've put an "elements and attributes" interface on data that is much more complex than elements and attributes. If it were not more complex than elements and attributes we would not need stuff like XLink, HyTime, Namespaces and RDF to even *attempt* (and fail) to represent it. The XML data model, whether a grove or a DOM, is the "Forrest Gump" of representations for your data. Sending a dumbed-down message by Forrest Gump is good: he will relate it faithfully. Installing him as the only conduit for information is bad. You'll have to dumb down too much information and spend too much energy re-assembling it on the other side. > I wouldn't suggest that the DOM replace ODBC, yet I'm quite sure that those > experienced using a variety of systems with disparate data types and data > usages will appreciate that certain types of data are best expressed in tree > format. Such data scenarios might best be interfaced with via the DOM. You just need an API for "tree formats". Just ask your DBMS vendor to provide some tree-structured API. It doesn't matter if that API is the DOM because making it the DOM does *not buy you anything* as a programmer. From a programming point of view there is no benefit to working with a consistent API where everything is dumbed-down to a textual model. You might as well dumb everything down to an "object model."
(see below) If you buy this, then guess what the hype will be in three years: "These new fangled data bases have this really cool feature, dude. You call it with a SQL9X query and it can return like OBJECTS!. Everything in the world can be expressed as objects! Lists of objects. Lists of objects. Trees of objects. Directed graphs of objects. Arbitrary graphs of objects. It like unifies everything as objects. It's Zen, man. They call it 'JDBC' and its totally wicked." The *only benefit* of unifying things as DOMs is reusing software that was originally supposed to work with XML (i.e. XSL implementations). If you are writing new software it makes NO SENSE to do it through a DOM interface unless your data source is *XML*. Otherwise, you should just define a "tree node" interface and have your various objects implement it. You will get all of the benefits of the DOM with none of the costs (i.e. how the hell do you represent complex properties of objects???). If you want some good hints about what a "tree node" interface looks like, take a look at the grove abstraction. > XSL transforms can be applied directly to DOM representations, rather than > serialized XML documents. This yields the possibility that serial transforms > be applied within 'DOM space' (assuming the XSL transform output is a DOM > structure rather than a serialized string). The act, thus, of web page > generation from a database can be automated via XSL rather than, say ASP or > perl scripts. Is this useful? Sometimes it is. First, I don't believe that publishing databases to the Web is considered a hard problem. Report writers, CGIs and application servers have been doing it for a long time. So if all you are claiming is that the DOM provides us with a small ease-of-use gain in solving an already solved problem, then I can buy that. But I hear much grander claims for these DOM interfaces. Nobody is saying: "solve simple problems slightly more elegantly."
They are rather saying: "unify your enterprise."

Second, even *XSL* is not best served by a DOM representation. James Clark wrote an xsl-list article about that but I can't find it now. Remember that the DOM was invented as an extension of "DHTML." It's only half "there."

But if I grant that some well-thought-through API for XSL trees could exist (i.e. Jade's grove API) then I would propose that it only be used as an optimization in a system where it would otherwise make sense to pass around serializations of text documents. i.e. the DOM is okay for skipping a layer of message passing. It is not okay as a "universal API" for "all of the data in an organization."

To bastardize JWZ: "Sometimes people have a hard data unification problem. One part of their organization speaks a very different language (at the data model and object model level) than another part. They might think 'I can unify these with XML or the DOM.' Now they have two problems."

Their second problem is that they didn't understand the really hard problem in their organization. Data model unification is *easy* (cast to java.lang.object or w3c.dom.node). Data model *rationalization* is very difficult. And I don't think that there are many shortcuts.

> Are the DOM interfaces the best for all situations, clearly not. However if
> a significant percentage of people can agree to use them a significant
> percentage of the time, this is a big win.

That's not going to happen. The DOM will NOT be a core tool for that majority of OO programmers this year, next year, or ever. Programmers will try it and increasingly find that if they are not doing XSL styling for the Web or print that the DOM is not a core tool. "Old-fashioned" OO can provide the same benefits.

Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Wed Feb 3 10:08:29 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:08:28 2004 Subject: Another errata? Message-ID: <01BE4F64.A5B44010@grappa.ito.tu-darmstadt.de> Mark Birbeck wrote: > How not quite? A namespace is a set of unique entries *by definition*. > That's what they are - a set of unique entries. You can't have a 'set of > unique entries' that contains a duplicate. > > > An element can have the same name as a global > > attribute without problem. > > True. But they are not in the same namespace. According to A.2 the > element would be in the 'all element types' partition, and the global > attribute would be in the 'global attribute' partition. They *are* in the same "namespace", as the term is defined by the namespaces spec. The spec is, at times, quite forward about this -- for example, section A.1. At other times, you need to read very closely to determine whether the XML or traditional meaning of "namespace" is meant -- for example, in the paragraph describing per-element-type partitions, the first and second uses of the word "namespace" mean "traditional namespace"; the third usage means "XML namespace". That this is confusing is evident from the above discussion -- John means XML namespace and Mark means traditional namespace. > It is not the job of standards > developers to make sure we understand everything they write. ... Huh? It most certainly *is* the job of the standards developers to make sure we understand what they write. 
What is the point of a standard if nobody can understand it? Even more to the point, if what standards writers write is routinely interpreted to mean many different things by many different people, then I think the standards writers have failed their job. The SQL specs may make for abominable reading, but they are generally interpreted by everybody the same way. The XML spec is not the clearest piece of technical writing ever to come down the pipe, but after reading, and re-reading, and re-reading it, most people interpret most parts of it in the same way. In contrast, the namespaces spec *is* widely misinterpreted, and by people who, judging by their posts to this list, are intelligent and more than willing to read, re-read, and re-re-read specs. To me, that says there is something wrong, and I think a good example of this is the fact that the spec repeatedly leads the reader to believe that unprefixed attributes belong to the namespace of the element. I think a mistake made in writing many specifications is to rely on excessively formal language and write down only the rules, not the motivation. In my mind, the point of a specification is not to write rules, but to get everybody to agree to the same rules. (These are not quite the same thing -- think of the difference between the clue and the answer in a crossword puzzle. If you have a clue that immediately leads everybody to the same answer, then it is as useful as the answer, even though it is not the same.) Thus, anything that will get people to come to the same conclusion (stating the rules clearly, stating the motivation for those rules, giving examples, linking to video presentation of pet hamsters pantomiming the rules, etc.) is fair game. 
Finally, if you are driving a technology through standards (as opposed to the other way around, which is more common), then, whether you like it or not, those standards necessarily play a role in marketing that technology, and the more accessible those standards are, the more likely the technology will succeed.

-- Ron Bourret

From jjc at jclark.com Wed Feb 3 10:52:42 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:08:28 2004
Subject: Fw: Namespaces
References: <008e01be4f52$a23b2410$5402a8c0@oren.capella.co.il>
Message-ID: <36B8229D.A5FFD27B@jclark.com>

Oren Ben-Kiki wrote:
> xmlns="http://www.w3.org" >
>
> would have the equivalent (extended) Clark encoding
> (http://www.jclark.com/xml/xmlns.htm)
>
> <{http://www.w3.org}x xmlns:n1="http://www.w3.org"
> xmlns="http://www.w3.org" >
> <{http://www.w3.org}good {{http://wwww.w3.org}good}a="1"
> {{http://wwww.w3.org}good}b="2" />
> <{http://www.w3.org}good {{http://wwww.w3.org}good}a="1"
> {http://www.w3.org}a="2" >
>

It would have the (non-extended) Clark encoding of:

<{http://www.w3.org}x>
<{http://www.w3.org}good a="1" b="2" />
<{http://www.w3.org}good a="1" {http://www.w3.org}a="2" />
</{http://www.w3.org}x>

Putting {{http://wwww.w3.org}good} in front of the unprefixed attribute names needlessly confuses things. When I write in XML

<good a="1"/>

what is the relationship between the attribute name "a" and the element type name "good"? It's hard to describe but they're clearly not unrelated: roughly speaking knowing what the attribute name "a" means depends on knowing what the element type name "good" means.
But we don't need to fight this one out: whatever this relationship is, when I write <{http://www.w3.org}good a="1"/> this relationship is exactly the same relationship as holds between the attribute name "a" and the element type name "{http://www.w3.org}good". XML namespaces aren't changing anything here. That's why <{http://www.w3.org}good a="1"/> not <{http://www.w3.org}good {{http://www.w3.org}good}a="1"/> is the right way to describe what's going on. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Wed Feb 3 11:04:03 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:08:28 2004 Subject: Fw: Fw: Namespaces Message-ID: <00b901be4f64$338cec30$5402a8c0@oren.capella.co.il> James Clark wrote: > > >what is the relationship between the attribute name "a" and the element >type name "good"? It's hard to describe but they're clearly not >unrelated: roughly speaking knowing what the attribute name "a" means >depends on knowing what the element type name "good" means. But we >don't need to fight this one out: whatever this relationship is, when I >write > ><{http://www.w3.org}good a="1"/> > >this relationship is exactly the same relationship as holds between the >attribute name "a" and the element type name "{http://www.w3.org}good". >XML namespaces aren't changing anything here. So far, so good. 
But the question is: is it the same relationship between "{http://www.w3.org}a" and "good" or "{http://www.w3.org}good" in:

<good {http://www.w3.org}a="1"/>
<{http://www.w3.org}good {http://www.w3.org}a="1"/>

If it is the same relationship (I suspect it is), there's no such thing as global attributes, and a lot of bandwidth has been wasted debating them - or I misunderstood the debate :-)

Share & Enjoy,

Oren Ben-Kiki

From Michael.Kay at icl.com Wed Feb 3 11:09:45 1999
From: Michael.Kay at icl.com (Michael.Kay@icl.com)
Date: Mon Jun 7 17:08:28 2004
Subject: object databases
Message-ID: <93CB64052F94D211BC5D0010A80013310EB2D1@WWMESS3>

Paul Prescod:
> You didn't really answer my question. If Oracle 8i provides the same
> support for "graphs of data with links and annotations" that
> ObjectDesign then in what sense is it NOT an object database. I'm still
> asking for a definition.

There are a number of definitions of an "object database", but a minimal definition requires it to be a container of "objects" - which include behaviour (interfaces and methods) as well as structure. Plus a type hierarchy, plus persistent identifiers... Rather more than just graph structures.

The main area where the "pure object" db vendors differ from the "extended relational" vendors is in insisting that the concept of object database also implies persistent programming, i.e. access to persistent objects using the same syntax as transient variables.
Also they typically have a completely different internal architecture which optimises the performance of navigation and retrieval of composite objects at the expense of ad-hoc query.

Mike Kay

From peter at homepla.net Wed Feb 3 11:17:09 1999
From: peter at homepla.net (Peter Finch)
Date: Mon Jun 7 17:08:28 2004
Subject: Namespace clashes?
Message-ID: <36B82FBD.AC009CDD@homepla.net>

> I have been disturbed by the amount of confusion surrounding the XML
> Namespaces Recommendation. So I have written a document

This was really good but I have some questions that have been nagging me for ages (sorry if they are dumb questions).

Imagine A, B, C, D and E are all users of some XML data. A and B exchange data and agree to use A's namespace and that "title" is a book title (among other things).

Elsewhere, C and D also exchange data and agree to use D's namespace and that "title" is a book title.

1. E comes along and wants to create data and exchange data with A, B, C and D. What does E use? If E creates a new NS "title" then A, B, C and D have to update their procedures to cater for this?

2. Is there going to be (or is there) a registry for URI and tags so that this can be avoided?

3. Is there some way to "alias" names so that you can say that "an:t", "dn:t" and "en:t" are all the same (or is this up to the application)?

4. How do you validate a document that may contain tags and attributes that are a mixture of different DTDs? It's easy when they have the same content model but what if "an:t" and "dn:t" are containers and have completely different content models?
Cheers, xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jayadeva at lgsi.co.in Wed Feb 3 11:25:23 1999 From: jayadeva at lgsi.co.in (Jayadeva Babu Gali) Date: Mon Jun 7 17:08:28 2004 Subject: unsubscribe Message-ID: <36B83229.8AACC184@lgsi.co.in> unsubscribe jayadeva@lgsi.co.in xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jayadeva at lgsi.co.in Wed Feb 3 11:27:47 1999 From: jayadeva at lgsi.co.in (Jayadeva Babu Gali) Date: Mon Jun 7 17:08:29 2004 Subject: unsubscribe Message-ID: <36B8324A.ED2B039C@lgsi.co.in> unsubscribe jayadeva@lgsi.co.in xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Wed Feb 3 11:54:36 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:08:29 2004 Subject: Namespace clashes? Message-ID: <01BE4F73.6064BA80@grappa.ito.tu-darmstadt.de> > Imagine A, B, C, D and E and all users of some XML data. 
A and B exchange > data and agree to use A's namespace and that "title" is a > book title (among other things). > > Elsewhere, C and D also exchange data and agree to use D's namespace and > that "title" is a book title. > > 1. E comes along and wants to create data and exchange data with A, B, C > and D. What does E use? If E creates a new NS "title" then > A, B, C and D have to update there procedures to cater for this? E should use to exchange data with A and B and to exchange data with C and D. If E creates their own DTD, then everybody who wants to exchange data with E needs to update their software, which is unlikely to make E very popular. A much better solution is to get everybody (A-E) to agree on a single set of tags in the first place. > 2. Is there going to be (or is there) a registry for URI and tags so that > this can be avoided? There are some DTD (tag) registries already -- for example, see www.schema.net. > 3. Is there some way to "alias" names so that you can say that "an:t", > "dn:t" and "en:t" are all the same (or is this up to the application)? I believe that architectural forms allow you to do this, but I'm not sure. > 4. How do you validate a document that may contain tag's and attributes that > are a mixture of different DTD's? It's easy when they have the same content > model but what if "an:t" and "dn:t" are containers and have completely > different content models? With current parsers, you need to make sure that the same prefixes are used in both the DTD and the document. That "an:t" and "dn:t" have different content models is not a problem for validation -- they are different elements and have different names. Differentiating between two different elements with the same name (whether they represent the same real-world information or not) is exactly the problem that namespaces were designed to solve. Note that an application needs to treat "an:t" and "dn:t" differently, just as it would treat "an:t" and "an:xxx" differently. 
The fact that both represent a title is the application's problem and must be solved in the application. As mentioned above, a simple (but not always easy) way to do this is to get A-E to agree on a single DTD. In general, namespaces will more likely be used to differentiate between elements from two different DTDs that cover different subject areas but use the same names, such as when means book title in one DTD and job title in another DTD. This becomes a problem when somebody creates a new DTD that reuses both of these DTDs. -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Wed Feb 3 11:58:50 1999 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 17:08:29 2004 Subject: Reminder: XML Research Dissemination Workshop 26 February in Edinburgh Message-ID: <1548.199902031156@mcilvanney.cogsci.ed.ac.uk> Research Dissemination Workshop: Markup Technologies for Computational Linguistics Hosted by the HCRC Language Technology Group University of Edinburgh 25, 26 February 1999 The Language Technology Group, with support from EPSRC, ESRC and other sources, has invested substantial effort over the last four years in building up an inventory of tools and technologies for the markup of language data. This in turn has led to the articulation of a markup-based architecture for NLP systems, which we have used for applications as diverse as discourse relation annotation, named entity recognition and tokenisation. 
The goal of this workshop is to introduce our work to a wider audience, with * A tutorial on XML, the W3C standard based on SGML which is at the heart of our work [No spaces left]; * Details of our use of markup technologies; * Comparison and contrast with similar technologies developed elsewhere. The keynote on Thursday evening will be given by Michael Sperberg-McQueen, and promises to be both entertaining and informative. See http://www.ltg.ed.ac.uk/nscope/ for details, including a pointer to the online registration form. -- Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440 Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk URL: http://www.ltg.ed.ac.uk/~ht/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From heretic at ihug.co.nz Wed Feb 3 12:43:18 1999 From: heretic at ihug.co.nz (David Mohring) Date: Mon Jun 7 17:08:29 2004 Subject: Compound Documents as directories stored in zip files Message-ID: <36B84467.DEB5215@ihug.co.nz> As Roger Costello defined compound documents in http://www.lists.ic.ac.uk/hypermail/xml-dev/9901/0754.html >compound-document ::= (compound-document | valid-document) >valid-document ::= <a document that conforms to a schema> >In words, a compound document is a "document of documents", where each >document conforms to a schema; i.e, a nested document conforms to a schema >as well as does its parent document. I will use the term composition and >compound document interchangeably. This limits the compound document to only being valid xml. 
In the real world most word processing compound documents also contain image files as well as other foreign data formats. So what about giving compound document a wider definition?

A "Compound Document" is a set of Documents and Objects that can refer and link to each other.

The compound document can then contain any type of data format: valid XML files, DTDs, schemas, image files etc. If you could 'unpack/unzip' a compound document you would produce a directory and files, just as in a normal file system, and a root document that can define the view of the document as a whole: index.xml and/or index.html. You can then relatively reference XLink/XPointer documents as easily as you would a directory of HTML files.

So why not just store a compound document in zip archive file format but with another affix, just like Java jar 'files'? It is easy to 'peer into' and 'grab' the content of a zip file; Java classes and C libraries that can do this already exist. So why not just add this functionality to all XML applications, formatters, browsers etc.?

From Mark.Birbeck at iedigital.net Wed Feb 3 13:01:43 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:29 2004
Subject: Another errata?
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054943@SOHOS002>

Ronald Bourret wrote:
> Mark Birbeck wrote:
> They *are* in the same "namespace", as the term is defined by the
> namespaces spec.

Where is the term defined?
"XML namespace" is defined, and from then on the word 'namespace' is used very loosely, but I see no 're-definition' of the term namespace away from its traditional meaning. I DO see an explanation of how the concept of an 'XML namespace' differs from a traditional 'namespace'.

> That this is confusing is evident from the above discussion -- John means
> XML namespace and Mark means traditional namespace.

I'm not confused thanks - I mean both. I am drawing a distinction. You need both to understand how the namespaces spec implements 'traditional namespaces' in a manner useful to XML. In short, one traditional namespace would not be enough for XML because you lose structure. It therefore needs a lot of them, and this is achieved by having effectively a 'namespace container' full of other (real) namespaces. That namespace container is an 'XML namespace' - which for brevity, we call a 'namespace'. (Hey, I'm not disagreeing with you that it's tricky!!)

> > It is not the job of standards
> > developers to make sure we understand everything they write. ...
>
> <soapbox>
> Huh? It most certainly *is* the job of the standards developers to make
> sure we understand what they write.

I suppose we're onto matters of opinion now, but a standard must be unambiguous in its *formal* interpretation. I may read it and misunderstand, but when I get down to the nitty-gritty of implementation, provided that I eventually *do* understand it, I should be able to do this unambiguously. If you describe 'intent' then what if you do not cover a usage that later arises? You introduce ambiguity.

> In contrast, the namespaces spec *is* widely misinterpreted, and by people
> who, judging by their posts to this list, are intelligent and more than
> willing to read, re-read, and re-re-read specs.
To me, that > says there is > something wrong, and I think a good example of this is the > fact that the > spec repeatedly leads the reader to believe that unprefixed > attributes > belong to the namespace of the element. The spec does not! The reader's mistaken assumptions about what the namespaces spec is trying to achieve leads them to read this into it. But if we're really honest here, if we were in a discussion group on compiler technology ten years ago, you would not have such a wide range of people discussing these issues, and narrower range of misinterpretation (I'm not saying everyone understood everything then either). That's not to say we shouldn't have more people involved, but I disagree with this 'dumbing down' attitude that seems to exist, where we must ensure that everyone can understand. If you want to write a book making it clearer then do it - we'd all probably be grateful - but the spec itself MUST be a formal document. > I think a mistake made in writing many specifications is to rely on > excessively formal language and write down only the rules, not the > motivation. In my mind, the point of a specification is not to write > rules, but to get everybody to agree to the same rules. No! The job of the discussion *about* a spec is to find agreement. Once that has been arrived at you need to codify that in a way that is unambiguous. It needs to be as formal as possible! > Finally, if you are driving a technology through standards > (as opposed to > the other way around, which is more common), then, whether > you like it or > not, those standards necessarily play a role in marketing > that technology, > and the more accessible those standards are, the more likely > the technology > will succeed. I don't think that is the job of standards. 
Others can write books on it, produce training courses, and as you say, get hamsters involved in video production (although I believe that's illegal in some States) but the standard itself must be as terse and precise as possible. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Wed Feb 3 13:28:05 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:08:29 2004 Subject: Namespaces References: <00b901be4f64$338cec30$5402a8c0@oren.capella.co.il> Message-ID: <36B84B7C.30BE274A@jclark.com> Oren Ben-Kiki wrote: > > James Clark <jjc@jclark.com> wrote: > > <good a="1"/> > > > >what is the relationship between the attribute name "a" and the element > >type name "good"? It's hard to describe but they're clearly not > >unrelated: roughly speaking knowing what the attribute name "a" means > >depends on knowing what the element type name "good" means. But we > >don't need to fight this one out: whatever this relationship is, when I > >write > > > ><{http://www.w3.org}good a="1"/> > > > >this relationship is exactly the same relationship as holds between the > >attribute name "a" and the element type name "{http://www.w3.org}good". > >XML namespaces aren't changing anything here. > > So far, so good. 
But the question is: is it the same relationship between > "{http://www.w3.org}a" > and "good" or "{http://www.w3.org}good" in: > > <good {http://www.w3.org}a="1"/> > <{http://www.w3.org}good {http://www.w3.org}a="1"/> I would prefer not to answer this since I don't think the XML Namespaces Recommendation needs to take a position on this. All the Namespaces Recommendation does is provide a mechanism which allows element type names and attribute names to be qualified with a URI. How other applications or specifications (such as RDF or XML Schemas) choose to exploit this mechanism is up to them. However, if I was forced to answer, I would say that the relationship was not the same. An element <{http://www.w3.org}good a="1"/> doesn't have a well-defined meaning unless the specification of the {http://www.w3.org}good element type says what the "a" attribute means on elements of that type. However an element <{http://www.w3.org}good {http://www.jclark.com}a="1"/> might have a well-defined meaning even if the specification of the {http://www.w3.org}good element type doesn't mention {http://www.jclark.com}a attribute and vice-versa, in just the same way that <{http://www.w3.org/}good> <{http://www.jclark.com/}a>1</{http://www.jclark.com}a> </{http://www.w3.org/}good> might have a well-defined meaning even if the specification of the {http://www.w3.org}good element type doesn't mention {http://www.jclark.com}a element type and vice-versa. James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 3 13:29:40 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:29 2004 Subject: Fw: Namespaces Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054944@SOHOS002> Oren Ben-Kiki wrote: > So far, so good. But the question is: is it the same > relationship between > "{http://www.w3.org}a" > and "good" or "{http://www.w3.org}good" in: > > <good {http://www.w3.org}a="1"/> > <{http://www.w3.org}good {http://www.w3.org}a="1"/> > > If it is the same relationship (I suspect it is) there's no > such thing as global attributes It is the same, but that's because it IS a global attribute, with an explicit namespace: <good n1:a="1" /> <n1:good n1:a="1" /> Forgive me if I am being presumptuous, but I think what you meant to ask was is *this* relationship the same: <n1:good a="1" /> <n1:good n1:a="1" /> i.e., given that 'good' is in the 'n1' namespace, is the relationship between 'a' and 'good' the same in both cases, and I think you have to say it isn't. All you can say about 'a' DIRECTLY is that it is related to 'good' and that if you want to you could retrieve the namespace for 'good'. 
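
To see the two cases side by side in a parser, here is a minimal Python sketch. It assumes the standard-library xml.etree.ElementTree module, which reports names in the {uri}local form James uses; the wrapper element n1:x, the helper attribute_names and the sample document are made up for illustration and come from neither post nor the spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one unprefixed and one prefixed attribute called "a"
# on two different element types from the same namespace.
DOC = (
    '<n1:x xmlns:n1="http://www.w3.org">'
    '<n1:good a="1" n1:a="2"/>'
    '<n1:better a="1" n1:a="2"/>'
    '</n1:x>'
)


def attribute_names(element):
    """Return the attribute names of an element, sorted, in {uri}local form."""
    return sorted(element.attrib)


root = ET.fromstring(DOC)
good, better = list(root)

print(good.tag)                 # element type name, qualified by the URI
print(attribute_names(good))    # unprefixed "a" carries no URI; n1:a does
print(attribute_names(better))  # same two names on a different element type
```

The unprefixed name comes back as plain a on both elements, so it is left to the application to interpret it relative to 'good' or 'better', while n1:a comes back as {http://www.w3.org}a in both places.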
Using the expanded syntax from A.3 in the spec (sorry James, yours is nice too), we get: <ExpAName name='a' eltype="good" elns="http://www.w3.org" /> However, we can say more about 'n1:a': <ExpAName name='a' ns="http://www.w3.org" /> Note the latter is a global attribute, because even if we add: <n1:better a="1" /> <n1:better n1:a="1" /> Our complete expanded list becomes: <ExpEType name='good' ns="http://www.w3.org" /> <ExpEType name='better' ns="http://www.w3.org" /> <ExpAName name='a' eltype="good" elns="http://www.w3.org" /> <ExpAName name='a' eltype="better" elns="http://www.w3.org" /> <ExpAName name='a' ns="http://www.w3.org" /> That is, only ONE 'n1:a' entry, even though it's used in both 'good' and 'better'. In other words, it IS a global attribute. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Wed Feb 3 14:48:03 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:29 2004 Subject: Forking the DOM (was Re: Storing Lots of Fiddly Bits) In-Reply-To: <36B819D3.B65C07F3@prescod.net> References: <000e01be4f36$c7b29370$d3228018@jabr.ne.mediaone.net> Message-ID: <199902031447.JAA20034@hesketh.net> Given the fairly strong comments excerpted below (and Paul's not the only one muttering like this), is it time to contemplate a very different API? 
The DOM's strong suit is that it provides a standard interface; however, that standard seems to keep running into a mismatch problem in lots of situations. Is a generalized object model simply not the answer? Do we need fifteen semi-standard models for use in different situations? Could the current DOM be subsetted/extended to provide such functionality, or do we need to take a few steps back and start over, leaving the DOM for those who need its type of functionality but defining a new set of rules for those with different needs? At 03:41 AM 2/3/99 -0600, Paul Prescod wrote: >You just need an API for "tree formats". Just ask your DBMS vendor to >provide some tree-structured API. It doesn't matter if that API is the DOM >because making it the DOM does *not buy you anything* as a programmer. >From a programming point of view there is no benefit to working with a >consistent API where everything is dumbed-down to a textual model. You >might as well dumb everything down to an "object model." (see below) > >[...] > >The *only benefit* of unifying things as DOMs is reusing software that was >originally supposed to work with XML (i.e. XSL implementations). If you >are writing new software it makes NO SENSE to do it through a DOM >interface unless your data source is *XML*. > >Otherwise, you should just define a "tree node" interface and have your >various objects implement it. You will get all of the benefits of the >DOM with none of the costs (i.e. how the hell do you represent complex >properties of objects???). If you want some good hints about what a "tree >node" interface looks like, take a look at the grove abstraction. > >[...] > >Second, even *XSL* is not best served by a DOM representation. James Clark >wrote an xsl-list article about that but I can't find it now. Remember >that the DOM was invented as an extension of "DHTML." It's only half >"there." > >But if I grant that some well-thought-through API for XSL trees could >exist (i.e.
Jade's grove API) then I would propose that it only be used as >an optimization in a system where it would otherwise make sense to pass >around serializations of text documents. i.e. the DOM is okay for skipping >a layer of message passing. It is not okay as a "universal API" for "all >of the data in an organization." > >[...] > >That's not going to happen. The DOM will NOT be a core tool for the >majority of OO programmers this year, next year, or ever. Programmers will >try it and increasingly find that if they are not doing XSL styling for >the Web or print that the DOM is not a core tool. "Old-fashioned" OO can >provide the same benefits. Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com From DuCharmR at moodys.com Wed Feb 3 15:10:11 1999 From: DuCharmR at moodys.com (DuCharme, Robert) Date: Mon Jun 7 17:08:29 2004 Subject: XML and databases (NYC talk 2/8) Message-ID: <49092BAEAC84D2119B0600805FD40F9F120C8F@MDYNYCMSX1> The Object Developers Group (www.objdev.org) is sponsoring a talk on XML and databases in New York City on February 8th. I've copied their announcement below. (It was forwarded to me by a co-worker; I know nothing about the ODG or Walter Perry.) Bob DuCharme www.snee.com/bob <bob@ snee.com> see www.snee.com/bob/xmlann for "XML: The Annotated Specification" from Prentice Hall.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Title: "XML and Databases" Speaker: Walter Perry Date: Monday, February 8, 1999 Time: 7-9pm Location: Prudential Securities, One New York Plaza, NY NY 10292 Sign in with security, take elevator to the 3rd floor, then the escalator down to 2nd floor. Follow posted signs. See http://www.objdev.org/directions/prudential.html Cost: $5 contribution http://www.objdev.org/contrib Free to Prudential Security employees (our hosts) and ODG paid annual members Register: Required, please use http://www.objdev.org/register Leader: Walter Perry, wperry@fiduciary.org Groups: XML SIG of the Object Developers Group(ODG) Abstract: "XML and Databases" XML, touted as a 'universal syntax' of information, has seemed from its beginnings to be an obvious tool for inter-database communication, for database publishing to the Web, for the mapping of complex data to relational tables -- in short, for expressing efficient translations between data and documents. The question is how large a role XML can, or should, play in the overall scheme of data management. Some think that the data repository itself should be XML documents, while others at the opposite extreme regard XML as simply a transient messaging format. Is an XML document logically reducible to normalized relational tables, or does it force us to use much less well understood object databases? What are the advantages, and the shortcomings, of XML for such routine database tasks as queries, sorts, bulk updates and reporting? This session will examine these questions and try to find several ways in which XML can provide database tools which are immediately useful. Bio: Walter Perry is Managing Director of Fiduciary Automation, which provides services and software to support complex transnational financial settlements and to produce from them multi-jurisdictional reporting satisfying management, auditing, compliance and tax and other regulatory requirements. 
He has wrestled with most hardware since the 360/45 and the PDP-8 and with an even wider variety of operating systems and languages. He is now happily object-oriented. Walter holds a B.A. from Columbia University and a PhD. from Trinity College, University of Dublin. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Wed Feb 3 15:13:28 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:29 2004 Subject: Forking the DOM (was Re: Storing Lots of Fiddly Bits) Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AEB@eukbant101.ericsson.se> > -----Original Message----- > From: Simon St.Laurent [SMTP:simonstl@simonstl.com] > > Given the fairly strong comments excerpted below (and Paul's not the only > one muttering like this), is it time to contemplate a very different API? > The DOM's strong suit is that it provides a standard interface; however, > that standard seems to keep running into a mismatch problem in lots of > situations. > > Is a generalized object model simply not the answer? Do we need fifteen > semi-standard models for use in different situations? Could the current > DOM be subsetted/extended to provide such functionality, or do we need to > take a few steps back and start over, leaving the DOM for those who need > its type of functionality but defining a new set of rules for those with > different needs? > Don't we already _have_ different models depending on what suits the situation best? I am coming from the perl world, not the Java world that most people here are in, but there we have DOM, Groves, Objects, Trees, etc. 
If DOM doesn't fit (which for a lot of tasks it doesn't) then there are other options out there. I don't think trying to define another language-independent representation of XML is the answer here - otherwise we'll end up with X standards for X different problems. I think you're right, but I'd rephrase slightly: "A generalized object model isn't *always* the answer". Use what works, and if there's nothing that works create something that does - and be nice - let everyone else have it. Matt. From Mark.Birbeck at iedigital.net Wed Feb 3 18:30:46 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:29 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054949@SOHOS002> Paul Prescod wrote: > > Sure. But I still have two issues. First, why would you query the > > serialisation anyway? Wouldn't you want to query your > original database > > and generate XML pages that reflect the results? > > You certainly would if you use XML as *only* a serialization. > The thrust > of this thread was that some people want to encode everything > in XML so > that they can "query it." But XML is a lousy query representation for > anything other than human-authored documents. (and debatably the best > thing for queries against those!) That is my suggestion. The DOM is no way fast enough - or efficient enough - to be a query interface to loads of data. It is a handy way of navigating trees though, but you would probably want to break your processing of the data into a number of trees.
For example, say I query *my database* for all articles in any issue of a magazine that contain the word 'Turkey'. The database server would now have these cached, but to begin with I might get away with just putting references to each issue into the DOM. If a user selects one of those issues from their search results, I could then add to the DOM all the articles for that issue that contain the word. I could even create a new DOM instance and populate it, so that if the user moves away from that issue I could delete the DOM and create a new one. (I could even not use the DOM at all - hey, there's no law.) Anyway, I think we're sort of agreeing that the DOM on its own is not suitable, but some helper stuff with the DOM is. > > And modelling the data rather than the person > > means you can no longer interchange your XML with other > systems because > > you have two completely different sets of data, using > different DTDs. > > I don't follow that. I simply mean that an XML document that contains data about people has a different DTD to a document that has data about data. A server expecting a 'person' document that meets with some DTD requirements is not going to accept a document that matches the DTD for 'global data interchange'. > > (And you can't say that your serialisation schema *will* allow this > > interchange, because although your serialised data may be > well-formed, > > the underlying data it represents may not be, so you need > the proper DTD > > for the object.) > > Well-formedness has very little to do with DTDs so I don't follow this > either. I mean that you could have an XML document that fails against its DTD - say by using an attribute in the wrong place. Now, if you devise something that serialises your data so that you have entries that help define your elements and attributes, that serialised data will *pass* against *its* DTD - because its DTD is different (it's the one for data serialisation).
Now when you pass this serialised data to another server, it should be matched against its original DTD, otherwise you won't know that it's badly formed, but with a 'universal serialiser' you could actually import it into a new database, despite its failings. > > All I am saying is that the document *itself* could be the > abstraction > > of the data. > > This is something else I don't follow. XML documents are > always encodings > of abstractions. They are concrete, tangible, interchangeable, > printable > and can be given global names. Concrete, not abstract. I suppose all I'm getting at is that XML is already data', so why do we need to go to data'' in order to serialise? Why not just serialise from the database to XML? This is not the same problem as transferring schemas around. > The objects they represent are logical, usually inaccessible > outside of an > "address space" (i.e. your brain, your relational database) > and are thus > termed abstract. The reason we need XSL is because the > abstractions cannot > "stand alone". I can't transmit a book from my head to your > head. I need > to serialize it on paper or online. I also can't transmit a > "book object" > without serializing it somehow (i.e. XML). Before > serialization it is an > abstraction. Perhaps we are using the terms differently. If I have a picture of a house and I show it to you and say, point to the window, you would do so. But what you are pointing to is an 'abstraction' of the house - a picture - and there is no window there! ("ceci n'est pas une pipe", and all that.) Now, my point is that transmitting an XML mapping of some database entries is like actually transmitting that picture itself - of course it is not the house, but it *is* a representation of it. But it seems to me that to serialise everything to a universal form, always using the same DTD, is to end up transmitting a representation of the *picture*, not the house. And then you have lost a lot of information.
And worse, you can now only send your data to systems that process abstractions of pictures, not ones that process abstractions of houses. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net From skshirsa at nortelnetworks.com Wed Feb 3 18:31:33 1999 From: skshirsa at nortelnetworks.com (Shekhar Kshirsagar) Date: Mon Jun 7 17:08:29 2004 Subject: What is the purpose of ANY keyword ? Message-ID: <3.0.32.19990203115642.00740234@bl-mail2.corpeast.baynetworks.com> Hi, What's the purpose of the ANY keyword in an otherwise strictly formed syntax of XML ? Also is this right if I say that : "If a top level element of a DTD has a content specifier - ANY, all the documents will be well-formed and VALID, as long as they contain the elements from that DTD in any order." Thanks & Regards, Shekhar Kshirsagar Nortel Networks. From clovett at microsoft.com Wed Feb 3 18:31:46 1999 From: clovett at microsoft.com (Chris Lovett) Date: Mon Jun 7 17:08:29 2004 Subject: how to browse xml files with xsl style sheet LOCALLY ?
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C0874477A@RED-MSG-56> 80070005 is "Access Denied" - probably trying to load the DTD. Make sure you really did copy down the DTD and that all the URLs are relative local file access. ie5b2 xml does have a rather over-paranoid security model which will be fixed in the next release. If you still can't get it to work you can tweak the following security settings: - "Initialize and script ActiveX controls not marked as safe" - "Access data sources across domains" -----Original Message----- From: Mathieu Mangeot Lerebours [mailto:Mathieu.Mangeot@xrce.xerox.com] Sent: Tuesday, February 02, 1999 8:43 AM To: xml-dev@ic.ac.uk Cc: mangeot@xrce.xerox.com Subject: how to browse xml files with xsl style sheet LOCALLY ? Hello, I'm trying to write an xsl style sheet for my xml documents. As I'm not an expert in xsl, I wanted to start from an example. I'm working with ie5b2. I found an example at this address : http://www.silab.dsi.unimi.it/~sz475745/etl/rivista/Sommario.xml I can browse it perfectly. Then I decided to copy this file to my local disk. I also copied the dtd and the xsl files to my local disk. But when I browse the local copy, msxml generates an error : =================================================== The XML page cannot be displayed Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later. -------------------------------------------------------------------------------- MSXML error detected: 80070005 Line 4, Position 1 <root> ^ ================================================= I tried with another example from www.microsoft.com and encountered the same problem. Next, there was the same example but with css. This time, I could browse my local copy without any problem. So : What can I do to browse xml files with xsl locally ?
thank you for your answers Mathieu -- Mathieu MANGEOT-LEREBOURS | Phone : +33 4 76 61 51 32 Xerox Research Centre Europe | Fax : +33 4 76 61 50 99 6 chemin de Maupertuis | E-mail: Mathieu.Mangeot@imag.fr F-38240 Meylan FRANCE | http://www.xrce.xerox.com From Mark.Birbeck at iedigital.net Wed Feb 3 18:33:35 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:30 2004 Subject: Fw: Namespaces Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054947@SOHOS002> Clark - hope you don't mind but I'm copying this to the group because it may be of interest. Clark Evans wrote: > Mark Birbeck wrote: > > Using the expanded syntax from A.3 in the spec (sorry James, > > yours is nice too), we get: > > Could you translate into the syntax James put forth, I think > it much more clearly illustrates what's going on and > thus better facilitates conversation. Having to mentally translate > what you wrote caused me to get a headache. *evil grin* > > :) Clark > Sorry about your head Clark, but you can't translate because they express two different things. James's syntax allows us to see what's going on in an XML document by spelling out the namespaces to which elements and attributes belong.
So (from my email):

<xml xmlns:n1="http://www.w3.org" />
<n1:good a="1" />
<n1:good n1:a="1" />
<n1:better a="1" />
<n1:better n1:a="1" />

becomes:

<{http://www.w3.org}good a="1" />
<{http://www.w3.org}good {http://www.w3.org}a="1" />
<{http://www.w3.org}better a="1" />
<{http://www.w3.org}better {http://www.w3.org}a="1" />

in James's syntax. However, I was expressing what position those elements and attributes occupy in the XML namespace (again, from my previous email):

> Our complete expanded list becomes:
>
> <ExpEType name='good' ns="http://www.w3.org" />
> <ExpEType name='better' ns="http://www.w3.org" />
> <ExpAName name='a' eltype="good" elns="http://www.w3.org" />
> <ExpAName name='a' eltype="better" elns="http://www.w3.org" />
> <ExpAName name='a' ns="http://www.w3.org" />

Note that 'a' on 'good' has a unique entry, as does 'a' on 'better', but in James's syntax we have 'a="1"' for both - i.e. no differentiation. James's syntax is therefore a useful 'exploded XML' but has to be used 'in context' - because it is still expressing the original XML document. But it does not tell us everything about the namespace (within the XML namespace) that an element or attribute occupies. The expanded attribute/element syntax loses information about the original XML document, but *does* map the namespace structure.

As a shorthand in a previous email I used '^' between the various parts that tell you where in the XML namespace structure an element or attribute appears. So I could have expressed the above using:

AETP^http://www.w3.org^good
AETP^http://www.w3.org^better
PETP^http://www.w3.org^good^a
PETP^http://www.w3.org^better^a
GAP^http://www.w3.org^a

where:

AETP is the 'all element types' partition
PETP is the 'per-element-type' partition
GAP is the 'global attribute' partition

Note that we have four namespaces (i.e. sets) within the greater XML namespace.
These are: AETP PETP^http://www.w3.org^good PETP^http://www.w3.org^better GAP Not sure if that helps your headache, but it *is* a more accurate representation of what is happening! Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eric at hellman.net Wed Feb 3 18:36:12 1999 From: eric at hellman.net (Eric Hellman) Date: Mon Jun 7 17:08:30 2004 Subject: Control Characters Message-ID: <v04020a04b2de1fe26493@[192.168.1.1]> Why are the control characters x80-x9F allowed in XML character data, while x0-x8,xB,xC,xE-x1F are illegal? Is it that the illegals have meanings that XML does not support? Just wondering. Has "BEL" been banished to a "Unisound" encoding ?;--} I have a DTD in which some entities are given the default value xFFFD in the external subset because they are placeholders for strings to be supplied in an internal subset with the document. Is this an appropriate use of Unicode's "replacement character"? Eric Eric Hellman Openly Informatics, Inc. http://www.openly.com/ Tools for 21st Century Scholarly Publishing xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From skshirsa at nortelnetworks.com Wed Feb 3 18:49:25 1999 From: skshirsa at nortelnetworks.com (Shekhar Kshirsagar) Date: Mon Jun 7 17:08:30 2004 Subject: Can attribute value be of type ELEMENT ? Message-ID: <3.0.32.19990203133733.0075e040@bl-mail2.corpeast.baynetworks.com> Hi, In some tables, the unique identifier key is formed from more than one column value in a particular sequence. If I want to map it to an attribute of type ID, what is the best way to do it ? One way to do that might be to define the attribute value as an ELEMENT. But is it valid to do so ? Thanks & Regards, Shekhar Kshirsagar Nortel Networks From tyler at infinet.com Wed Feb 3 19:06:18 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:30 2004 Subject: Another errata? References: <A26F84C9D8EDD111A102006097C4CD0D054943@SOHOS002> Message-ID: <36B89820.18073A33@infinet.com> Mark Birbeck wrote: > Ronald Bourret wrote: > > That this is confusing is evident from the above discussion > > -- John means > > XML namespace and Mark means traditional namespace. > > I'm not confused thanks - I mean both. I am drawing a distinction.
You > need both to understand how the namespaces spec implements 'traditional > namespaces' in a manner useful to XML. In short, one, traditional > namespace would not be enough for XML because you lose structure. It > therefore needs a lot of them, and this is achieved by having > effectively a 'namespace container' full of other (real) namespaces. > That namespace container is an 'XML namespace' - which for brevity, we > call a 'namespace'. (Hey, I'm not disagreeing with you that it's > tricky!!) Not only tricky, but unnecessarily tricky. XML should be a tool for real-world business solutions, not something that is more or less a puzzle some CS professor designs for his students. Namespaces is a pain to deal with no matter how you look at it. Once you read a document into memory and no longer preserve the original prefixes (or rather the QName), when you write the document back out (which has possibly been mutated) where do you get these prefixes? Do you simply invent them in the form a, b, c, ..., aa, ab, ac, ... etc. I suppose the people in the "Namespaces in XML" feel that once you read in a document, you throw it away. Under the old PI-Proposal it would be trivial for an XML API to allow the user to assign a prefix to a namespace before writing out an XML Document. Now it is far more difficult and practically impossible to deal with at the application level in a manageable way for the end-user. I suppose only parsing matters anymore with respect to XML and writing out XML documents no longer matters. This is sad as it effectively makes XML useless for my needs in an application I have. > > > It is not the job of standards > > > developers to make sure we understand everything they write. ... > > > > <soapbox> > > Huh? It most certainly *is* the job of the standards > > developers to make > > sure we understand what they write. > > I suppose we're onto matters of opinion now, but a standard must be > unambiguous in its *formal* interpretation. 
I may read it and > misunderstand, but when I get down to the nitty-gritty of > implementation, provided that I eventually *do* understand it, I should > be able to do this unambiguously. > > If you describe 'intent' then what if you do not cover a usage that > later arises? You introduce ambiguity. If people cannot understand what you write because the ideas are too complex for the average developer, then there is no real point in having a standard in the first place because all that ISV's will be doing is commoditizing their products for a niche market and overall this will reduce innovation in the marketplace. In the end everyone loses here. However, for broadly adopted technologies, standards are very important for application to application interoperability which is something end-users clamor for. But if things are too hard to understand because either the concepts have not been simplified or else the designers of those concepts don't care about clarity, then this does no one any good. > > In contrast, the namespaces spec *is* widely misinterpreted, > > and by people > > who, judging by their posts to this list, are intelligent and > > more than > > willing to read, re-read, and re-re-read specs. To me, that > > says there is > > something wrong, and I think a good example of this is the > > fact that the > > spec repeatedly leads the reader to believe that unprefixed > > attributes > > belong to the namespace of the element. > > The spec does not! The reader's mistaken assumptions about what the > namespaces spec is trying to achieve leads them to read this into it. > But if we're really honest here, if we were in a discussion group on > compiler technology ten years ago, you would not have such a wide range > of people discussing these issues, and narrower range of > misinterpretation (I'm not saying everyone understood everything then > either). 
That's not to say we shouldn't have more people involved, but I That is because compiler technology is a totally different field (and much more complex field) than XML will ever be. If XML is to be used by only developers and end-users who understand compiler technology then it will fail miserably no matter how much hype is in the presses these days. > disagree with this 'dumbing down' attitude that seems to exist, where we > must ensure that everyone can understand. If you want to write a book > making it clearer then do it - we'd all probably be grateful - but the > spec itself MUST be a formal document. I don't know what your definition of genius is, but mine is simple: "the ability to simplify the complex". Quite plainly, if you can reduce the learning curve for our feeble human minds, it will take less time for people to learn those concepts and they can then take the time they have saved and work on new and interesting tasks. If every generation has to spend most of their adult life just learning everything that the previous generation created (but did not simplify) we will never have progress because human beings will be very old and very grey before they ever get the opportunity to even think original thoughts. > > I think a mistake made in writing many specifications is to rely on > > excessively formal language and write down only the rules, not the > > motivation. In my mind, the point of a specification is not to write > > rules, but to get everybody to agree to the same rules. > > No! The job of the discussion *about* a spec is to find agreement. Once > that has been arrived at you need to codify that in a way that is > unambiguous. It needs to be as formal as possible! If everyone told you to jump off a cliff would you do it? Compromise on technical matters is one of the worst things you can do.
I personally feel you should let everyone come up with their own implementations and let public opinion and the marketplace decide who is the best. Compromise usually (but not always) creates groupthink and an atmosphere in which very sub-optimal decisions are OK "as long as we can all just get along". > > Finally, if you are driving a technology through standards > > (as opposed to > > the other way around, which is more common), then, whether > > you like it or > > not, those standards necessarily play a role in marketing > > that technology, > > and the more accessible those standards are, the more likely > > the technology > > will succeed. > > I don't think that is the job of standards. Others can write books on > it, produce training courses, and as you say, get hamsters involved in > video production (although I believe that's illegal in some States) but > the standard itself must be as terse and precise as possible. You think people read all of these books and take these training courses because they want to? They take them usually because the technologies they use today are dictated to them. Java's success is an example of people writing code in Java because they "wanted to", because Java took out 85% of the pain in C++. Nevertheless, I have plenty of friends who still write C++ code on new projects using MS Visual C++ because "old grannies" at their institutions think everyone should be doing things the way they did them 10-15 years ago. They can't see the benefits of Java being a simpler version of C++, because they have not made an effort to do so. If you write some spec without regard to simplicity you are falling into a similar model. "Namespaces in XML" is a pure example of simplicity falling by the wayside in favor of the attitude "well if I can understand it and no one else can, then the whole world is stupid".
I think this will change in the next 10 years as people become fed up with any company or standards body that does not make a genuine effort at creating simple solutions from the get-go, especially since the trend in business these days is to be leaner and meaner. Microsoft has been eating up the small to medium sized business market with NT because they produce simpler solutions than SUN and IBM even though their OS's may crash all of the time. Even though their customer base may for the most part be unhappy with their products, finding MS Solution Providers is a lot easier (and cheaper) than finding UNIX engineers. Recently companies like SUN and IBM may be getting the drift, but it may be too little too late. But then again, MS is seemingly doomed to fail big with NT 5.0 (managing 40 million LOC in one build is just plain mind-boggling) so who knows. I just know that my faith in the W3C would be seriously rekindled if "Namespaces in XML" were ripped off the W3C website right now and they started over with a more open process to create a better thought-out solution. Tyler From skshirsa at nortelnetworks.com Wed Feb 3 19:17:22 1999 From: skshirsa at nortelnetworks.com (Shekhar Kshirsagar) Date: Mon Jun 7 17:08:30 2004 Subject: What is the purpose of ANY keyword ? Message-ID: <3.0.32.19990203141241.0073abb4@bl-mail2.corpeast.baynetworks.com> Hi, What's the purpose of ANY keyword in an otherwise strictly formed syntax of XML ?
Also is this right if I say that : "If a top level element of a DTD has a content specifier - ANY, all the documents will be well-formed and VALID, as far as they contain the elements from that DTD in any order." Thanks & Regards, Shekhar Kshirsagar Nortel Networks. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Feb 3 19:38:24 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:08:30 2004 Subject: Control Characters References: <v04020a04b2de1fe26493@[192.168.1.1]> Message-ID: <36B8A589.ECEF3D62@locke.ccil.org> Eric Hellman scripsit: > Why are the control characters x80-x9F allowed in XML character data, while > x0-x8,xB,xC,xE-x1F are illegal? No good reason. I think there was a desire to keep the SGML declaration short. > Has "BEL" been banished to a "Unisound" encoding ?;--} I think it would be most useful to replace it with an empty element <BEL/>. > I have a DTD in which some entities are given the default value xFFFD in > the external subset because they are placeholders for strings to be > supplied in an internal subset with the document. Is this an appropriate > use of Unicode's "replacement character"? It's unusual but not erroneous. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From skshirsa at nortelnetworks.com Wed Feb 3 19:46:37 1999 From: skshirsa at nortelnetworks.com (Shekhar Kshirsagar) Date: Mon Jun 7 17:08:30 2004 Subject: What is the purpose of ANY keyword ? Message-ID: <3.0.32.19990203115642.00740234@bl-mail2.corpeast.baynetworks.com> Hi, What's the purpose of ANY keyword in an otherwise strictly formed syntax of XML ? Also is this right if I say that : "If a top level element of a DTD has a content specifier - ANY, all the documents will be well-formed and VALID, as far as they contain the elements from that DTD in any order." Thanks & Regards, Shekhar Kshirsagar Nortel Networks. From Mark.Birbeck at iedigital.net Wed Feb 3 19:49:33 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:30 2004 Subject: Fw: Namespaces Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054947@SOHOS002> Clark - hope you don't mind but I'm copying this to the group because it may be of interest.
Clark Evans wrote: > Mark Birbeck wrote: > > Using the expanded syntax from A.3 in the spec (sorry James, > > yours is nice too), we get: > > Could you translate into the syntax James put forth, I think > it much more clearly illustrates what's going on and > thus better facilitates conversation. Having to mentally translate > what you wrote caused me to get a headache. *evil grin* > > :) Clark > Sorry about your head Clark, but you can't translate because they express two different things. James's syntax allows us to see what's going on in an XML document by spelling out the namespaces to which elements and attributes belong. So (from my email): <xml xmlns:n1="http://www.w3c.org" /> <n1:good a="1" /> <n1:good n1:a="1" /> <n1:better a="1" /> <n1:better n1:a="1" /> becomes: <{http://www.w3c.org}good a="1" /> <{http://www.w3c.org}good {http://www.w3c.org}a="1" /> <{http://www.w3c.org}better a="1" /> <{http://www.w3c.org}better {http://www.w3c.org}a="1" /> in James's syntax. However, I was expressing what position those elements and attributes occupy in the XML namespace (again, from my previous email): > Our complete expanded list becomes: > > <ExpEType name='good' ns="http://www.w3.org" /> > <ExpEType name='better' ns="http://www.w3.org" /> > <ExpAName name='a' eltype="good" elns="http://www.w3.org" /> > <ExpAName name='a' eltype="better" elns="http://www.w3.org" /> > <ExpAName name='a' ns="http://www.w3.org" /> Note that 'a' on 'good' has a unique entry, as does 'a' on 'better', but in James's syntax we have 'a="1"' for both - i.e. no differentiation. James's syntax is therefore a useful 'exploded XML' but has to be used 'in context' - because it is still expressing the original XML document. But it does not tell us everything about the namespace (within the XML namespace) that an element or attribute occupies. The expanded attribute/element syntax loses information about the original XML document, but *does* map the namespace structure. 
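A minimal sketch of the two views, assuming Python's standard xml.etree.ElementTree, which happens to report names in the same {URI}local form as James's syntax. The <doc> wrapper element, the split_name and expanded_entries helpers, and the tuple shapes are hypothetical, introduced only so that the fragment parses as one well-formed document and the A.3-style entries can be listed; the namespace URI is the one from the xmlns:n1 declaration in the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: the four elements from the email placed inside one
# root element (<doc> is an assumption) so the text parses as a document.
DOC = """<doc xmlns:n1="http://www.w3c.org">
  <n1:good a="1" />
  <n1:good n1:a="1" />
  <n1:better a="1" />
  <n1:better n1:a="1" />
</doc>"""


def split_name(name):
    """Split '{uri}local' into (uri, local); uri is None when unqualified."""
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
        return uri, local
    return None, name


def expanded_entries(root):
    """Collect A.3-style entries as tuples (shapes are illustrative only):
    ("ExpEType", ns, name)              element type
    ("ExpAName", elns, eltype, name)    unprefixed attribute, per element type
    ("ExpAName", ns, name)              prefixed (global) attribute
    """
    entries = set()
    for elem in root.iter():
        e_ns, e_local = split_name(elem.tag)
        if e_ns is None:
            continue  # skip the unqualified wrapper element
        entries.add(("ExpEType", e_ns, e_local))
        for attr in elem.attrib:
            a_ns, a_local = split_name(attr)
            if a_ns is None:
                entries.add(("ExpAName", e_ns, e_local, a_local))
            else:
                entries.add(("ExpAName", a_ns, a_local))
    return entries


root = ET.fromstring(DOC)

# View 1: the "exploded" document, element by element, in {URI}local form.
clark_names = [(child.tag, sorted(child.attrib)) for child in root]

# View 2: the distinct names the document uses, grouped by partition.
entries = expanded_entries(root)
```

In the first view the unprefixed attribute appears as plain a on both good and better, while the second view holds five distinct tuples, matching the five-line expanded list quoted above.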
As a shorthand in a previous email I used '^' between the various parts that tell you where in the XML namespace structure an element or attribute appears. So I could have expressed the above using: AETP^http://www.w3.org^good AETP^http://www.w3.org^better PETP^http://www.w3.org^good^a PETP^http://www.w3.org^better^a GAP^http://www.w3.org^a where: AETP is the 'all element types' partition PETP is the 'per-element-type' partition GAP is the 'global attribute' partition Note that we have four namespaces (i.e. sets) within the greater XML namespace. These are: AETP PETP^http://www.w3.org^good PETP^http://www.w3.org^better GAP Not sure if that helps your headache, but it *is* a more accurate representation of what is happening! Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clovett at microsoft.com Wed Feb 3 19:52:07 1999 From: clovett at microsoft.com (Chris Lovett) Date: Mon Jun 7 17:08:30 2004 Subject: how to browse xml files with xsl style sheet LOCALLY ? Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C0874477A@RED-MSG-56> 80070005 is "Access Denied" - probably trying to load the DTD. Make sure you really did copy down the DTD and that all the URL's are relative local file access. ie5b2 xml does have a rather over-paranoid security model which will be fixed in the next release. 
If you still can't get it to work you can tweak the following security settings: - "Initialize and script ActiveX controls not marked as safe" - "Access data sources across domains" -----Original Message----- From: Mathieu Mangeot Lerebours [mailto:Mathieu.Mangeot@xrce.xerox.com] Sent: Tuesday, February 02, 1999 8:43 AM To: xml-dev@ic.ac.uk Cc: mangeot@xrce.xerox.com Subject: how to browse xml files with xsl style sheet LOCALLY ? Hello, I'm trying to write an xsl for my xml documents. As I'm not an expert in xsl, I wanted to start from an example. I'm working with ie5b2. I found an example at this address : http://www.silab.dsi.unimi.it/~sz475745/etl/rivista/Sommario.xml I can browse it perfectly. Then I decided to copy this file to my local disk. I also copied the dtd and the xsl files to my local disk. But when I browse the local copy, msxml generates an error : =================================================== The XML page cannot be displayed Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later. -------------------------------------------------------------------------------- MSXML error detected: 80070005 Line 4, Position 1 <root> ^ ================================================= I tried with another example from www.microsoft.com and encountered the same problem. Next, there was the same example but with css. This time, I could browse my local copy without any problem. So : What can I do to browse xml files with xsl locally ? Thank you for your answers Mathieu -- Mathieu MANGEOT-LEREBOURS | Phone : +33 4 76 61 51 32 Xerox Research Centre Europe | Fax : +33 4 76 61 50 99 6 chemin de Maupertuis | E-mail: Mathieu.Mangeot@imag.fr F-38240 Meylan FRANCE | http://www.xrce.xerox.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eric at hellman.net Wed Feb 3 19:58:03 1999 From: eric at hellman.net (Eric Hellman) Date: Mon Jun 7 17:08:30 2004 Subject: Control Characters Message-ID: <v04020a04b2de1fe26493@[192.168.1.1]> Why are the control characters x80-x9F allowed in XML character data, while x0-x8,xB,xC,xE-x1F are illegal? Is it that the illegals have meanings that XML does not support? Just wondering. Has "BEL" been banished to a "Unisound" encoding ?;--} I have a DTD in which some entities are given the default value xFFFD in the external subset because they are placeholders for strings to be supplied in an internal subset with the document. Is this an appropriate use of Unicode's "replacement character"? Eric Eric Hellman Openly Informatics, Inc. http://www.openly.com/ Tools for 21st Century Scholarly Publishing xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 3 20:04:50 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:30 2004 Subject: Another errata? Message-ID: <A26F84C9D8EDD111A102006097C4CD0D05494C@SOHOS002> Tyler Baker wrote: > Mark Birbeck wrote: > > It therefore needs a lot of them, and this is achieved by having > > effectively a 'namespace container' full of other (real) namespaces. > > That namespace container is an 'XML namespace' - which for > brevity, we > > call a 'namespace'. (Hey, I'm not disagreeing with you that it's > > tricky!!) > > Not only tricky, but unnecessarily tricky. I mean the words used are tricky, because namespace is used to mean 'real namespace' and 'XML namespace' interchangeably. But the context usually makes it clear which is implied. I'm *not* saying that using namespaces is tricky. Actually it's pretty easy. > XML should be a > tool for real-world business > solutions, not something that is more or less a puzzle some > CS professor designs for his > students. Yep. So how isn't it? Seems to be pretty real-world to me. > Namespaces is a pain to deal with no matter how > you look at it. Once you read a > document into memory and no longer preserve the original > prefixes (or rather the QName), when > you write the document back out (which has possibly been > mutated) where do you get these > prefixes? Why would you not preserve the prefixes? This is a bizarre point. If I delete the contents of my hard-drive shouldn't my word processor still open my letters? Of course not. 
As it happens, you couldn't delete the prefix anyway, because they are part of the names of the elements and attributes. Think of it like an XML 1.0 document - would you delete all characters before a colon in all names? Of course you wouldn't. Effectively you have an XML 1.0 document that conforms to the namespace spec if all 'Names' actually conform to 'QNames' (or in some cases NCNames). All that happens with namespaces is that you have an additional way of viewing your nodes over and above that defined in XML 1.0, and that is by namespace. > I suppose > the people in the "Namespaces in XML" feel that once you read > in a document, you throw it > away. They also believe the moon is made of cheese. > If people cannot understand what you write because the ideas > are too complex for the average > developer, then there is no real point in having a standard If all you ever develop is things that can be understood by the average developer then this discussion group would not exist since XML would not exist, and nor would space travel, electricity, and just about every philosophy ever invented. > > But if we're really honest here, if we were in a discussion group on > > compiler technology ten years ago, you would not have such > a wide range > > of people discussing these issues, and narrower range of > > misinterpretation (I'm not saying everyone understood > everything then > > either). That's not to say we shouldn't have more people > involved, but I > > That is because compiler technology is a totally different > field (and much more complex field) > than XML will ever be. If XML is to be used by only > developers and end-users who understand > compiler technology then it will fail miserably no matter how > much hype is in the presses > these days. Compiler technology is understandable today by most computer science graduates, as is XML. 
Yet does anyone remember trying to follow what the hell the guys at AT&T were up to with yacc (Yet Another Compiler-Compiler) and so forth. In fact teenage computer science students will be happily dealing in XML namespaces in two years time. And why should XML be understood by end users? Compiler technology isn't understood by most people who write C++ and Java programs. When my clients use Microsoft Office 97 they don't want to understand the file format. Why should that suddenly change when they use Office 2000? Sure, *I* want to understand the file format so I can do things with it, but that's my job. I also need to understand point-to-point tunnelling protocol, active directory services and domain name servers - unfortunately more than your 'average developer'. > > I disagree with this 'dumbing down' attitude that seems to > exist, where we > > must ensure that everyone can understand. If you want to > write a book > > making it clearer then do it - we'd all probably be > grateful - but the > > spec itself MUST be a formal document. > > I don't know what your definition of genius is, but mine is > simple "the ability to simplify > the complex". Quite plainly, if you can make reduce the > learning curve for our feeble human > minds to understand, it will take less time for people to > learn those concepts and they can > then take the time they have saved and work on new and > interesting tasks. Lovely. But I could equally say that the people who come up with the big stuff can obviously produce better things than those who have to be helped to understand what they've produced. They should therefore spend more time developing and less time explaining. Of course I wouldn't say that ... but by your criteria of efficiency I would be perfectly justified. In fact it has to be a bit of both - developing and educating - but my central point is that the standards writer is not responsible for take-up. 
It's great if they get involved - like some do on this list - but it's not an obligation. > If every generation > has to spend most of their adult life just learning > everything that the previous generation > created (but did not simplify) we will never have progress > because human beings will be very > old and very grey before they ever get the opportunity to > even think original thoughts. Seems to me that actually each generation simplifies what its ancestors produced. And then it's the next generation who truly benefit from that. Nobody simplified (or complicated, depending on your standpoint (-:) Aristotle, Hegel, Marx, Einstein, Freud, and so on, in their own lifetimes. Yet today, most physics students can easily understand Einstein's theory of relativity. > > No! The job of the discussion *about* a spec is to find > agreement. Once > > that has been arrived at you need to codify that in a way that is > > unambiguous. It needs to be as formal as possible! > > If everyone told you to jump off a cliff would you do it? Very profound - but what are you on about? > Compromise on technical matters is > one of the worst things you can do. So why do you want understandable standards? Surely if you're against compromise you don't want *any* standards? > You think people read all of these books and take these > training courses because they want > to? They take them usually because the technologies they use > today are dictated to them. > Java's success is an example of people writing code in Java > because the "wanted to" because > Java took out 85% of the pain in C++. Nevertheless, I have > plenty of friends who still write > C++ code on new projects using MS Visual C++ because "old > grannies" at their institutions > think everyone should be doing things they way they did them > 10-15 years ago. They can't see > the benefits of Java being a simpler version of C++, because > they have not made an effort to > do so. An evangelical approach, I think.
I know Java and C++, and many other languages. I think Java is great, and definitely one of the nicest languages yet to be devised, but still I don't use it. And that's after a great deal of evaluation. Essentially it promises what it can't deliver. Anyway, I think you have to allow people credit for coming to serious conclusions, even if, as may sometimes happen, they disagree with you. > If you write some spec without regard to simplicity > you are falling into a similar > model. "Namespaces in XML" is a pure example of simplicity > falling to the wayside in favor > of the attitude "well if I can understand it and no one else > can, then the whole world is > stupid". If someone wrote a book on namespaces and it was difficult to understand, then I would agree with you that it was a wasted effort. But a standard submitted to W3C? It *has* to be terse because it is being examined by numerous experts, who as far as I can see do *not* waste their time strutting around pretending the world is stupid. (Even if they did, I'd still thank them for XML, namespaces, XSL, XLink, HTML, HTTP, etc., etc.) I'm not quite sure why I'm getting so worked up about this, particularly given the lack of intervention by the standards writers themselves in their own defence! Maybe it's because over my 16 years in the business I've realised that the things that you learn yourself you *really* learn, and the people who are prepared to struggle to understand difficult concepts are the people you really want around you. We are lucky to be of the same generation as the innovators. *We* are getting the benefit of *their* developments, yet you are taking the attitude that you're doing *them* a favour by using their ideas! Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From amd0978 at acf3.nyu.edu Wed Feb 3 20:40:44 1999 From: amd0978 at acf3.nyu.edu (Adam M Donahue) Date: Mon Jun 7 17:08:30 2004 Subject: What is the purpose of ANY keyword ? In-Reply-To: <3.0.32.19990203141241.0073abb4@bl-mail2.corpeast.baynetworks.com> Message-ID: <Pine.OSF.3.95.990203153349.22493A-100000@acf3.nyu.edu> Not necessarily. <?xml version="1.0" ?> <!DOCTYPE example [ <!ELEMENT example ANY> <!ELEMENT foo (bar+)> <!ELEMENT bar (#PCDATA)> ]> <example> <foo></foo> <bar></bar> </example> Adam On Wed, 3 Feb 1999, Shekhar Kshirsagar wrote: > Hi, > > What's the purpose of ANY keyword in an otherwise strictly formed > syntax of XML ? > > Also is this right if I say that : > > "If a top level element of a DTD has a content specifier - > ANY, all the documents will be well-formed and VALID, > as far as they contain the elements from that DTD in any order." > > > Thanks & Regards, > Shekhar Kshirsagar > Nortel Networks. > > > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
From clark.evans at manhattanproject.com Wed Feb 3 21:25:11 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:31 2004 Subject: Another errata? References: <A26F84C9D8EDD111A102006097C4CD0D054943@SOHOS002> <36B89820.18073A33@infinet.com> Message-ID: <36B8BDD6.E6FB2DCA@manhattanproject.com>

<philosophy>
Perhaps the role of namespaces is fundamentally different in the "stream processing" paradigm than it is in the "object processing" paradigm? Could this be the issue underlying the current debate? I don't know enough on the topic to say. However, I feel I can help by explaining my observations about the differences between the paradigms.

1. A tenet of object-oriented programming is encapsulation, or data hiding. For stream processing it is the opposite: data exposure.
2. Objects are modified or undergo state change by invoking methods, whereas streams are rewritten or translated by transformations.
3. Ideally, an object retains its identity. The entire goal of a stream is to merge its information with each and every observer; this is equivalent to identity loss.
4. An object has a 1-1 correspondence between its data and its code. A stream has a 1-M correspondence between its data and its code, where the document is the data and the code is the observer's transformation system.
5. Objects are finite; they have a boundary. Streams may be effectively infinite. For example, a pressure transducer sending water level measurements may operate continuously for years! Thus, you can store an entire object in memory, but you may not want to store an entire stream in memory.
6. An object's interface describes a block of functionality provided. A stream's interface describes the information content that it carries.
7. An object has one type or class which is assigned to the data, whereas a stream can be classified differently by each and every observer. This is especially clear if you read about Architectures.
etc.

Anyway, I'm not saying that one is better than the other, just that they are different and subtly interwoven. For instance, Scenarios is the study of object interactions as a stream of events. And SAX is a wonderful event-driven stream observer object. I feel that the key to the success of XML is to recognize that it is part of a different paradigm --XML complements existing technology. As such, it is important to scrutinize the application of object-oriented idioms to the new paradigm.
</philosophy>

Hope this helps, Clark Evans
From tyler at infinet.com Wed Feb 3 21:35:45 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:31 2004 Subject: Another errata? References: <A26F84C9D8EDD111A102006097C4CD0D05494C@SOHOS002> Message-ID: <36B8C0CC.9DF4541C@infinet.com> Mark Birbeck wrote: > Tyler Baker wrote: > > Mark Birbeck wrote: > > > It therefore needs a lot of them, and this is achieved by having > > > effectively a 'namespace container' full of other (real) namespaces. > > > That namespace container is an 'XML namespace' - which for > > brevity, we > > > call a 'namespace'. (Hey, I'm not disagreeing with you that it's > > > tricky!!) > > > > Not only tricky, but unnecessarily tricky.
> > I mean the words used are tricky, because namespace is used to mean > 'real namespace' and 'XML namespace' interchangeably. But the context > usually makes it clear which is implied. I'm *not* saying that using > namespaces is tricky. Actually it's pretty easy. OK, sorry for the misunderstanding here. I guess I took your words out of context... > > XML should be a > > tool for real-world business > > solutions, not something that is more or less a puzzle some > > CS professor designs for his > > students. > > Yep. So how isn't it? Seems to be pretty real-world to me. I guess this is no longer a matter of arguing objective facts and more a value judgement between people who find "Namespaces in XML" easy to understand and those who either find it hard to understand, or else feel it will be hard for novices to understand. It breaks down to a personal judgement call here. Nevertheless, I think it is clear that "Namespaces in XML" has not been a consensus-building process. > > Namespaces is a pain to deal with no matter how > > you look at it. Once you read a > > document into memory and no longer preserve the original > > prefixes (or rather the QName), when > > you write the document back out (which has possibly been > > mutated) where do you get these > > prefixes? > > Why would you not preserve the prefixes? This is a bizarre point. If I > delete the contents of my hard-drive shouldn't my word processor still > open my letters? Of course not. As it happens, you couldn't delete the > prefix anyway, because they are part of the names of the elements and > attributes. Think of it like an XML 1.0 document - would you delete all > characters before a colon in all names? Of course you wouldn't. > Effectively you have an XML 1.0 document that conforms to the namespace > spec if all 'Names' actually conform to 'QNames' (or in some cases > NCNames).
All that happens with namespaces is that you have an > additional way of viewing your nodes over and above that defined in XML > 1.0, and that is by namespace. Most DOM implementations that have namespace support (Oracle's and SUN's are two that come to mind) do preserve the prefix information inherently (I am not sure whether QNames or expanded names are returned from calls to getNodeName() though). Some people here were arguing that in SAX 1.1 or whatever, this prefix information should be lost because only the "namespace" should be relevant to the application. A lot of people seemed to agree with this point, that all you need to do is present the expanded name to the application (I didn't, as I favored parsers showing the application a Name type which could resolve prefixes and namespaces). I am more confused than ever as to what on earth "Namespaces in XML" and those that support it are trying to accomplish with it. > > If people cannot understand what you write because the ideas > > are too complex for the average > > developer, then there is no real point in having a standard > > If all you ever develop is things that can be understood by the average > developer then this discussion group would not exist since XML would not > exist, and nor would space travel, electricity, and just about every > philosophy ever invented. There is a stark difference between research oriented development and business oriented development. In universities you will find that most CS professors are versed in all kinds of programming languages, some of them in-house ones they help to develop. They get paid to teach and do abstract research for the most part (sometimes they get grants from industry but even those grants deal a lot with abstract research). Whether anyone actually understands what they are doing is not relevant until they stumble onto something useful.
Once they stumble onto something useful, then they may patent it or publish it or whatever they feel is appropriate. In the end, industry will take these ideas and try and apply this research to real-world problems. If things are too hot to handle and cost too much to develop, maintain, and support, any products resulting from this research will likely have a very small market. Is this what people want XML to be suited for: a very small market. I don't consider XML to be something that should be rocket science as its intended audience should be average developers. The number one goal I feel should be to have a simple markup language that more complex solutions can be built upon. "Namespaces in XML" makes building complex solutions harder because supporting all of the weird semantics of it is not easy to handle. > > > But if we're really honest here, if we were in a discussion group on > > > compiler technology ten years ago, you would not have such > > a wide range > > > of people discussing these issues, and narrower range of > > > misinterpretation (I'm not saying everyone understood > > everything then > > > either). That's not to say we shouldn't have more people > > involved, but I > > > > That is because compiler technology is a totally different > > field (and much more complex field) > > than XML will ever be. If XML is to be used by only > > developers and end-users who understand > > compiler technology then it will fail miserably no matter how > > much hype is in the presses > > these days. > > Compiler technology is understandable today by most computer science > graduates, as is XML. Yet does anyone remember trying to follow what the > hell the guys at AT&T were up to with yacc (Yet Another > Compiler-Compiler) and so forth. In fact teenage computer science > students will be happily dealing in XML namespaces in two years time. So should XML only be used by CS graduates? I sure hope not. 
Only about 10% or less of the entire IS industry is comprised of CS graduates. Should XML be relegated to geekdom or should it be something to bring the entire web together? I favor the latter as the web is made up of millions and millions of people, 99% or less of whom have no programming experience, yet want to have control over the content they create. If the W3C screws up XML (like it seems to be doing with "Namespaces in XML"), the entire web will be stuck in HTML land for the next 5 years.

> And why should XML be understood by end users? Compiler technology isn't understood by most people who write C++ and Java programs. When my clients use Microsoft Office 97 they don't want to understand the file format. Why should that suddenly change when they use Office 2000? Sure, *I* want to understand the file format so I can do things with it, but that's my job. I also need to understand point-to-point tunnelling protocol, active directory services and domain name servers - unfortunately more than your 'average developer'.

Again, XML is not rocket science. You can use it to do some very basic things or else you can get creative with it. Is the intention here to make XML as non-understandable by the masses as possible? This logic suggests so, even though I am pretty sure that is not your intention here.

> > > I disagree with this 'dumbing down' attitude that seems to exist, where we must ensure that everyone can understand. If you want to write a book making it clearer then do it - we'd all probably be grateful - but the spec itself MUST be a formal document.
> >
> > I don't know what your definition of genius is, but mine is simple "the ability to simplify the complex".
> > Quite plainly, if you can reduce the learning curve for our feeble human minds to understand, it will take less time for people to learn those concepts and they can then take the time they have saved and work on new and interesting tasks.
>
> Lovely. But I could equally say that the people who come up with the big stuff can obviously produce better things than those who have to be helped to understand what they've produced. They should therefore spend more time developing and less time explaining. Of course I wouldn't say that ... but by your criteria of efficiency I would be perfectly justified. In fact it has to be a bit of both - developing and educating - but my central point is that the standards writer is not responsible for take-up. It's great if they get involved - like some do on this list - but it's not an obligation.

Simpler specifications allow for less need for educating (because they are more direct and to the point) and more time others can devote to more creative endeavours. And on the flip-side, if you have a great tool for doing all of the trivial tasks of the web, why screw it up?

> > If every generation has to spend most of their adult life just learning everything that the previous generation created (but did not simplify) we will never have progress because human beings will be very old and very grey before they ever get the opportunity to even think original thoughts.
>
> Seems to me that actually each generation simplifies what its ancestors produced. And then it's the next generation who truly benefit from that. Nobody simplified (or complicated, depending on your standpoint (-:) Aristotle, Hegel, Marx, Einstein, Freud, and so on, in their own lifetimes. Yet today, most physics students can easily understand Einstein's theory of relativity.

Understanding through dictation is far different than understanding something and being able to build on it.
Some major advances in physics have come about since Einstein's death, but only a very small number of people on earth can do anything useful in terms of building upon Einstein's work. The last thing Einstein was working on was a simple formula for explaining in simple terms how the universe works. He never got that far, but even Einstein realized that

> > > No! The job of the discussion *about* a spec is to find agreement. Once that has been arrived at you need to codify that in a way that is unambiguous. It needs to be as formal as possible!
> >
> > If everyone told you to jump off a cliff would you do it?
>
> Very profound - but what are you on about?
>
> > Compromise on technical matters is one of the worst things you can do.
>
> So why do you want understandable standards? Surely if you're against compromise you don't want *any* standards?

No, I am saying that you should reach consensus on what works best, rather than just create something that makes everyone "feel good" but does not solve the ultimate problem. If you publish something called a standard, but it does not do the job, then you basically have nothing but a glorified piece of paper with a bunch of names on it.

> > You think people read all of these books and take these training courses because they want to? They take them usually because the technologies they use today are dictated to them. Java's success is an example of people writing code in Java because they "wanted to" because Java took out 85% of the pain in C++. Nevertheless, I have plenty of friends who still write C++ code on new projects using MS Visual C++ because "old grannies" at their institutions think everyone should be doing things the way they did them 10-15 years ago. They can't see the benefits of Java being a simpler version of C++, because they have not made an effort to do so.
>
> An evangelical approach, I think.
> I know Java and C++, and many other languages. I think Java is great, and definitely one of the nicest languages yet to be devised, but still I don't use it. And that's after a great deal of evaluation. Essentially it promises what it can't deliver. Anyway, I think you have to allow people credit for coming to serious conclusions, even if, as may sometimes happen, they disagree with you.

It depends on your problem domain I suppose. Millions use Visual Basic, even though some would hardly consider it a programming language. Yah Java does not do everything it promises, but I feel it does 95% of the stuff most developers need for most applications.

Tyler

From simonstl at simonstl.com Wed Feb 3 22:08:25 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:31 2004
Subject: 'Strong' status for XML
Message-ID: <199902032208.RAA29780@hesketh.net>

I just visited the top W3C page. 'XML' is listed under the Architecture section, as usual, but it's bolded. I checked out the source, and it gets <strong> while nothing else does.

Is this a promotion? An accident? It's kind of funny, anyway.

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From Mark.Birbeck at iedigital.net Wed Feb 3 22:38:12 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:31 2004
Subject: Fw: Namespaces
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D05494F@SOHOS002>

James, hope you don't mind me copying this to the group. I'd like to know what the other James thinks.

James Anderson wrote:
> James Clark's notation is readily extended to describe the same things as you desire. See my note in this thread. I suggest that it is to be preferred to the A.3 notation as it is compacter - it can, in a sense, be expressed 'in line'.
> For reasons which I do not fathom, (see his reply to same) he explicitly resists the extension and disregards the concepts established in appendix a.

I don't follow you James. Where does he disregard appendix A? On the extension, I think he is right to resist it - although to be fair it is only a shorthand, not a new standard! If you think about what he is representing with his current stuff, he has devised a simple way of giving us clues as to how some post-parser software might be able to use the data. Given:

  <n1:good a="1" />

and mapping it via James's syntax to:

  <{http://www.w3c.org}good a="1" />

the post-parser can say that 'good' is part of the http://www.w3c.org namespace. Equally, given:

  <good n1:a="1" />

and mapping it to:

  <good {http://www.w3c.org}a="1" />

James has given us a further insight into the mind of the post-parser; it now knows that 'a' is part of the http://www.w3c.org namespace too. In other words, James's syntax is so far an accurate representation of what the post-parser knows.
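
The mapping described above can be tried out with a small sketch. Python's standard xml.etree.ElementTree happens to report names in the same {uri}local form, so it serves as a stand-in for the "post-parser" view. The xmlns:n1 declaration is an assumption (the examples in the message leave it implicit), and the helper name expanded_names is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed prefix declaration; the examples in the message leave it implicit.
NS_DECL = 'xmlns:n1="http://www.w3c.org"'

def expanded_names(xml_text):
    """Return (element name, sorted attribute names) in {uri}local form."""
    elem = ET.fromstring(xml_text)
    return elem.tag, sorted(elem.attrib)

# Prefixed element, unprefixed attribute: only the element carries a URI.
print(expanded_names('<n1:good %s a="1" />' % NS_DECL))

# Unprefixed element, prefixed attribute: only the attribute carries a URI.
print(expanded_names('<good %s n1:a="1" />' % NS_DECL))
```

In the first case the unprefixed attribute comes back as plain 'a', with no namespace of its own, which is the situation the rest of the message goes on to discuss.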
However, if we now 'extend' his syntax to map:

  <n1:good a="1" />

to:

  <{http://www.w3c.org}good {{http://www.w3c.org}good}a="1" />

as you seem to want (or at least Oren did, and I guess you are agreeing with him although I can't find your original message), we've now expressed something that is no longer consistent with what the post-parser knows. Either we are saying that 'a' is a member of the namespace {http://www.w3c.org}good (if we treat the outer curly braces as representing a namespace name), but that is not a valid URI; or we are saying that 'a' is a member of the namespace 'good' which is a member of the namespace http://www.w3c.org, which is (sort of) closer to what is going on in the internal representation of the XML namespace (according to appendix A) but is meaningless at this level, because James is talking about global namespaces, and so we need a URI. James did spell that out when he introduced his shorthand. (Even if we allowed this, I would have thought that the syntax {http://www.w3c.org}{good}a would be preferable.)

In fact what the post-processor knows about the attribute 'a' is that it has no URI prefix and is part of the element 'good'. That *can be* expressed with James's syntax, provided that you keep 'a' with 'good' (as I said before, it is like an 'exploded' XML document):

  <{http://www.w3c.org}good a="1" />

but James's syntax cannot be used to express anything about 'a' outside of this, since you can't have:

  {{http://www.w3c.org}good}a

as we've already said, and:

  a

has lost all context. However, the extended attribute/element syntax I used - from Appendix A - at least *can* express something about 'a', without having to be in the original XML document:

  <ExpAName name='a' eltype="good" elns="http://www.w3c.org" />

Personally, I prefer the latter method anyway, because it makes it clear that we are looking at a different view on the data.
The XML 1.0 or parser 'view' of the data looks at elements and attributes, their values and their relationships to each other. The 'post-parser' view is of the same information but from the perspective of which namespace an element or attribute belongs to. It is possible that an element or attribute that has a direct relationship with an element in the first view has no relationship to that same element in the second view. Unfortunately, James's syntax mixes up these two 'views' of the data, by trying to superimpose the post-parser's view (of namespaces) onto the parser's view (of data).

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From peter at weblogic.com Thu Feb 4 00:54:01 1999
From: peter at weblogic.com (Peter Seibel)
Date: Mon Jun 7 17:08:31 2004
Subject: XML parsing techniques
Message-ID: <19990204010149913.AAA280@ashbury.weblogic.com@lawton>

Is there anyone out there who can characterize the problems/challenges/best practices when it comes to parsing XML? Looking (briefly) at the source of a couple parsers (Lark, Microsoft's, and XP) it looks like the parsers are some flavor of hand written recursive descent. (Well, Lark has that funky hand-coded DFA thing which I didn't really spend much time trying to grok -- that's not really recursive descent as I understand things.) Is there a reason no one seems to be using parser generators (like ANTLR or JavaCC)?
This may be more a question about the limitations of those tools which were designed for parsing things that look a lot more like Algol than XML does.

-Peter

P.S. Are there any parsers out there that actually return DOM objects?

--
Peter Seibel
Perl/Java/English Hacker
peter@weblogic.com

Is Windows98 Y2K compliant?

From cbullard at hiwaay.net Thu Feb 4 01:15:32 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:08:31 2004
Subject: Since we're talking about databases...
References: <000701be4f05$b9282800$5118a8c0@kuantech1.quokka.com>
Message-ID: <36B8F407.5008@hiwaay.net>

Jeffrey E. Sussna wrote:
>
> Please please please let this list not degenerate into a philosophical debate about the merits of various approaches to database design. I have been following the relational vs. object vs. object-relational debate for some ten years now, and the mere mention of Dr. Stonebraker's name still makes me shiver. :-)

The papers referenced are not about the philosophy of database design, but the performance of different database implementations, particularly that affected by implementing object handling in the kernel vs the use of middleware.

Len Bullard

From jborden at mediaone.net Thu Feb 4 02:50:59 1999
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:08:31 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
In-Reply-To: <36B819D3.B65C07F3@prescod.net>
Message-ID: <001801be4fe8$90127bd0$d3228018@jabr.ne.mediaone.net>

Paul Prescod wrote:

> > If I am using a DOM interface, it frankly doesn't matter what the serialization format is, I am interacting directly with data through an interface.
>
> You are interacting with data through an interface that was designed to provide access to the abstract data model of a *serialization*. In other words you are treating your database as if it were the result of parsing an XML document. You've put an "elements and attributes" interface on data that is much more complex than elements and attributes. If it were not more complex than elements and attributes we would not need stuff like XLink, HyTime, Namespaces and RDF to even *attempt* (and fail) to represent it.

Really!? And I thought I was interacting with an interface designed to represent the abstract data model of XML. How could we have drawn such disparate conclusions? I *don't* care what the intention of creating the DOM interface was, BTW, just as I don't care what the intention of creating the C language was. I can use tools and interfaces as I see fit. Does the DOM represent XML data in full fidelity? If not, should it be extended? I'm not sure that the data is so much more complex than can be represented by the DOM interfaces. Perhaps I have a different view than you do.
The way I see it, I can represent pretty much any piece of data using XML as I can with say an s-expr. The issue is only one of efficiency (as it has been with s-exprs). So, as I see it, if I put an efficient implementation behind an interface, it might work in ways you haven't imagined.

> The XML data model, whether a grove or a DOM is the "Forrest Gump" of representations for your data. Sending a dumbed-down message by Forrest Gump is good: he will relate it faithfully. Installing him as the only conduit for information is bad. You'll have to dumb down too much information and spend too much energy re-assembling it on the other side.

Really? And what is a more intelligent abstract data model? I'm not feeling too limited yet but perhaps I'm doing simplistic stuff. SGML has been around for years, but has never generated too much excitement. In contrast, XML. Sometimes simple is very very good. Sometimes, the simpler things get the more powerful they get.

> > I wouldn't suggest that the DOM replace ODBC, yet I'm quite sure that those experienced using a variety of systems with disparate data types and data usages will appreciate that certain types of data are best expressed in tree format. Such data scenarios might best be interfaced with via the DOM.
>
> You just need an API for "tree formats". Just ask your DBMS vendor to provide some tree-structured API. It doesn't matter if that API is the DOM because making it the DOM does *not buy you anything* as a programmer. From a programming point of view there is no benefit to working with a consistent API where everything is dumbed-down to a textual model. You might as well dumb everything down to an "object model." (see below)

There exist dozens of proprietary database interfaces.
Most people are willing to accept a common denominator which is mostly acceptable because the benefits gained by employing a widely accepted standard outweigh the benefits of a locally optimized proprietary interface. On the other hand, if I want to create an optimized high performance system I already have lots of options. Most of these do use object oriented programming, not that I can't do it all in assembler, but rather because of the benefits of maintainability and the reasonable performance of modern compilers. The benefits of object oriented programming are orthogonal to object databases.

>
> If you buy this, then guess what the hype will be in three years: "These new fangled data bases have this really cool feature, dude. You call it with a SQL9X query and it can return like OBJECTS!. Everything in the world can be expressed as objects! Lists of objects. Lists of objects. Trees of objects. Directed graphs of objects. Arbitrary graphs of objects. It like unifies everything as objects. It's Zen, man. They call it 'JDBC' and it's totally wicked."

What are you trying to say here? Are you criticizing objects?

>
> The *only benefit* of unifying things as DOMs is reusing software that was originally supposed to work with XML (i.e. XSL implementations). If you are writing new software it makes NO SENSE to do it through a DOM interface unless your data source is *XML*.

Suppose I want to process the data using XSL? Is this conceivably an acceptable reason to use a DOM interface (assuming I don't actually want to convert my database to serialized XML itself)?

>
> Otherwise, you should just define a "tree node" interface and have your various objects implement it. You will get all of the benefits of the DOM with none of the costs (i.e. how the hell do you represent complex properties of objects???). If you want some good hints about what a "tree node" interface looks like, take a look at the grove abstraction.
> ...
>
> Second, even *XSL* is not best served by a DOM representation. James Clark wrote an xsl-list article about that but I can't find it now. Remember that the DOM was invented as an extension of "DHTML." It's only half "there."

Certainly XSL is best served by a DOM representation if the data is presented via a DOM interface. The other option is to serialize everything. This makes no sense unless the DOM implementation is sub-optimal.

>
> But if I grant that some well-thought-through API for XSL trees could exist (i.e. Jade's grove API) then I would propose that it only be used as an optimization in a system where it would otherwise make sense to pass around serializations of text documents. i.e. the DOM is okay for skipping a layer of message passing. It is not okay as a "universal API" for "all of the data in an organization."

I never said this. There is a difference between saying that something is useful (i.e. placing a DOM interface on a database) and saying that this is the only way things ought to be done.

>
> To bastardize JWZ: "Sometimes people have a hard data unification problem. One part of their organization speaks a very different language (at the data model and object model level) than another part. They might think 'I can unify these with XML or the DOM.' Now they have two problems."
>
> Their second problem is that they didn't understand the really hard problem in their organization. Data model unification is *easy* (cast to java.lang.object or w3c.dom.node). Data model *rationalization* is very difficult. And I don't think that there are many shortcuts.
>
> > Are the DOM interfaces the best for all situations, clearly not. However if a significant percentage of people can agree to use them a significant percentage of the time, this is a big win.
>
> That's not going to happen. The DOM will NOT be a core tool for that majority of OO programmers this year, next year, or ever.
> Programmers will try it and increasingly find that if they are not doing XSL styling for the Web or print that the DOM is not a core tool. "Old-fashioned" OO can provide the same benefits.

What I am saying is that if these interfaces are accepted as a standard mechanism for representing XML data, that this in and of itself is a big win. XML is not close to replacing OO; in fact it seems to be struggling with OO concepts itself.

It is a big mistake to assume that XSL is only to be used for 'styling for the Web or print'. DSSSL as we know is based upon or employs Scheme. Scheme is a full fledged programming language, a dialect of LISP. XSL does not have the Scheme counterpart to DSSSL, rather it is itself its own programming language (albeit currently simplistic). XSL is the first XML programming language and employs XML in a way directly analogous to the way that LISP employs s-exprs. If I feel that I can express any data structure in XML that I can express as an s-expr, I see no good reason that, with proper extensions, XSL would not be able to handle tasks that Scheme or Lisp have been applied toward. I believe XSL has a great future.

If NN and IE contained native Lisp interpreters this might not be at all a big deal; what I find interesting is that a high percentage of desktops will, in the near future, gain the ability to run a data based transformation language. Perhaps the first use is an engine to display XML as HTML. I've developed a medical records system which employs XML and XSL to do relatively simple processing on the client (and have used the DOM interface for this BTW). As someone who has used many different programming languages and systems, I have been surprised at what can be done with the combination of XML, the DOM, XSL and JavaScript. The bottom line here is ease of development and deployment. This is my initial experience. Use your imagination.

Jonathan Borden
http://jabr.ne.mediaone.net

From donpark at quake.net Thu Feb 4 03:49:20 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:08:31 2004
Subject: Restricted Namespaces for XML
Message-ID: <000d01be4ff1$286876c0$2ee044c6@arcot-main>

This is not a proposal but a testing of the water to see if there is a wide enough need for a 'restricted' (some might call it constricted) version of the "Namespaces for XML" spec (let's call it RNX for now).

Such a spec might dictate that all namespace declarations be at the root element (XML fragments are problematic but...). This restriction has the side effect of not allowing duplicate prefixes.

Comments?

Don Park
Docuverse

From marcelo at mds.rmit.edu.au Thu Feb 4 03:51:26 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun 7 17:08:31 2004
Subject: What is a good database for very large collections?
In-Reply-To: <no.id>; from Borden, Jonathan on Mon, Feb 01, 1999 at 12:33:53PM -0500
Message-ID: <19990204144908.C20720@io.mds.rmit.edu.au>

On Mon, Feb 01, 1999 at 12:33:53PM -0500, Borden, Jonathan wrote:
> >
> > Can I try to shift it back to a vital question asked earlier, but not answered?
> >
> > What is a good database for XML?
SIM (http://www.simdb.com/sim_2.1/

> > The criteria are:
> > * over 20,000,000 document fragments, each less than 256 characters, each with some flat metadata, able to be incrementally reloaded onto the live system
> > * about 30 simultaneous users accessing about 10 fragments a minute each, grouped together (along with other dynamic data) and transformed, with a high need for immediate response

We can load about 200 MB per hour while live (actually I think we can load 400-500 MB/hr but we claim 200 MB to add a safety factor). We handle small documents quite well through DTD caching techniques (we also plan to include expat in the near future for unvalidated XML; we do currently support unvalidated XML, but through SP, which is not as fast as we'd like). Queries are fast (we queried "to be or not to be" across 55 GB in 74 seconds on a 2x336 MHz UltraSPARC with 1 GB RAM--note that this was a word position query using several stop words).

> How are the fragments selected? By query? If you can easily represent the 20M fragments in tabular form, and if you can easily represent the queries in SQL then a relational db is the way to go. this is not a particularly large, nor high-volume application for RDBMS.

And if you can't represent them in tabular form, try SIM.

> Ought you store the 20m fragments each in its own file ... probably not (a big waste). Ought you employ an ODBMS? not unless SQL wouldn't work well (you could always load it into say Oracle/SQL Server/DB2 etc vs. ODI/Poet etc and test it out). My expectation would be that if you need to run queries, the RDB will win.

For content queries (e.g. summary CONTAINS "stock option*") SIM will easily outperform an RDBMS. Customers have chosen our product above RDBMS's for this very reason.

> > * constant data-mining tools using various adhoc AI and linguistic retrieval software augmenting the metadata in the background.
We support stored queries and scheduled queries with filters to exclude previously returned records. I'm not sure if this meets the above requirement.

To say there are no scalable solutions (as someone did recently on xml-dev) is simply false. There may be no scalable solutions that do everything you want--and I'm certainly not touting SIM as the be-all and end-all (we have yet to support XQL, full path indexing, transactions, etc. all are pending with varying levels of priority)--but there are products available right now that scale and solve people's problems. SIM has been used in law (http://www.thelaw.tas.gov.au is the world's first legislation to officially go online), taxation (http://www.ato.gov.au/general/advanced/adv.htm), other government (libraries, NSA--no URL, sorry :-), aviation (Boeing), etc. Moreover, our customers don't go away dissatisfied. We are quite proud of the fact that every SIM site is a reference site. We are also pleased that in some instances, project managers have been promoted as a result of using SIM!

Cheers,
Marcelo Cantos
SIM developer -- http://www.simdb.com/~marcelo/

From mda at discerning.com Thu Feb 4 04:06:21 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun 7 17:08:31 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
Message-ID: <03ae01be4ff3$772a4a20$0200a8c0@mdaxke.mediacity.com>

Very interesting discussion. Let me try to "echo back" what I'm hearing.

On one side, there are those who think of XML primarily as a serialized representation of The Real Thing.
They decry notions of programming via the DOM for two reasons:
- (eliot kimber) The XML data model that you would use programmatically may not actually match the data model of The Real Thing.
- (paul prescod) Using the DOM or other XML-level manipulation is a level of indirection from manipulating The Real Thing, and is hence more obscure and inefficient.

Then on the other side (not that I'm trying to make this a battle), there are those who say: um, ok, but trust me, there are cases where I really don't want to deal with The Real Thing because in my application there are actually a lot of heterogeneous Real Things. Or maybe just one, but it is unbelievably painful to deal with (say, IBM CICS). Or maybe some demented person is keeping the Real Thing itself in xml. And I would like to have a "standard" way to address that model -- meaning some way to read and possibly write it from a programming language. (Does this include simon sl, tim b, david m?)

Then on the other side (I'm just reflecting how these discussions go :)) there are those who say well, ok, but it is futile to try to find such a mapping that works on all data models (including both tabular style and document style) from our many varied programming languages. And to the extent that it is possible, it already exists (list your favorite set of perl grove utilities here).

Just to add my own $.02, I feel that it *is* possible to arrive at a common API, with automatic transformed equivalence among programming languages. It isn't the DOM, or just the DOM, because the DOM forces programming based on things like "getAttribute", not "employees.next().name". Sort of like doing lisp programming entirely with "car" and "cdr".
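
A minimal sketch of the contrast drawn above, using Python's standard xml.dom.minidom for the "getAttribute" style. The sample document, the Bound wrapper class and the employees helper are all made up for illustration; they only show the shape of the two styles, not any existing binding API.

```python
from xml.dom.minidom import parseString

# Made-up sample data for illustration.
DOC = '<employees><employee name="Ann"/><employee name="Bob"/></employees>'

# DOM style: everything goes through generic calls such as getAttribute.
dom = parseString(DOC)
dom_names = [e.getAttribute("name") for e in dom.getElementsByTagName("employee")]

class Bound:
    """Hypothetical wrapper exposing XML attributes as Python attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if self._element.hasAttribute(name):
            return self._element.getAttribute(name)
        raise AttributeError(name)

def employees(document):
    """Iterate over employee elements as bound objects."""
    return (Bound(e) for e in document.getElementsByTagName("employee"))

# Bound style: reads like ordinary object access, close to employees.next().name.
first = next(employees(dom))
print(dom_names, first.name)
```

The wrapper resolves names at run time, which is why this kind of binding comes most naturally in dynamic languages.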
However, I think that it can't be done until things like XLink and DCD-the-next-generation are squared away, and it will be mostly useful to scripted languages that can do such dynamic binding -- though i expect that a gen utility could make .h files to allow similar programming from C/C++, in much the same way that some object-relational gateway products work. -mda xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Thu Feb 4 04:13:25 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:31 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: <001801be4fe8$90127bd0$d3228018@jabr.ne.mediaone.net> Message-ID: <36B91D93.83627AFE@manhattanproject.com> Jonathan Borden wrote: > What are you trying to say here? > Are you criticizing objects? You can't always treat a stream as object. If you do, you loose significant power. > Suppose I want to process the data using XSL? Is this conceivably an > acceptable reason to use a DOM interface (assuming I don't actually want to > convert my database to serialized XML itself). I would see this as the last thing you would want to do. However, I don't have XSL experience, so someone with real-world experience would be a better spokesperson. DOM requires the entire stream be read before the the document object is returned and processing can begin. Not only does this chew significant memory for very large streams, but it causes significant delay before output could be generated. 
In the worst case, it turns a perfectly simple problem into an "impossible" one where the memory requirements and time delay make the solution useless. If the stream is only going to be "filtered", why read the entire thing into memory before starting the transformation process (in this case filtering)? > Certainly XSL is best served by a DOM representation if > the data is presented via a DOM interface. I would speculate to the contrary, and would think that driving XSL with SAX would be a far better choice. > The other option is to serialize everything. No. The option is to move to Event based processing of streams. You can then model with "event objects". > This makes no sense unless the DOM implementation is sub-optimal. No. It's a computational complexity issue. For a decent size stream, with a transformation that can be done in a single-pass (XML->HTML), no DOM implementation will even come close to an implementation using SAX. Crunch some numbers. If you're still not convinced, read Abelson, Structure and Interpretation of Computer Programs, ISBN 0-07-000484-6, Section 3.5.1, page 317. There he talks about: "severe inefficiency with respect to both time and space". Best, Clark Evans xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Thu Feb 4 04:39:29 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:32 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
In-Reply-To: <36B91D93.83627AFE@manhattanproject.com> Message-ID: <001f01be4ff7$bbc30830$d3228018@jabr.ne.mediaone.net> Clark Evans wrote: > > DOM requires the entire stream be read before the > the document object is returned and processing can begin. > Not only does this chew significant memory for very large > streams, but it causes significant delay before output > could be generated. In the worst case, it turns a > perfectly simple problem into an "impossible" one > where the memory requirements and time delay make > the solution useless. You are missing the point. Since the data is already deserialized, there is no delay due to processing. The idea is that the data has already been entered into a database which implements a DOM interface (the database is free to implement other interfaces as well). I never claimed that the data was entered as XML or that a serialized XML document ever existed. Your entire argument assumes that there is a need to parse XML into an in-memory DOM during the processing phase. This is incorrect. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Thu Feb 4 04:54:43 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:32 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: <001801be4fe8$90127bd0$d3228018@jabr.ne.mediaone.net> Message-ID: <36B92536.D5BF3AA0@prescod.net> "Borden, Jonathan" wrote: > > > You are interacting with data through an interface that was designed to > > provide access to the abstract data model of a *serialization*. > Really !? 
and I thought I was interacting with an interface designed to > represent the abstract data model of XML. Right. XML is a serialization. The DOM is an abstraction of a serialization. Not an abstraction over your data. If your "problem" is representing debit card bank accounts the proper abstraction over that is "bank account" or "currency account". The *wrong* abstraction is "elements and attributes." > How could we have drawn such > disparate conclusions. I *don't* care what the intention of creating the DOM > interface was BTW just as I don't care what the intention of creating the C > language was. I can use tools, interfaces as I see fit. Does the DOM > represent XML data in full fidelity? No. It isn't meant to. > If not, should it be extended? No. It isn't meant to. > I'm not sure that the data is so much more complex than can be represented > by the DOM interfaces. XML is very simple. "All the world's data" is very complex. That's why we need XLink, RDF, HyTime and a bunch of other stuff. If your API to "your data" is simply the DOM then you are temporarily hiding its complexity behind a simple layer that can *NOT* express its "linkedness", its complex class relationships, is geographical 2D-ness etc. I don't know your data. I don't know what makes it complex but if your job is interesting it IS complex and the DOM does not help you to manage that complexity. We have 20 years of software engineering that DOES help us to manage that complexity and its most recurring message is "abstraction, abstraction, abstraction." Dumbing data down to a DOM is the opposite. "If we make the trees large enough then a forest and an orchard will look the same." Too bad you still can't see the forest for the trees. > Perhaps I have a different view than you do. The way > I see it, I can represent pretty much any piece of data using XML as I can > with say an s-expr. I've never claimed otherwise. The question is whether XML presents a good *API* to that data. > Really? 
And what is a more intelligent abstract data model? Object orientation. > > If you buy this, then guess what the hype will be in three years: "These > > new fangled data bases have this really cool feature, dude. You call it > > with a SQL9X query and it can return like OBJECTS!. Everything in the > > world can be expressed as objects! Lists of objects. Lists of objects. > > Trees of objects. Directed graphs of objects. Arbitary graphs of objects. > > It like unifies everything as objects. It's Zen, man. They call it 'JDBC' > > and its totally wicked." > > What are you trying to say here? Are you criticizing objects? No. I am saying that the only "API" that can unify all of the complexity of all of the data in an enterprise is "object orientation" or perhaps something even more powerful. Dumbing everything down into elements and attributes is not a step forward, but a step backward. The DOM itself is evidence of this. Note that the DOM's creators did not make a CSS DOM by representing CSS in terms of elements and attributes. They made a whole new set of methods and properties. They used *object orientation* not XML. The CSS DOM has absolutely nothing to do with XML *as it should not*. > Certainly XSL is best served by a DOM representation if the data is > presented via a DOM interface. The other option is to serialize everything. > This makes no sense unless the DOM implemention is sub-optimal. No. The other option is to make an API that takes into account the needs of XSL implementation. > What I am saying is that if these interfaces are accepted as a standard > mechanism for representing XML data, that this in and of itself is a big > win. XML is not close to replacing OO in fact it seems to be struggling with > OO concepts itself. The DOM is pretty good at representing XML data. If that is all you want to say, fine. The point of this thread is that there are people who think that the DOM is great at representing data of all sorts. 
You should just throw a DOM interface over your database and all of your interoperability problems will go away. > It is a big mistake to assume that XSL is only to be used for 'styling for > the Web or print'. DSSSL as we know is based upon or employs Scheme. Scheme > is a full fledged programming language, a dialect of LISP. XSL does not have > the Scheme counterpart to DSSSL, rather it is itself its own programming > language (albeit currently simplistic). XSL is not a programming language according to the Turing/Church definition. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Thu Feb 4 05:35:40 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:32 2004 Subject: Restricted Namespaces for XML References: <000d01be4ff1$286876c0$2ee044c6@arcot-main> Message-ID: <36B93164.5956509B@infinet.com> Don Park wrote: > This is not a proposal but a testing of the water to see if there is a wide > enough need for a 'restricted' (some might call it constricted) version of > the "Namespaces for XML" spec (lets call it RNX for now). > > Such a spec might dictate that all namespace declarations be at the root > element (XML fragments are problematic but...). This restriction has the > side effect of not allowing duplicate prefixes. > > Comments? 
Of course this is very much like the old PI approach, which may have been brutally simple, but this I think was a good thing. There are many other ways of assuring that all Names in a document are unique that make much more sense than "Namespaces in XML", the simplest being that element types have some unique identifier prepended to your general name type. Perhaps this adds a bit to the total size of XML documents. For example if I wanted to represent some element that mapped to a java.awt.Rectangle class I might have something like: <java.awt.Rectangle x="0" y="0" width="100" height="100"/> In this case the convention is package name + '.' + class name. With a DTD you are already there. It is really that simple. If you want to use a domain name plus some path, you could have something like: <www.amazon.com:books:history title="The History Of Europe"/> where you merely replace path separators with ':'. Of course namespaces (the old simple PI based approach) I think would help a lot in search engines as they would only have to scan the prolog of the document to see if any well-known namespace elements are used in the document. Perhaps we can take a vote on this list to see whether people like the old PI based namespaces proposal better than how things are currently defined. Nevertheless I really like this RNX idea you are proposing. I am not against namespaces per se (most XML applications I don't think will ever need any namespaces mechanism though), but I think that the final decision makers of "Namespaces in XML" somehow neglected the myriad of already noted problems with "Namespaces in XML" as things now stand. If they will ignore the total lack of consensus on this issue, then it is likely their efforts in the future will be ignored.
Right now I have to spend a lot of otherwise unnecessary time dealing with implementing a very complex, somewhat inefficient hack to implement namespaces in an XSL Processor I have been working on so that I can have support for the DOM as well. The alternative is no DOM source tree support or else building a proprietary source tree structure as is the case with XT. Koala and LotusXSL have support for namespaces, but they have chosen an implementation path I chose to ignore a long time ago because it causes major performance problems. At this stage in the game with XSL, I doubt that Jeremy Calles or Scott Boag care much about performance. Most XSL users I feel now are very early adopters and are just getting familiar with stylesheets and the XSL processing model, so performance is not such a big issue to get all excited about at the moment. Right now XT is the only released XSL processor that has decent performance IMVHO; however, since XT has no DOM support at all, I am not sure how useful it is in most production environments at this point (XSL is still in its infancy so a lot of things could change in the future that may cause all XSL Processor developers to totally change their design altogether). Since XSL still has 6 months before it will probably become a recommendation, none of these namespaces issues I feel matter too much right now, but if they are not addressed sometime soon, they will cause major problems down the road. Of course the DOM could change to provide native support for namespaces, but Level 1 is a recommendation now (i.e. immutable). My faith in the W3C right now is at an all-time low, so my guess is that the DOM will not change, "Namespaces in XML" will not change, and XSL plus the DOM will be a pain to deal with because of needing to support "Namespaces in XML" without incurring too much overhead. So in the end XSL users suffer by needing to throw extra hardware at the problem to use XSL to operate their web sites and that is a shame.
A member on this list (I forget who) was talking about why they use expat and not XSL right now and their number one concern was performance (they basically said that they were getting 120X performance gains with the expat approach which no matter how you cut it is significant). I am not sure how much overhead namespaces support will add right now to my particular XSL implementation (it is hard to say because I am using a trick/hack that is hard to pin down in terms of O notation), but I would not even have to deal with this if the old PI based approach was used. Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Thu Feb 4 05:46:04 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:08:32 2004 Subject: Fw: Namespaces References: <A26F84C9D8EDD111A102006097C4CD0D05494F@SOHOS002> Message-ID: <36B9302C.390A6F70@jclark.com> Mark Birbeck wrote: > The XML 1.0 or > parser 'view' of the data looks at elements and attributes, their values > and their relationships to each other. The 'post-parser' view is of the > same information but from the perspective of which namespace an element > or attribute belongs to. I would say that the difference between the view that the parser gives you and the view that the post-parser (namespace processor) gives you is simply that, in the post-parser view, element and attribute names can be qualified by a URI. There is no need to make it any more complex than that. All this stuff about namespaces is just unnecessary, confusing complexity that invites the over-analysis that is so prevalent in this forum. 
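The "names qualified by a URI" view can be seen directly in Python's standard `xml.etree.ElementTree`, which reports every element name in `{uri}local` form. The sketch below is only an illustration, not part of the spec or of the note referred to in this message: the two sample documents and the helper name `universal_names` are made up, and the URI is just a placeholder identifier.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents that differ only in how the prefix is written.
DOC_A = '<h:html xmlns:h="http://www.w3.org/TR/REC-html40"><h:p/></h:html>'
DOC_B = '<html xmlns="http://www.w3.org/TR/REC-html40"><p/></html>'


def universal_names(xml_text):
    # ElementTree resolves prefixes while parsing and exposes each tag as
    # "{namespace-uri}local-name"; the prefix itself is not kept in the tag.
    return [element.tag for element in ET.fromstring(xml_text).iter()]
```

Under this view `DOC_A` and `DOC_B` yield identical name lists, which is the whole of what the post-parser view adds: the prefix is syntax, the URI-qualified name is the information.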
If the spec was called something like "XML Universal Names" and never mentioned the word "namespace" and didn't include Appendix A (which thankfully is not normative), absolutely no functionality would have been lost and I think there would have been far less confusion. I've updated my note http://www.jclark.com/xml/xmlns.htm to try to address some of the comments I've received. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Thu Feb 4 06:00:15 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:32 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: <001801be4fe8$90127bd0$d3228018@jabr.ne.mediaone.net> <36B91D93.83627AFE@manhattanproject.com> Message-ID: <36B936DF.495F9CAE@infinet.com> Clark Evans wrote: > Jonathan Borden wrote: > > What are you trying to say here? > > Are you criticizing objects? > > You can't always treat a stream as object. If you do, you > loose significant power. > > > Suppose I want to process the data using XSL? Is this conceivably an > > acceptable reason to use a DOM interface (assuming I don't actually want to > > convert my database to serialized XML itself). > > I would see this as the last thing you would want to do. > However, I don't have XSL experience, so someone > with real-world experience would be a better spokesperson. > > DOM requires the entire stream be read before the > the document object is returned and processing can begin. > Not only does this chew significant memory for very large > streams, but it causes significant delay before output > could be generated. 
In the worst case, it turns a > perfectly simple problem into an "impossible" one > where the memory requirements and time delay make > the solution useless. This is only true if you are building the DOM from a file. What if you are building up the DOM Document programmatically, or else the DOM is merely an interface to structured data in a DBMS? > If the stream is only going to be "filtered", why read > the entire thing into memory before starting the > transformation process (in this case filtering)? In some cases, this is possible and even desirable. In the case of XSL the spec enforces constraints which make it impossible to be able to properly process an XML document unless it has been fully parsed into an in-memory tree structure (for most people this will be the DOM). > > Certainly XSL is best served by a DOM representation if > > the data is presented via a DOM interface. > > I would speculate to the contrary, and would think that > driving XSL with SAX would be a far better choice. No way. You are totally throwing out all of the applications that create a DOM document programmatically (such as through scripting). The alternative is to build a DOM document programmatically, write it out to XML, reparse it with an XML parser, and then process the document as SAX parsing events. This is an extra layer of indirection that is otherwise totally unnecessary if you use the DOM. The only other option is to take the DOM (an already parsed in-memory tree), parse it into SAX events using something like SAXON, and then reparse things back into another entire custom source tree. Again, this is another layer of indirection, which in languages like Java that are particularly sensitive to unnecessary object allocation is a major cost in processing an XML document. > > The other option is to serialize everything. > > No. The option is to move to Event based processing > of streams.
> You can then model with "event objects". You still need to waste time recreating a source tree when you already have the DOM. Writing code to recursively spit out SAX parse events directly (instead of building a tree first) is not an easy chore and in many cases is totally impractical (you need to do everything sequentially in document order to make things work). > > This makes no sense unless the DOM implementation is sub-optimal. > > No. It's a computational complexity issue. For a > decent size stream, with a transformation that can > be done in a single-pass (XML->HTML), no DOM > implementation will even come close to an implementation > using SAX. Crunch some numbers. Already have. My results are quite the opposite. The most significant overhead to using the DOM is quite frankly dynamic method overhead for node iteration. With respect to Java and future optimizing compilers, this will become less and less of an issue. The costs for object allocation in one way or another will always be there. > If you're still not convinced, read Abelson, Structure and > Interpretation of Computer Programs, ISBN 0-07-000484-6, > Section 3.5.1, page 317. There he talks about: > > "severe inefficiency with respect to both time and space". In general I would agree with your assertions, but it is an engineering fact that in languages like C++ and Java, object allocation tends to be the number one performance bottleneck in a lot of applications that are not careful about reducing this overhead. One of the most famous examples in Java is in the Java AWT when calling getSize() over and over in paint methods.
Doing this can bring some apps to a crawl as the code of java.awt.Component.getSize() looks something like this: public Dimension getSize() { return new Dimension(width, height); } java.awt.Component in JDK 2 now has getX(), getY(), getWidth(), and getHeight() methods to reduce unnecessary object allocation, which for some GUI apps has made a major difference in paint routines. Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Thu Feb 4 06:13:48 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:32 2004 Subject: Another errata? Message-ID: <3.0.32.19990203220702.00b40a80@pop.intergate.bc.ca> At 01:40 PM 2/3/99 -0500, Tyler Baker wrote: >Once you read a >document into memory and no longer preserve the original prefixes (or >rather the QName), when >you write the document back out (which has possibly been mutated) where do >you get these >prefixes? Do you simply invent them in the form a, b, c, ..., aa, ab, ac, >... etc. Exactly. I can't imagine why you think this is hard. > I suppose >the people in the "Namespaces in XML" feel that once you read in a document, you throw it >away. When I read something this ludicrous, I stop reading. -Tim xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Thu Feb 4 06:17:30 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:32 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: <001f01be4ff7$bbc30830$d3228018@jabr.ne.mediaone.net> Message-ID: <36B93A9A.EEC43FFB@manhattanproject.com> "Borden, Jonathan" wrote: > You are missing the point. Since the data is already deserialized, there is > no delay due to processing. The idea is that the data has already been > entered into a database which implements a DOM interface (the database is > free to implement other interfaces as well). I never claimed that the data > was entered as XML or that a serialized XML document ever existed. Oops! Sorry about not getting the context right. *blush* Let me try again. Anyway, I see one context where putting a DOM interface on a relational database would be great, and I see another context where I don't think it would be too hot. Interactive "generic" organizational navigator ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ I did something like this at Ford. I found that organizational navigation has these characteristics: a. Only a small fraction of the information in the database is queried. b. It is interactive, the system is best modeled as a bunch of small queries rather than one big query. c. It is hierarchical in nature (Xrefs are rare) as the user goes down a branch, you store all of the primary keys on the stack. The big problem with the implementation was that the mapping from a custom relational-database model to the DOM model is, let's say, non-trivial. 
On the flip side, however, things are much more cheerful. A wonderful, re-usable CORBA/DOM client could be built (with very good market potential). This results in great re-use not only from a software perspective, but also from a training perspective. Query Extraction Layer ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ I can also see an XML stream as the result from a query. In this case, the de-normalized, hierarchical view of the information resulting from a nested query is the stuff reports are made of. I picture the client using XSL on the resulting stream to generate a report. In effect, the XML output stream becomes what you would normally call a view. Then, each "user" can filter/format the information as they would like using a standardized XSL based client. I don't think this is new... and I'm looking forward to mature technology doing this. However, I'm not sure about using DOM as an interface to a relational database in this context. The way Oracle and other databases compute large queries with subordinate tables is, at its heart, stream-oriented. Many of these databases will "drive" off a single table sequentially, collecting the information from the subordinate tables as the query progresses. Therefore, I picture a DOM implementation using thousands of nested queries to generate the same tree that a few large queries would have handled nicely. In this case, the database engine would not be able to take advantage of aggregate indexing and elimination algorithms, in effect negating the benefits of having corporate information in a relational database. *smile* Anyway, I just can't picture using DOM in this context as an interface to a relational database. For this case I feel that a stream-oriented solution on the server with an object-oriented event processing system on the client is the better approach. :) Clark xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Thu Feb 4 07:02:24 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:32 2004 Subject: A weaker XSL? (Was: Storing Lots of Fiddly Bits ) References: <001801be4fe8$90127bd0$d3228018@jabr.ne.mediaone.net> <36B91D93.83627AFE@manhattanproject.com> <36B936DF.495F9CAE@infinet.com> Message-ID: <36B94536.2E9272C8@manhattanproject.com> Tyler Baker wrote: > In the case of XSL the spec enforces constraints which make it impossible > to be able to properly process an XML document unless it has been fully > parsed into an in-memory tree structure (for most people this will be the DOM). Wow! Not what I had expected. I guess I have much to learn. *smile* Anyway, I have a negative gut reaction to such a strong requirement. It comes from being burned on many occasions -- it is very easy to underestimate the memory size and processing time required to translate a stream into an object for further manipulation. Is there a possibility for creating a sub-set of XSL that would work on a stream instead of requiring a complete document object? I picture a database doing all of the sorting and other non-stream operations before/as the XML is created. Thus, the sub-set of XSL should be capable of being driven from a SAX equivalent stream event observer. :) Clark Evans xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murata at apsdc.ksp.fujixerox.co.jp Thu Feb 4 08:05:19 1999 From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto) Date: Mon Jun 7 17:08:32 2004 Subject: Control Characters Message-ID: <199902040804.AA03400@murata.apsdc.ksp.fujixerox.co.jp> >Why are the control characters x80-x9F allowed in XML character data, while >x0-x8,xB,xC,xE-x1F are illegal? Is it that the illegals have meanings that >XML does not support? Just wondering. I am afraid that you are right. XML should have disallowed C0 control codes and C1 control codes except CR, LF, and HT, since the Unicode standard does not define semantics of these control codes. U+007F should have been disallowed as well. However, I do not think that this causes practical problems. Cheers, Makoto Fuji Xerox Information Systems Tel: +81-44-812-7230 Fax: +81-44-812-7231 E-mail: murata@apsdc.ksp.fujixerox.co.jp xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Thu Feb 4 08:08:30 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:32 2004 Subject: Another errata? 
References: <3.0.32.19990203220702.00b40a80@pop.intergate.bc.ca> Message-ID: <36B9551D.5974523F@infinet.com> Tim Bray wrote: > At 01:40 PM 2/3/99 -0500, Tyler Baker wrote: > >Once you read a > >document into memory and no longer preserve the original prefixes (or > >rather the QName), when > >you write the document back out (which has possibly been mutated) where do > >you get these > >prefixes? Do you simply invent them in the form a, b, c, ..., aa, ab, ac, > >... etc. > > Exactly. I can't imagine why you think this is hard. But then the XML output is not what I would call readable anymore, which violates design goal number 6 in the XML 1.0 specification: "XML documents should be human-legible and reasonably clear." Once a document has been read into memory it may lose all of the original structure (maybe transformed is not the right word here). You no longer preserve the prefix names that the original author of the content (or even DTD) had intended. What you write out to XML may no longer conform to any DTD. > > I suppose > >the people in the "Namespaces in XML" feel that once you read in a document, you throw it > >away. > > When I read something this ludicrous, I stop reading. -Tim What I mean is simply, once you read in a document, you process it and then you are done with it. I suppose "throwing it away" was a little inaccurate. Sorry for the misunderstanding here, Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matt at veosystems.com Thu Feb 4 08:09:50 1999 From: matt at veosystems.com (matt@veosystems.com) Date: Mon Jun 7 17:08:32 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
In-Reply-To: <36B92536.D5BF3AA0@prescod.net> from "Paul Prescod" at Feb 3, 99 10:42:30 pm
Message-ID: <19990204080821.21049.qmail@veosystems.com>

A non-text attachment was scrubbed...
Name: not available
Type: text
Size: 3698 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990204/2d7f48ef/attachment.bat

From donpark at quake.net Thu Feb 4 08:17:42 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:08:32 2004
Subject: Restricted Namespaces for XML
Message-ID: <003101be5016$ad13c850$2ee044c6@arcot-main>

If I read James Clark's message correctly, I believe he is in favor of chopping out some features out of the "Namespaces for XML" as well as some sections so I believe we might be on a good track.

Another crazy idea I had was to remove the use of URI while keeping the basic style. The idea is to just use prefixes as the qualifier. I don't think the need for universally unique names is paramount. Default namespace can be defined by stating what the default prefix is. For example,

    <html xmlns="html" xmlns:html="html" xmlns:ck="ck">
      <head>
        <ck:cookie>some cookie info</ck:cookie>
      </head>
    </html>

All the namespace declarations must be at the root element and prefixes are global to the document (well, within the root element). The result is something that conforms to the "Namespaces for XML" yet easy to understand and use although some flexibilities are lost.

[snip]

>that is a shame. Lets try to make some drinkable wine out of sour grapes.

I am the sort of guy whose adrenaline kicks in when everything seems to go wrong. It doesn't matter if the result sucks as long as you do your best and not get sidetracked by useless negative emotions.

Best,

Don Park
Docuverse
From spreitze at parc.xerox.com Thu Feb 4 08:33:16 1999
From: spreitze at parc.xerox.com (spreitze@parc.xerox.com)
Date: Mon Jun 7 17:08:32 2004
Subject: A New Hope (was Re: Storing Lots of Fiddly Bits (was Re: What is XML for?))
In-Reply-To: <19990204080821.21049.qmail@veosystems.com>
Message-ID: <99Feb4.003256pst."834439"@idea.parc.xerox.com>

Right! I think a significant part of the problem here is that people are realizing that XML's data model is not as expressive as they'd like. For example, XML's entity structure looks like a semi-labelled graph (vertices are labelled (with entity tags) but edges are not labelled), whereas many other data models (e.g., RDF) let you label both the edges and the vertices.

Sure, you can encode any other data model into XML, and we'll probably have to as long as we're dealing with XML 1.0. In fact, I've already seen strong proponents of multiple, different ways of encoding fully labelled graphs into XML. But programmers would rather deal with application-domain constructions in more expressive data modelling systems (e.g., one that supports fully labelled graphs). No amount of XML-to-XML transformation and/or efficiency hacking is going to change this.

It seems to me that one plausible way out of this conundrum is for the XML Schema WG to recognize (1) the need for schemas to be written in terms of more expressive data modelling systems, (2) the need to support a variety of encodings of those data models into XML, and (3) the need for a schema to describe a particular encoding (the one desired for the schema at hand) of the schema's data model into XML.
Mike

From oren at capella.co.il Thu Feb 4 09:50:30 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:08:32 2004
Subject: Fw: Namespaces
Message-ID: <007f01be5023$1a9d33a0$5402a8c0@oren.capella.co.il>

OK, I see I'm totally lost here.

Question:

Is it possible to recast the namespace recommendation as a transformation from an XML tree with 'xmlns' attributes and '...:' prefixes into a tree which doesn't have them, but with modified element and attribute names, such that the semantics of the resulting tree under the rest of the relevant recommendations (ignoring namespaces) is preserved? One would of course have to pass the DTDs (or other schema files) through the same transformation. Note that this may require defining a textual form for the transformed tree (using "...^...", or "{..}..", or whatever).

If so, then we'll have a clear definition of just how to add a namespace processor on top of a normal XML processor. The reverse transformation could be used for emitting XML trees. The rest of the XML standards could pretty much ignore namespaces altogether. The endless threads of "what are namespaces this week" would go away. Yes, I know, I'm dreaming (or raving :-)

If this isn't possible, and from what I'm reading this seems a real possibility, then I'd like to know the details - in particular, I want to hear the advantages gained by this decision. Is it some attempt to mix together namespaces and other issues - such as mixed content, combining documents, extending DTDs, etc.?
MVHO is that such issues should be solved separately of the unique naming issue, but I realize that the W3C has a tradition of mixing together issues which seem separate to me :-)

James' document uses the transformation approach. It stops short of claiming the preservation of semantics. When I asked about a particular equivalence issue (relationship between an attribute name and an element name, given that none/one/both of them are expanded), he said:

> I would prefer not to answer this since I don't think the XML Namespaces
> Recommendation needs to take a position on this. All the Namespaces
> Recommendation does is provide a mechanism which allows element type
> names and attribute names to be qualified with a URI. How other
> applications or specifications (such as RDF or XML Schemas) choose to
> exploit this mechanism is up to them.
>
> However, if I was forced to answer, I would say that the relationship
> was not the same.

These two statements seem contradictory to me - if "all" the namespaces do is prefix the names with a URI, why should the relationship between expanded names be different than that between "normal" names? If this relationship is "application defined" for normal names (as James implies), then doesn't it remain "application defined" when the names are expanded? Anyway, how come it is "application dependent" - don't DTDs and schema languages have a lot to say about it?

My head is starting to ache.

Way in over my head,

Oren Ben-Kiki
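The transformation asked about above can be tried out with an off-the-shelf parser. What follows is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module (which the thread does not mention) and made-up example.org URIs. The parser drops the xmlns declarations and the prefixes and reports each name in a "{uri}local" textual form, so two documents that differ only in their choice of prefixes yield identical names. The helper name expanded_names is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: same namespace URIs, different prefixes.
DOC_1 = (
    '<a:doc xmlns:a="http://example.org/a" xmlns:b="http://example.org/b">'
    '<b:item a:id="1" plain="2"/>'
    '</a:doc>'
)
DOC_2 = (
    '<x:doc xmlns:x="http://example.org/a" xmlns:y="http://example.org/b">'
    '<y:item x:id="1" plain="2"/>'
    '</x:doc>'
)


def expanded_names(xml_text):
    """Return (element names, attribute names) after namespace processing."""
    root = ET.fromstring(xml_text)
    elements = [elem.tag for elem in root.iter()]
    attributes = sorted(name for elem in root.iter() for name in elem.attrib)
    return elements, attributes


elements_1, attributes_1 = expanded_names(DOC_1)

print(elements_1)    # element names in "{uri}local" form
print(attributes_1)  # the xmlns declarations do not show up as attributes
print(expanded_names(DOC_1) == expanded_names(DOC_2))  # prefixes do not matter
```

The unprefixed attribute keeps its bare name in this form, while the prefixed one becomes a URI-qualified name.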
From nate at valleytel.net Thu Feb 4 10:17:19 1999
From: nate at valleytel.net (Nathan Kurz)
Date: Mon Jun 7 17:08:33 2004
Subject: A weaker XSL?
In-Reply-To: <36B94536.2E9272C8@manhattanproject.com> from "Clark Evans" at Feb 04, 1999 06:59:02 AM
Message-ID: <199902041016.EAA06857@trinkpad.valleytel.net>

Clark Evans wrote:
> DOM requires the entire stream be read before
> the document object is returned and processing can begin.
> Not only does this chew significant memory for very large
> streams, but it causes significant delay before output
> could be generated. In the worst case, it turns a
> perfectly simple problem into an "impossible" one
> where the memory requirements and time delay make
> the solution useless.

Tyler Baker wrote:
> In this case of XSL the spec enforces constraints which make it
> impossible to be able to properly process an XML document unless it
> has been fully parsed into an in-memory tree structure (for most
> people this will be the DOM).

I apologize for jumping into this so late, but these two statements have me a bit worried. Are they both true? I'm still in the XML fiddling around stage, but I've read both of the specs and didn't draw the same conclusions. I was hoping these specs allowed for more wriggle room than this.

The first statement regarding the DOM model seems true word by word, but seems a bit misleading. First, couldn't a bit of judicious preprocessing allow some of those very large streams to be made into very small streams before the DOM model is built?
The spec says that the DOM structure model is 'structurally isomorphic' to the document, but surely this document doesn't have to be a pre-existing XML file?

Also, while the entire stream has to have been read, does it have to have already been processed? The way I was interpreting the spec, the DOM model didn't exclude a lazy processing method. So long as an implementation provides a compliant interface, can't it do anything it wants with the data, even so far as to put off processing information until it is requested?

I had hoped for an extremely lazy DOM implementation that would maintain information about all but the root level nodes in a 'flat' unprocessed state until a request for that information is made. For many cases (well, at least the ones I'm envisioning) such an implementation would be much more efficient than an entirely pre-processed one. Is this sort of implementation just right out of the question?

As for the second statement (regarding XSL), could these constraints be more explicitly laid out? While I can see that arbitrary XSL might require a fully constructed tree, couldn't one come up with many cases where a partially constructed tree would be sufficient? For example, what if your style sheet had only the following template:

    <xsl:template match="/match">
      Found a match!
    </xsl:template>

Would one still have to fully construct your tree ahead of time?

Hoping I'm not too far off base but half-expecting that I must be,

nathan kurz
nate@valleytel.net
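For the single-template case above, a streaming reader is enough to answer the question. This is a minimal sketch, not an XSL processor: it assumes the pattern "/match" means "the document element is named match", and it uses Python's standard xml.etree.ElementTree.iterparse, which the thread does not mention. The helper root_matches is hypothetical; it returns at the first start-tag event instead of waiting for a complete tree.

```python
import io
import xml.etree.ElementTree as ET


def root_matches(stream, name):
    """Report whether the document element read from `stream` is called `name`.

    Only "start" events are requested, and the function returns on the first
    one (the document element), so the rest of the tree is never built.
    """
    for _event, elem in ET.iterparse(stream, events=("start",)):
        return elem.tag == name
    return False


hit = root_matches(io.BytesIO(b"<match><a/><b/></match>"), "match")
miss = root_matches(io.BytesIO(b"<other><match/></other>"), "match")

if hit:
    print("Found a match!")
```

A stylesheet with patterns that look at descendants, siblings, or document order would not reduce to a check this simple, so the sketch says nothing about arbitrary XSL.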
From donpark at quake.net Thu Feb 4 11:03:02 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:08:33 2004
Subject: A weaker XSL?
Message-ID: <002401be502d$c0c1cf70$2ee044c6@arcot-main>

>> DOM requires the entire stream be read before
>> the document object is returned and processing can begin.

The DOM spec does NOT require the entire stream to be read before the document object is returned. Some of the DOM implementations available today do indeed process the entire stream before returning but that is a quality of implementation issue.

Best,

Don Park
Docuverse

From jjc at jclark.com Thu Feb 4 11:10:27 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:08:33 2004
Subject: Restricted Namespaces for XML
References: <003101be5016$ad13c850$2ee044c6@arcot-main>
Message-ID: <36B97B5E.9FF8B012@jclark.com>

Don Park wrote:

> If I read James Clark's message correctly, I believe he is in favor of
> chopping out some features out of the "Namespaces for XML"

I am in favour of chopping out some of the conceptual apparatus used to explain the features of "Namespaces for XML" not chopping out the features themselves.
> All the namespace declarations must be at the root element and prefixes are
> global to the document (well, within the root element).

Although restricting namespace declarations to the root element makes it a little easier to do namespace processing on input, it makes things much harder on output if you want to generate a document that combines multiple XML input documents. For example, consider an XLink filter that resolves transclusions. This restriction would require that it read all the input XML documents completely before it could produce any output, because it would need to move the declarations of the prefixes on the transcluded document roots up to the root of the generated output document. I also doubt it would be any easier to understand: you're making a special case of the root element rather than treating all elements uniformly.

I can't think of any feature in XML Namespaces that isn't really important for some significant application of XML Namespaces.

I recommend reading

    http://www.w3.org/TR/NOTE-webarch-extlang

for some background on the requirements that motivated the design of XML Namespaces.

James
From rbourret at ito.tu-darmstadt.de Thu Feb 4 11:12:01 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:08:33 2004
Subject: Namespaces
Message-ID: <01BE5036.8F0220D0@grappa.ito.tu-darmstadt.de>

Oren Ben-Kiki wrote:

> Is it possible to recast the namespace recommendation as a transformation
> from an XML tree with 'xmlns' attributes and '...:' prefixes into a tree
> which doesn't have them, but with modified element and attribute names, such
> that the semantics of the resulting tree under the rest of the relevant
> recommendations (ignoring namespaces) is preserved? One would of course have
> to pass the DTDs (or other schema files) through the same transformation.

This transformation is possible, but I don't think it buys you anything. If the transformation results in locally unique names, you break applications, since these can no longer recognize the locally unique names, which potentially change each time a document is processed. If it results in globally unique names, you could rewrite the applications to recognize these names, but you aren't really gaining anything over current namespace processing.

> If this isn't possible, and from what I'm reading this seems a real
> possibility, then I'd like to know the details - in particular, I want to
> hear the advantages gained by this decision. Is it some attempt to mix
> together namespaces and other issues - such as mixed content, combining
> documents, extending DTDs, etc.?
> MVHO is that such issues should be solved
> separately of the unique naming issue, but I realize that the W3C has a
> tradition of mixing together issues which seem separate to me :-)

Mixed content, combining documents, extending DTDs, etc. are not directly addressed by the namespaces spec, nor does the spec claim to solve them. They are closely tied because all of these need namespaces in order to be solved. This is generally where confusion in the namespaces spec comes from -- the hope that the namespaces spec directly addresses these issues and the confusion when the reader finally realizes that it "only" gives a syntax for a two-part naming system.

> > However, if I was forced to answer, I would say that the relationship
> > was not the same.

For those of you who have forgotten the original question (I had to look it up), it essentially asks: Is the relationship between "good" and "a" the same in <good a="1"/> and <good foo:a="1"/>?

> These two statements seem contradictory to me - if "all" the namespaces do
> is prefix the names with a URI, why should the relationship between expanded
> names be different than that between "normal" names? If this relationship is
> "application defined" for normal names (as James implies), then doesn't it
> remain "application defined" when the names are expanded? Anyway, how come
> it is "application dependent" - don't DTDs and schema language have a lot to
> say about it?

Expansion has nothing to do with it. The namespaces spec introduces the concept of a "global" attribute, which doesn't exist in XML 1.0. (For a discussion of global attributes, see Andrew Layman's summary at http://www.lists.ic.ac.uk/hypermail/xml-dev/9902/0027.html.)

You can tell which attributes are global and which are local by looking for prefixes: global attributes have a prefix and local attributes don't. Another characteristic is that global attributes are in a different (traditional) namespace than local attributes.
That is, global attribute names must be unique among all global attribute names in the document, while local attribute names must be unique among all local attributes in each element.

Because the (traditional) namespaces of local and global attributes do not overlap, you can have a global attribute that has the same name as a local attribute. That is, the following is legal:

    <good a="1" foo:a="1"/>

So, your question boils down to: "Is the relationship of good to a the same as the relationship of good to foo:a?" James rightly side-steps the question, because you are asking about the relationship between an element and two *different* attributes which *happen* to have the same name. This question has nothing to do with the namespaces spec (how the names of these attributes are expressed -- prefix form, Clark notation, etc. -- is irrelevant), but with the actual, application-defined semantics of the attributes. James then guesses that they are probably different and goes on to explain why.

Does your head hurt any less now?

-- Ron Bourret

From James.Anderson at mecomnet.de Thu Feb 4 11:15:28 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:33 2004
Subject: Fw: Namespaces
References: <A26F84C9D8EDD111A102006097C4CD0D05494F@SOHOS002> <36B9302C.390A6F70@jclark.com>
Message-ID: <36B981F0.D5546156@mecomnet.de>

The confusion issues from the spec itself. In order to eliminate the origin of the problem, either

(A)

1.
the definition in section 1 is edited to delete the text: "XML namespaces differ from the "namespaces" conventionally used in computing disciplines in that the XML version has internal structure and is not, mathematically speaking, a set. These issues are discussed in 'A. The Internal Structure of XML Namespaces'. "

2. the first paragraph in 5.2 is modified to say "unprefixed attributes are in no namespace", or "unprefixed attributes are in the null namespace", or something of that order instead of (to the effect) merely "unprefixed attributes are not in the default namespace"

3. the caption to the second example in 5.3 is modified to make an analogous positive assertion rather than merely "the default namespace does not apply to attribute names".

4. appendix A is eliminated.

or (B)

5. the passages noted in 2 and 3 above are edited to incorporate the "per element partition" terminology.

6. claims, that "in the sense the spec uses the word namespace, an unprefixed attribute is NOT IN ANY NAMESPACE", are abandoned.

It simply doesn't work to have the text referred to in items 1 through 4 above present in the same specification.

James Clark wrote:
>
> All this stuff about namespaces is just unnecessary, confusing
> complexity that invites the over-analysis that is so prevalent in this
> forum. If the spec was called something like "XML Universal Names" and
> never mentioned the word "namespace" and didn't include Appendix A
> (which thankfully is not normative), absolutely no functionality would
> have been lost and I think there would have been far less confusion.

We do agree on this last point. My very first notes about namespaces (I think it was likely almost a year ago) included a query as to why "per element partitions" for *names* were even necessary. We agree. They are not.

It appears to the uninitiated, however, that the authors had cause to make distinctions among the *names* of unqualified attributes themselves.
Which distinction the Appendix A text very clearly makes, and which the spec supports by reference. There have even been notes posted which led me to believe that this was intended to support certain XSL expressions. Which leads the uninitiated to believe there is cause to support this distinction in an implemented DOM. Should this not be the case, then the option (A) above should be pursued.

From david at megginson.com Thu Feb 4 11:56:46 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:33 2004
Subject: Fw: Namespaces
In-Reply-To: <007f01be5023$1a9d33a0$5402a8c0@oren.capella.co.il>
References: <007f01be5023$1a9d33a0$5402a8c0@oren.capella.co.il>
Message-ID: <14009.34222.566591.606284@localhost.localdomain>

Oren Ben-Kiki writes:

> Question:
>
> Is it possible to recast the namespace recommendation as a
> transformation from an XML tree with 'xmlns' attributes and '...:'
> prefixes into a tree which doesn't have them, but with modified
> element and attribute names, such that the semantics of the
> resulting tree under the rest of the relevant recommendations
> (ignoring namespaces) is preserved?

Yes -- this is exactly how most people are already working with namespaces. It's a well-proven technique for working with architectural forms in SGML (except that architectural forms allow 0-n while namespaces allow exactly 1).

> One would of course have to pass the DTDs (or other schema files)
> through the same transformation.
No -- the DTD disappears after the initial parse; it is used to validate the surface structure of the original document, but is not part of the transformation.

The point, though, is that this transformation occurs only in memory (or in database) -- if you write it back out as XML, you have to shove prefixes back on again (they don't have to be the same, since the prefix is just fluff).

> Note that this may require defining a textual form for the
> transformed tree (using "...^...", or "{..}..", or whatever).

That's one alternative; the other is to make names into the equivalent of

    public interface Name {
      public abstract String getURIPart ();
      public abstract String getLocalPart ();
    }

Personally, I'm partial to using a simple string with the space character, as in

    "http://www.megginson.com/ns/ foo"

Others have different preferences. Simple concatenation will not work, because

    {http://www.foo.com/foo}bar

and

    {http://www.foo.com/}foobar

would not be properly distinguished.

> If so, then we'll have a clear definition of just how to add a
> namespace processor on top of a normal XML processor. The reverse
> transformation could be used for emitting XML trees. The rest of
> the XML standards could pretty much ignore namespaces
> altogether. The endless threads of "what are namespaces this week"
> would go away. Yes, I know, I'm dreaming (or raving :-)

No, you're awake, and you're right -- it's *really* that easy. I imagine that Tim Bray is probably slapping his forehead right now yelling "DUH!", and that James Clark is probably doing whatever the polite English equivalent is.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
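The concatenation problem described above is easy to check directly. A minimal sketch in Python (not the Java of the interface quoted above), using a hypothetical ExpandedName pair type and hypothetical helper functions; the two names are the ones from the message. The space-separated form is assumed to be safe on the grounds that neither a URI nor an XML name may contain a space character.

```python
from typing import NamedTuple


class ExpandedName(NamedTuple):
    """A namespace-qualified name kept as a URI/local-part pair."""
    uri: str
    local: str


def concatenated(name):
    """Naive textual form: the URI immediately followed by the local part."""
    return name.uri + name.local


def space_separated(name):
    """Textual form with a single space between URI and local part."""
    return f"{name.uri} {name.local}"


# The two names from the message above.
first = ExpandedName("http://www.foo.com/foo", "bar")
second = ExpandedName("http://www.foo.com/", "foobar")

print(first == second)                                    # False: distinct pairs
print(concatenated(first) == concatenated(second))        # True: they collide
print(space_separated(first) == space_separated(second))  # False: still distinct
```

Keeping the pair as a structured value avoids choosing a separator at all; a textual form only matters when the name has to be stored or compared as a single string.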
From jjc at jclark.com Thu Feb 4 12:21:28 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:08:33 2004
Subject: Fw: Namespaces
References: <007f01be5023$1a9d33a0$5402a8c0@oren.capella.co.il>
Message-ID: <36B987C3.D79BEFF1@jclark.com>

Oren Ben-Kiki wrote:

> Is it possible to recast the namespace recommendation as a transformation
> from an XML tree with 'xmlns' attributes and '...:' prefixes into a tree
> which doesn't have them, but with modified element and attribute names,

Yes. The transformation changes some element type names and attribute names from strings to structured objects that contain a URI and a string. There are some subtleties (an XML editor would probably want to be able to preserve prefixes; some applications need to know what namespace prefix bindings are in effect on a particular element) but these can be handled within this approach.

> such
> that the semantics of the resulting tree under the rest of the relevant
> recommendations (ignoring namespaces) is preserved?

I don't understand what you mean by "preserving semantics".

> One would of course have
> to pass the DTDs (or other schema files) through the same transformation.

Why? DTDs are used only at the pre-transformation stage. I would expect a future XML Schema language to operate purely on the post-transformation tree.

> if "all" the namespaces do
> is prefix the names with a URI,

Qualify not prefix. A URI-qualified name is not a string but a URI/string pair.

> why should the relationship between expanded
> names be different than that between "normal" names?
Because it's URI-qualified and therefore capable of independent interpretation. A URI-qualified name is a different kind of object from a "normal" name.

> Anyway, how come
> it is "application dependent"

I said it was dependent on other applications *or other specifications*.

> don't DTDs and schema language have a lot to
> say about it?

DTDs don't have anything to say about it because they don't know about namespaces. I would expect a namespace-aware Schema language to have a lot to say about it.

James

From rbourret at ito.tu-darmstadt.de Thu Feb 4 12:21:35 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:08:33 2004
Subject: Restricted Namespaces for XML
Message-ID: <01BE5040.5CB976A0@grappa.ito.tu-darmstadt.de>

Don Park wrote:

> Such a spec might dictate that all namespace declarations be at the root
> element (XML fragments are problematic but...). This restriction has the
> side effect of not allowing duplicate prefixes.

The major benefit of this proposal is that it reduces the number of checks for xmlns attributes. This savings is minimal in small documents or documents with few attributes, but it would be interesting to know how much xmlns attribute processing costs in a large, attribute-intensive document.

I think readability is a wash, as you can't do anything more with this than you can with the current proposal and you lose the ability to have multiple default namespaces, which are useful in documents that have long sections alternating between two or more namespaces.
I think the biggest problem is, as James Clark noted elsewhere, complication of fragmentation. Since I believe fragments to be a big part of the future, I don't like anything that will make them harder.

That said, if anybody had some real numbers about what xmlns attribute processing costs in the worst case and this turned out to be significant, it might be useful to have a PI that tells the namespace processor whether it needs to look beyond the root element.

-- Ron Bourret

From donpark at quake.net Thu Feb 4 13:10:32 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:08:33 2004
Subject: Restricted Namespaces for XML
Message-ID: <003d01be503f$5448ee20$2ee044c6@arcot-main>

>I am in favour of chopping out some of the conceptual apparatus used to
>explain the features of "Namespaces for XML" not chopping out the
>features themselves.

I apologize for misrepresenting your position.

>Although restricting namespace declarations to the root element makes it
>a little easier to do namespace processing on input, it makes things
>much harder on output if you want to generate a document that combines
>multiple XML input documents. For example, consider an XLink filter
>that resolves transclusions. This restriction would require that it
>read all the input XML documents completely before it could produce any
>output, because it would need to move the declarations of the prefixes
>on the transcluded document roots up to the root of the generated output
>document.
>I also doubt it would be any easier to understand: you're
>making a special case of the root element rather than treating all
>elements uniformly.

You are right, of course, if usage of well-formed external entity is not allowed.

I was not proposing that RNX replace the Namespace spec. I was trying to see whether there was a need for a strict subset of the Namespace spec which cannot handle some of the problems but is functional enough for most problems. RNX software is easier to write because there is not much to get confused about.

>I can't think of any feature in XML Namespaces that isn't really
>important for some significant application of XML Namespaces.
>
>I recommend reading
>
> http://www.w3.org/TR/NOTE-webarch-extlang
>
>for some background on the requirements that motivated the design of XML
>Namespaces.

Thanks for the link.

Thanks,

Don Park
Docuverse

From shecter at darmstadt.gmd.de Thu Feb 4 13:27:31 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:08:33 2004
Subject: Component Markup Language
Message-ID: <4EB3E7E7.1B5AD737@darmstadt.gmd.de>

Hi,

Has anyone thought about or worked on a markup language to describe a User Interface in a platform independent way?

I'd think that this language would describe: instantiating components, adding them to containers, and configuring interactions between them.

Thanks,

- Robb
From Matthew.Sergeant at eml.ericsson.se Thu Feb 4 13:36:33 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:33 2004
Subject: Component Markup Language
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AF0@eukbant101.ericsson.se>

> -----Original Message-----
> From: Robb Shecter [SMTP:shecter@darmstadt.gmd.de]
>
> Hi,
>
> Has anyone thought about or worked on a markup language to describe a
> User Interface in a platform independent way?
>
> I'd think that this language would describe: instantiating
> components, adding them to containers, and configuring interactions
> between them.
>

Two projects I know of: XUL from mozilla.org - lots of details on their web site, and XGTK - an XML interface definition for GTK that uses Perl to create applications. The latter is in alpha stage and only available via CVS from gnome.org, but looks like an interesting project for Perl developers. XUL looks like it will be very powerful too, and allows the control of events in JavaScript.

Matt.

From prb at uic.edu Thu Feb 4 13:43:50 1999
From: prb at uic.edu (Paul R. Brown)
Date: Mon Jun 7 17:08:33 2004
Subject: Component Markup Language
Message-ID: <010101be5043$b5c24030$9209f880@razzmatazz.math.uic.edu>

>Has anyone thought about or worked on a markup language to describe a
>User Interface in a platform independent way?

It's not a markup language (or a subset of SGML), but Python does a reasonably good job of providing platform-independent UI.

- Paul

From AOgan at us.lhsgroup.com Thu Feb 4 13:50:58 1999
From: AOgan at us.lhsgroup.com (Ogan, Arif)
Date: Mon Jun 7 17:08:33 2004
Subject: Component Markup Language
Message-ID: <6030DEFB8E88D211903300805F57D1463F8B60@excatl01.us.lhsgroup.com>

Check out http://www.bluestone.com/xml/XwingML/

Arif

----------
From: Robb Shecter [SMTP:shecter@darmstadt.gmd.de]
Sent: Friday, November 04, 2011 8:26 AM
To: xml-dev@ic.ac.uk
Subject: Component Markup Language

Hi,

Has anyone thought about or worked on a markup language to describe a User Interface in a platform independent way?

I'd think that this language would describe: instantiating components, adding them to containers, and configuring interactions between them.

Thanks,
- Robb
From dave at userland.com Thu Feb 4 14:08:51 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun 7 17:08:33 2004
Subject: Component Markup Language
In-Reply-To: <6030DEFB8E88D211903300805F57D1463F8B60@excatl01.us.lhsgroup.com>
Message-ID: <3.0.6.32.19990204061222.00f1de60@scripting.com>

http://www.bluestone.com/xml/XwingML/

Yes, that is very interesting. But if anyone from Bluestone is listening, it would be great to have a page that shows XML code and a screen shot of the interface it generates.

Four or five such examples, visible to someone who doesn't use their tool, would be very instructive and would help their cause immeasurably.

Dave

From Michael.Kay at icl.com Thu Feb 4 14:14:22 1999
From: Michael.Kay at icl.com (Michael.Kay@icl.com)
Date: Mon Jun 7 17:08:33 2004
Subject: A weaker XSL?
Message-ID: <93CB64052F94D211BC5D0010A80013310EB2D8@WWMESS3>

> Is there a possibility for creating a sub-set of
> XSL that would work on a stream instead of
> requiring a complete document object?

Funny you should ask that, I've been experimenting over the last few days to see whether I could build such a thing on top of SAXON.
It's not actually easy to do as a pure subset, but I think one can do something that feels quite XSL-like.

Mike Kay

From bckman at ix.netcom.com Thu Feb 4 14:14:32 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:08:33 2004
Subject: Component Markup Language
Message-ID: <009001be5048$3508f740$a7addccf@ix.netcom.com>

The W3C HTML WG is currently working on converting HTML from an SGML application to an XML based application, and will be redesigning the DTDs so that they can be modularized. These modules can then be combined into various profiles which can include user-defined modules.

Frank

Frank Boumphrey
XML and style sheet info at Http://www.hypermedic.com/style/index.htm
Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
CoAuthor: XML applications from Wrox Press, www.wrox.com
Author: Using XML on the Web (March)

----- Original Message -----
From: Robb Shecter <shecter@darmstadt.gmd.de>
To: <xml-dev@ic.ac.uk>
Sent: Friday, November 04, 2011 8:25 AM
Subject: Component Markup Language

>Hi,
>
>Has anyone thought about or worked on a markup language to describe a
>User Interface in a platform independent way?
>
>I'd think that this language would describe: instantiating
>components, adding them to containers, and configuring interactions
>between them.
>
>Thanks,
>- Robb
From simonstl at simonstl.com Thu Feb 4 14:15:07 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:34 2004
Subject: COBOL XML parser?
Message-ID: <199902041412.JAA10888@hesketh.net>

Every month or so I get a ping from a reader asking about parsers in various languages outside the holy quadrilateral of Java, C/C++, Perl, and Python. So far I've had three COBOLs and a FORTRAN. Does anyone know of parsers (or parser-like projects) in these languages? I know there's a Delphi parser out there, so my old friend Pascal is in the loop, as well as JavaScript and I think even VB, but how about:

* COBOL
* FORTRAN
* Ada
* Lisp
* Smalltalk

I know Unicode and other issues may cause some big problems for the construction of a 'true' parser in most of these, but I'd love to hear it if anyone's tried.

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com
From pierlou at CAM.ORG Thu Feb 4 14:30:33 1999
From: pierlou at CAM.ORG (Pierre Morel)
Date: Mon Jun 7 17:08:34 2004
Subject: Component Markup Language
Message-ID: <003c01be504a$702864d0$02dcdcdc@pierre>

Hi,

I have been working on a project like that for a while; take a look at: http://www.pierlou.com/prototype

Pierre Morel

----- Original Message -----
From: Robb Shecter <shecter@darmstadt.gmd.de>
To: <xml-dev@ic.ac.uk>
Sent: November 4, 2011 08:25
Subject: Component Markup Language

>Hi,
>
>Has anyone thought about or worked on a markup language to describe a
>User Interface in a platform independent way?
>
>I'd think that this language would describe: instantiating
>components, adding them to containers, and configuring interactions
>between them.
>
>Thanks,
>- Robb

From Mark.Birbeck at iedigital.net Thu Feb 4 14:47:38 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:34 2004
Subject: Component Markup Language
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054959@SOHOS002>

> >Has anyone thought about or worked on a markup language to describe a
> >User Interface in a platform independent way?
I had this idea for one called HTML but abandoned it because I thought it probably wouldn't catch on.

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From shecter at darmstadt.gmd.de Thu Feb 4 14:49:04 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:08:34 2004
Subject: COBOL XML parser?
References: <199902041412.JAA10888@hesketh.net>
Message-ID: <4EB3FABA.80F5A127@darmstadt.gmd.de>

"Simon St.Laurent" wrote:
>
> * Smalltalk
>

I've been looking for Smalltalk parsers for a while, and the only one I've ever seen a pointer to is at: http://www.indelv.com/

Some caveats: Not validating. Nowhere near as well documented as something like Java ProjectX or IBM's Parser. The company itself seems to advise using their Java parser instead of the Smalltalk version. Only works in two Smalltalk vendors' platforms: VW and IBM.

This is too bad - I'd think that doing something like manipulating a DOM could be much more pleasant in Smalltalk than in Java. I still haven't seen any pointers for anything like an XSL parser for Smalltalk.

- Robb
From oren at capella.co.il Thu Feb 4 15:13:18 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:08:34 2004
Subject: Fw: Namespaces
Message-ID: <002701be5050$1b09deb0$5402a8c0@oren.capella.co.il>

I asked:

>> Is it possible to recast the namespace recommendation as a transformation
>> from an XML tree with 'xmlns' attributes and '...:' prefixes into a tree
>> which doesn't have them, but with modified element and attribute names, such
>> that the semantics of the resulting tree under the rest of the relevant
>> recommendations (ignoring namespaces) is preserved? One would of course have
>> to pass the DTDs (or other schema files) through the same transformation.

And got - self-contradictory responses! For example:

Ronald Bourret <rbourret@ito.tu-darmstadt.de> wrote:

> This transformation is possible, but I don't think it buys you anything.

Except compatibility with current APIs - the ability to add a namespace processor on top of SAX, say, _without changing the interface_. But let this slide...

>For those of you who have forgotten the original question (I had to look it
>up), it essentially asks:
>
>Is the relationship between "good" and "a" the same in <good a="1"/> and
><good foo:a="1"/>?
>
>Expansion has nothing to do with it.

Which is contradicted by the very next sentence:

>The namespaces spec introduces the
>concept of a "global" attribute, which doesn't exist in XML 1.0.

AHA! So the namespaces proposal _DOES_ go beyond simply qualifying names with URIs!
Which means I _CAN NOT_ transform an XML document with namespaces into a document without namespaces (but with extended names) and achieve the same semantics. Let me clarify what I mean by that:

Suppose I have an XML processing system, with some XML input files (including DTDs, schema definition files, input files, stylesheet files, whatever). Let us assume that they (or some of them) use namespaces as per the current draft. I run the system and get certain effects, including possibly some output files which again may use namespaces.

Now, I transform all the input files to stop using namespaces. Instead I transform them to something like James' extended names format. I hack the XML system to _not_ do namespace processing, if necessary, and re-run it on the input files. I'd expect the effects to be identical and the XML output files to be the same as the original run's output files, after compatible name extension.

If this were to hold, namespaces would have been simple, and truly orthogonal to all the other standards. "This turns out not to be the case", but I can't get anyone to admit it - so I can't get a straight answer to _why_ this isn't the case.

Why is this impossible? Because "prefixed attributes are global". The XML system handles them differently based on the fact they are _prefixed by a namespace_. In the second run _no_ attribute is prefixed. Note that the _uniqueness_ of the names wasn't hurt; "foo:a", "a" and "bar:a" are still different. It is just that none of them are "namespaced".

>(For a
>discussion of global attributes, see Andrew Layman's summary at
>http://www.lists.ic.ac.uk/hypermail/xml-dev/9902/0027.html.)

I re-checked this and as far as I understand Andrew makes a pretty good case that there is no such thing as a global attribute - at least, in DTDs there isn't. Now, whether global attributes are necessary, or all attributes should be global, or whatever, is something I'd be happy to see discussed.
But why does this have to be dragged into the namespace issue? Why isn't it defined so it would survive the transformation I described above? Move this "global attribute" business to the DTD or Schema WG where it belongs!

>You can tell which attributes are global and which are local by looking for
>prefixes: global attributes have a prefix and local attributes don't.

That's one answer - a way to know which is which without looking at a DTD. IMVHO, this isn't worth the price. Look at the amount of misunderstanding this caused! Not to mention the implementation issues - without being able to do the transformation I described above, all the XML APIs have to be reworked, and so on... No namespace-unaware application will be able to survive the transfer to the namespace-aware world. Surely there's a better reason for such a profound decision?

>So, your question boils down to: "Is the relationship of good to a the same
>as the relationship of good to foo:a?" James rightly side-steps the
>question, because you are asking about the relationship between an element
>and two *different* attributes which *happen* to have the same name.

No. First, "a" and "foo:a" are _not_ the same name. Second, it would have been the same question if I used "a" and "foo:b". I now understand that the latter is "global" and therefore these relationships are _not_ identical. I'm still reeling from the shock.

> This question has nothing to do with the namespaces spec (how the names of these
>attributes are expressed -- prefix form, Clark notation, etc. -- is
>irrelevant)

<censored/> How can you say that this has nothing to do with the namespaces spec when it is (i) introduced in this very spec and (ii) relies on the attribute having a namespace prefix in order to declare it as global?

>Does your head hurt any less now?

Just about to bust at the seams, thanks :-)

Even James, who normally admirably clears things up in one or two well-designed sentences, left me confused.
To my original question (is it possible to do a transformation...) he answered:

> Yes. The transformation changes some element type names and attribute
> names from strings to structured objects that contain a URI and a
> string.

Which is _not_ what I had in mind - I was specifically referring to a _textual representation valid under the non-namespaced XML-1.0 spec_. But since one can convert qualified names to text - he has given a sample in his paper, after all - let's continue...

>> such
>> that the semantics of the resulting tree under the rest of the relevant
>> recommendations (ignoring namespaces) is preserved?

>I don't understand what you mean by "preserving semantics".

In the sense of equivalence I defined above. That is, namespace-semantics(XML) == non-namespace-semantics(extended-names(XML)).

>> One would of course have
>> to pass the DTDs (or other schema files) through the same transformation.

> Why? DTDs are used only at the pre-transformation stage.

WHAT? How can DTDs be pre-transformation? If a DTD refers to 'foo:a', where 'xmlns:foo="my:uri"', and my document contains 'bar:a', where 'xmlns:bar="my:uri"', it won't match since 'foo' != 'bar'? Surely I misunderstand?

>I would expect
>a future XML Schema language to operate purely on the
>post-transformation tree.

Well, that helps a bit, though I don't see why existing languages can't share the privilege...

>> if "all" the namespaces do
>> is prefix the names with a URI,

>Qualify not prefix. A URI-qualified name is not a string but a
>URI/string pair.

Fine. I'm still trying to see how one can say that "all" namespaces do is "qualify names" - which with appropriate textual representation can become "prefix the names" - and still say that they introduce global attributes:

>> why should the relationship between expanded
>> names be different than that between "normal" names?

>Because it's URI-qualified and therefore capable of independent
>interpretation. A URI-qualified name is a different kind of object from
>a "normal" name.

Hardly a sufficient answer. What is the _benefit_ of distinguishing between the two? As opposed to declaring global attributes in a DTD, or whatever?

>> Anyway, how come
>> it is "application dependent"

>I said it was dependent on other applications *or other specifications*.

Whatever. I still don't see what applications have to do with it. _From the XML standard point of view_, there's a certain relationship between attributes and elements. This relationship might well be "the application may do anything it wants". With regard to URI-prefixed attributes, the namespaces spec makes it clear that this relationship is _not_ the same as that for unprefixed attributes. Now how much sense is in that, if both relationships are "the application can do whatever it wants"?

>> don't DTDs and schema language have a lot to
>> say about it?

>DTDs don't have anything to say about it because they don't know about
>namespaces. I would expect a namespace-aware Schema language to have a
>lot to say about it.

Why should the DTD spec be "namespace aware" if the transformation I asked about exists? Shouldn't I simply transform both the DTD and the document, then check validity? For that matter, why should _any_ XML standard be "namespace aware" - couldn't the same apply to them all? I can see how namespace patterns might be useful in XSL, but that's a rather exceptional case.

Why wasn't the DTD spec extended to handle global attributes (regardless of namespace issues) instead of sneaking them in the namespace spec? Global attributes make the same amount of sense for non-prefixed attributes within a single XML language.

This is the root of the problems here - _what is the benefit gained from placing this in the namespace spec_? Is there one? Is it worth it? Can we revoke it?

Again, I could be very wrong here - when it seems to me everyone else is contradicting himself, it is high time for a reality check. Help?
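The name-expansion step of the transformation asked about above can be sketched in a few lines of Python. This is only an illustration and was not posted to the list: it assumes Python's standard xml.etree.ElementTree module, which reports qualified names as '{uri}local' strings, and the helper name expanded_names and the two sample documents are made up for the example. It shows that two documents using different prefixes bound to the same URI come out with identical expanded names, while an unprefixed attribute stays unqualified.

```python
import xml.etree.ElementTree as ET


def expanded_names(xml_text):
    """Return the root element's name and its sorted attribute names.

    ElementTree resolves prefixes while parsing, so qualified names come
    back as '{uri}local' strings and the xmlns declarations are consumed.
    (expanded_names is a hypothetical helper for this sketch.)
    """
    root = ET.fromstring(xml_text)
    return root.tag, sorted(root.attrib)


# Same URI, different prefixes (sample documents invented for illustration).
doc_foo = '<foo:good xmlns:foo="my:uri" foo:a="1" a="2"/>'
doc_bar = '<bar:good xmlns:bar="my:uri" bar:a="1" a="2"/>'

print(expanded_names(doc_foo))
print(expanded_names(doc_foo) == expanded_names(doc_bar))
```

Whether the rest of a processing chain behaves identically on the expanded names is exactly the open question in this thread; the sketch covers only the expansion itself.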
Oren Ben-Kiki

From simonstl at simonstl.com Thu Feb 4 15:15:07 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:34 2004
Subject: Schema Processing (was Re: Fw: Namespaces)
In-Reply-To: <36B987C3.D79BEFF1@jclark.com>
References: <007f01be5023$1a9d33a0$5402a8c0@oren.capella.co.il>
Message-ID: <199902041514.KAA11880@hesketh.net>

At 06:42 PM 2/4/99 +0700, James Clark wrote:
>> One would of course have
>> to pass the DTDs (or other schema files) through the same transformation.
>
>Why? DTDs are used only at the pre-transformation stage. I would expect
>a future XML Schema language to operate purely on the
>post-transformation tree.

This opens up a new can of worms, one we discussed during the creation of XSchema (now DDML), but potentially an ugly one long term. This paragraph suggests a process like:

1. Process document against DTD.
2. Resolve namespaces.
3. Process document against schema.

Which is fine, in some ways - I'd prefer to see schemas just define structures, not content substitution (i.e., entities) - but opens up potentially potent new layers of complexity that make the current mire of well-formed and valid documents seem quite friendly.

Is the W3C ever going to take a look at making its specs into neat layers instead of octopuses that sprawl across multiple levels of processing? Namespaces wraps its tentacles around several parts of XML 1.0, and duplication between schemas and DTDs may leave another generation of would-be XML-ers scratching their heads at the odd ways of those sophisticated folks writing the specs.
Linking and styling present similar delightful conundrums, and there's a fairly large group of voices (mine included) on the XSL list calling for that spec to be cleanly split into two pieces: transformation and formatting. I'd rather have more small clean pieces that fit together neatly than large chunks that need to wrap around each other to work properly.

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From shecter at darmstadt.gmd.de Thu Feb 4 15:30:52 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:08:34 2004
Subject: Component Markup Language
References: <3.0.6.32.19990204061222.00f1de60@scripting.com>
Message-ID: <4EB404D1.E327D32F@darmstadt.gmd.de>

Dave Winer wrote:

> http://www.bluestone.com/xml/XwingML/
>
> Yes, that is very interesting. But if anyone from Bluestone is listening,
> it would be great to have a page that shows XML code and a screen shot of
> the interface it generates.

Well, I just downloaded it and checked it out, and I don't think this could really be used as a way to describe platform-independent UI's. (Can I attach a copy of an XML here w/out violating some copyright? Hmm...maybe a piece of it... :)

    <JFrame name="MainFrame" title="Bluestone XML Notepad" image="icon.gif" x="20%" y="20%"
            width="60%" height="60%">
        <JMenuBar>
            <JMenu text="File" mnemonic="F">
                <JMenuItem icon="open.gif" text="Open..." mnemonic="O"
                           accelerator="VK_O,CTRL_MASK" actionListener="OpenFile"/>

etc...

This looks basically like Java code with an XML syntax.
It doesn't abstract the nature of components or interactions. I guess what it is, is one way to get a scripting functionality in Java; a competitor to projects like the BeanShell.

- Robb

From shecter at darmstadt.gmd.de Thu Feb 4 15:39:42 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:08:34 2004
Subject: Component Markup Language
References: <010101be5043$b5c24030$9209f880@razzmatazz.math.uic.edu>
Message-ID: <4EB406E9.D2D8D084@darmstadt.gmd.de>

"Paul R. Brown" wrote:

> >Has anyone thought about or worked on a markup language to describe a
> >User Interface in a platform independent way?
>
> It's not a markup language (or a subset of SGML), but python does a
> reasonably good job of providing platform-independent UI.
>

Hi,

Yes, that's definitely something else that I'm checking out: Several languages have abstracted UI widgets and interactions in platform independent ways: Java, Smalltalk, Python and Tcl/Tk come to mind.

I can imagine using one of these either directly, or more likely, basing an XML language on a model that one of these languages has developed. For example, a piece of XML that uses the Java-style interaction model could look like:

    <UI>
        <BasicWidget id="1" x="100" y="100" height="50" width="50" type="TextEntryField">
            <KeyPressListener name="2"/>
            <KeyPressListener name="3"/>
            <KeyPressListener name="4"/>
        </BasicWidget>

etc...

- Robb
From david at megginson.com Thu Feb 4 15:47:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:34 2004
Subject: Namespaces does *not* formally introduce "global attributes"
In-Reply-To: <002701be5050$1b09deb0$5402a8c0@oren.capella.co.il>
References: <002701be5050$1b09deb0$5402a8c0@oren.capella.co.il>
Message-ID: <14009.48512.126069.745770@localhost.localdomain>

Oren Ben-Kiki writes:

> >The namespaces spec introduces the concept of a "global"
> >attribute, which doesn't exist in XML 1.0.
>
> AHA! So the namespaces proposal _DOES_ go beyond simply qualifying
> names with URIs!

No it doesn't. The comment to which Oren is replying contains a common (but understandable) mistake.

The normative part of the Namespaces spec does *not* mention global attributes at all, so formally, they are not part of the namespaces specification. Non-normative appendix A.1 mentions "global attributes" descriptively, as a commonly-observed design pattern in SGML/XML documents. It's up to the Schema WG (and other schema standard creators) to decide whether there is such a thing as formal global attributes and if so, how they should work.

The source of the misreading is the unfortunately obfuscatory appendix A.2, which in general has caused an immense amount of unnecessary confusion among implementors -- fortunately, A.2 is also non-normative, so you're free to ignore it (I always have, and I highly recommend doing so). I will recommend dropping A.2 in future drafts.

In any case, what does all this have to do with transformation?
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From peter at weblogic.com Thu Feb 4 16:08:50 1999
From: peter at weblogic.com (Peter Seibel)
Date: Mon Jun 7 17:08:34 2004
Subject: A weaker XSL?
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB2D8@WWMESS3>
Message-ID: <19990204161626993.AAA218@ashbury.weblogic.com@lawton>

At 05:17 AM 2/4/99, Michael.Kay@icl.com wrote:
>> Is there a possibility for creating a sub-set of
>> XSL that would work on a stream instead of
>> requiring a complete document object?
>
>Funny you should ask that, I've been experimenting over the last few days to
>see whether I could build such a thing on top of SAXON. Not actually easy to
>do as a pure subset, but I think one can do something that feels quite
>XSL-like.

So I obviously lack imagination or understanding of all the intricacies of XSL -- what can you express in XSL that you couldn't implement on top of SAX?

-Peter

--
Peter Seibel
Perl/Java/English Hacker
peter@weblogic.com

Is Windows98 Y2K compliant?
From nwoh at software-ag.de Thu Feb 4 16:27:57 1999
From: nwoh at software-ag.de (Nigel Hutchison)
Date: Mon Jun 7 17:08:34 2004
Subject: Component Markup Language
In-Reply-To: <4EB404D1.E327D32F@darmstadt.gmd.de>
References: <3.0.6.32.19990204061222.00f1de60@scripting.com>
Message-ID: <3.0.6.32.19990204172738.00bfbec0@daemsg01>

At 04:29 PM 11/4/11 +0100, Robb Shecter wrote:
>Dave Winer wrote:
>
>> http://www.bluestone.com/xml/XwingML/
>>
>> Yes, that is very interesting. But if anyone from Bluestone is listening,
>> it would be great to have a page that shows XML code and a screen shot of
>> the interface it generates.
>
>Well, I just downloaded it and checked it out, and I don't think this could really be used as
>a way to describe platform-independent UI's. (Can I attach a copy of an XML here w/out
>violating some copyright? Hmm...maybe a piece of it... :)
>
>    <JFrame name="MainFrame" title="Bluestone XML Notepad" image="icon.gif" x="20%" y="20%"
>width="60%" height="60%">
>        <JMenuBar>
>            <JMenu text="File" mnemonic="F">
>                <JMenuItem icon="open.gif" text="Open..." mnemonic="O"
>accelerator="VK_O,CTRL_MASK" actionListener="OpenFile"/>
>
>etc...
>
>This looks basically like Java code with an XML syntax. It doesn't abstract the nature of
>components, or interactions. I guess what it is, is one way to get a scripting functionality
>in Java; a competitor to projects like the BeanShell.

If there were a set of C++ classes that interpreted it and built up a GUI with the same look and feel, that might be useful.
Then a server could project its own GUI API and leave it to the client side
to choose C++, Java etc. to build the interface at run time. I suspect that
in most cases the bandwidth required to shift the XML GUI description would
be quite a bit less than that required for the compiled Java classes. If this
is so, then "Even Cooler" would be an XML GUI interpreter built into an
Internet browser.

But am I running too far with this one?

Nigel Hutchison

Nigel W. O. Hutchison
Technical Consultant
Software AG Germany
mailto:nwoh@software-ag.de
Tel +49 (0)6151 92 1207

From Matthew.Sergeant at eml.ericsson.se Thu Feb 4 16:39:14 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:34 2004
Subject: Component Markup Language
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AF3@eukbant101.ericsson.se>

> -----Original Message-----
> From: Nigel Hutchison [SMTP:nwoh@software-ag.de]
>
[snip]
> Then a server could project its own GUI api and leave it to the client
> side
> to choose C++, Java etc to build the interface at run time. I suspect
> that
> in most cases the bandwidth required to shift the XML GUI description
> would
> be quite a bit less than the
> compiled java classes. If this is so then "Even Cooler" would be XML GUI
> interpreter built into a Internet Browser.
>
> But am I running too far with this one?
>
Not at all - that's exactly the aim of XUL, part of the Mozilla project. I
suspect that once Mozilla is stable and released we are going to see some
really awesome GUIs built using XUL.
As a proof of concept, the UI for Mozilla is now built using XUL.

Matt.

From CBenedet at Bluestone.com Thu Feb 4 16:45:02 1999
From: CBenedet at Bluestone.com (Benedetto, Christopher)
Date: Mon Jun 7 17:08:34 2004
Subject: Component Markup Language
Message-ID: <9A4DF69E3C5ED211B86400A0C9D1776084720D@thor.operations.bluestone.com>

Dave -

Thanks for the feedback on XwingML. We are in the process of posting screen
grabs to the XwingML website located at
(http://www.bluestone.com/xml/XwingML/); these should be available tomorrow.
We are also working on more demos that we will post to the XwingML talk list
(see below).

If you want to see and share more code examples than are included with the
download, please register to participate on our XwingML listserv by sending a
message to listserv@bluestone.com with the following message in the body of
the email: "subscribe xwingml-talk". Since XwingML is open-source we would
encourage you to make appropriate changes and additions - and submit them
back to the xwingml-talk list.

In addition to XwingML, we have announced Bluestone's XML-Server, the first
generally available Dynamic XML Server, and Bluestone Visual-XML, a
developer's toolkit (Beta available March, 1999) to help companies build
XML-based applications. Information about these (commercially available)
tools can be found at (http://www.bluestone.com/xml).

============================================
Christopher Benedetto
Product Manager
Phone: (609) 727-4600 ext. 3024
Fax: (609) 727-5077
Bluestone Software, Inc.
1000 Briggs Road
Mt. Laurel, NJ 08054
mailto:cbenedet@bluestone.com
http://www.bluestone.com
============================================

-----Original Message-----
From: Dave Winer [mailto:dave@userland.com]
Sent: Thursday, February 04, 1999 9:12 AM
To: xml-dev@ic.ac.uk
Subject: RE: Component Markup Language

http://www.bluestone.com/xml/XwingML/

Yes, that is very interesting. But if anyone from Bluestone is listening,
it would be great to have a page that shows XML code and a screen shot of
the interface it generates.

Four or five such examples, visible to someone who doesn't use their tool,
would be very instructive and would help their cause immeasurably.

Dave

From b.laforge at jxml.com Thu Feb 4 16:46:23 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
Message-ID: <000f01be505d$1c834c60$c9a8a8c0@thing2>

This all makes me wish I were working on Coins instead of MDSAX.

I really think what we want is a meta language (in XML, of course!) which
governs the mapping between an application document and the GUI components.

This is what the spec for coins 4 was all about.

Bill

From rbourret at ito.tu-darmstadt.de Thu Feb 4 16:53:26 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun 7 17:08:35 2004
Subject: Namespaces does *not* formally introduce "global attributes"
Message-ID: <01BE5066.010F2A90@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> > AHA! So the namespaces proposal _DOES_ go beyond simply qualifying
> > names with URIs!
>
> No it doesn't.

Mea culpa. David is right. Global attributes are no longer a normative part
of the spec.

The namespaces spec says the following about non-xmlns attributes:

a) They can be prefixed or unprefixed

b) The default namespace does not apply to unprefixed attributes

c) Applications should use the namespace name, not the prefix, in
constructing qualified names

d) No tag can contain two attributes whose names are identical or whose
names resolve to the same qualified name; this clarifies what is meant by
"name" in the Unique Att Spec validity constraint in XML

It does not tell us:

e) What the difference is between prefixed and unprefixed attribute names
(other than the existence of the prefix) -- that is, the semantic difference
(if any) between prefixed and unprefixed attributes

f) How to process prefixed or unprefixed attributes, except as noted in (c)

Two further comments:

1) Tim Bray's statement that unprefixed attributes do not belong to an XML
namespace derives from (b). Since there is no prefix to associate them with
an XML namespace, and the default XML namespace doesn't apply, there is
simply no association.
However, as A.2 points out, many applications are likely to use traditional,
per-element namespaces for unprefixed attributes. Note that these are a
consequence of the Unique Att Spec validity constraint in the XML spec, not
anything in the namespaces spec.

2) As far as I can tell, there is nothing in the normative part of the spec
that would lead us to conclude the existence of a global attribute partition
that is separate from the per-element-type partitions, as is described in
A.2. That is, the namespaces spec gives us no reason to believe that ns:a
attributes in the following elements are in any way related, any more than
the b attributes are related:

   <foo ns:a="1" b="1"/>
   <bar ns:a="1" b="1"/>

Is this correct?

-- Ron Bourret

From shecter at darmstadt.gmd.de Thu Feb 4 16:56:09 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
References: <3.0.6.32.19990204061222.00f1de60@scripting.com> <3.0.6.32.19990204172738.00bfbec0@daemsg01>
Message-ID: <4EB4181A.99BB7C0@darmstadt.gmd.de>

Nigel Hutchison wrote:

> Then a server could project its own GUI api and leave it to the client side
> to choose C++, Java etc to build the interface at run time.

Yes.

> I suspect that
> in most cases the bandwidth required to shift the XML GUI description would
> be quite a bit less than the
> compiled java classes. If this is so than "Even Cooler" would be XML GUI
> interpreter built into a Internet Browser.

How about this: Have one XSL document per client side scripting language.
That is, different XSL documents could implement:

   Component-XML->DHTML (This particular one would give you your "Even Cooler" idea.)
   Component-XML->Java BeanShell
   Component-XML->Smalltalk
   etc...

...then, a client side browser applies whatever XSL template is best for it,
gets the resulting kind of script it understands, and on the fly generates
a UI.

- Robb

From Mark.Birbeck at iedigital.net Thu Feb 4 16:57:13 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D05495C@SOHOS002>

Nigel Hutchison wrote:
> If there was a set of C++ classes that interpreted it and
> built up a GUI
> with the same look and feel, that might be useful.
> Then a server could project its own GUI api and leave it to
> the client side
> to choose C++, Java etc to build the interface at run time.
> I suspect that
> in most cases the bandwidth required to shift the XML GUI
> description would
> be quite a bit less than the
> compiled java classes. If this is so than "Even Cooler" would
> be XML GUI
> interpreter built into a Internet Browser.
>
> But am I running too far with this one?

Great idea. Can I suggest we call it HTML?

(OK I know I've cracked that one already - it's the end of the week. But
really, if you want an XML specification for a user interface, surely
HTML 4.0 is the one to choose. Then you could use a 'cool browser' that
has an "XML GUI Interpreter" built in - like, well, IE4 and Netscape 4.)

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From dave at userland.com Thu Feb 4 17:12:31 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
In-Reply-To: <9A4DF69E3C5ED211B86400A0C9D1776084720D@thor.operations.bluestone.com>
Message-ID: <3.0.6.32.19990204091358.00f17c10@scripting.com>

>>In addition to XwingML, we have announced Bluestone's XML-Server, the first
>>generally available Dynamic XML Server

I would like to know what this means.

I am concerned that you're blowing right by our product, and that would not
be appreciated.

How is your XML server different from Frontier 5.1?

Dave

From Matthew.Sergeant at eml.ericsson.se Thu Feb 4 17:20:12 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AF4@eukbant101.ericsson.se>

> -----Original Message-----
> From: Mark Birbeck [SMTP:Mark.Birbeck@iedigital.net]
>
> Great idea. Can I suggest we call it HTML?
>
> (OK I know I've cracked that one already - it's the end of the week. But
> really, if you want an XML specification for a user interface, surely
> HTML 4.0 is the one to choose. Then you could use a 'cool browser' that
> has an "XML GUI Interpreter" built in - like, well, IE4 and Netscape 4.)
>
HTML 4 isn't quite up to providing a full GUI, even with the DOM. For
example, you can't do menus. You can't do buttons with images on them (I'm
not talking about images that are buttons). You can't do tabbed dialogs
(well, you can, but it's non-trivial). I'm sure there are other things.

Of course you could add these things into HTML 5, but I don't think it's
worth going down that road.

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

From shecter at darmstadt.gmd.de Thu Feb 4 17:29:16 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
References: <A26F84C9D8EDD111A102006097C4CD0D05495C@SOHOS002>
Message-ID: <4EB4208B.E26E66ED@darmstadt.gmd.de>

Mark Birbeck wrote:

> (OK I know I've cracked that one already - it's the end of the week. But
> really, if you want an XML specification for a user interface, surely
> HTML 4.0 is the one to choose. Then you could use a 'cool browser' that
> has an "XML GUI Interpreter" built in - like, well, IE4 and Netscape 4.)

Hi,

(I found the "HTML" comment very funny, btw.
:)

I'm thinking of something much more heavy-duty. Like, I want to support a
complex user interface that has many complex widgets, with flexible event
wiring, that must make accesses to back end data. Something like the UI for
a Bloomberg Box, or Microsoft Encarta.

This in itself is "easy", and typically done by having a platform-specific
UI that retrieves domain objects from a datastore/business logic middle tier.

More interesting and challenging would be to make the UI/backend split
"higher up". Do all the layout work on the backend, and just send a
platform-independent description of the UI over the wire.

And -that- is what's led me to look for a Component Markup Language.

- Robb

From nwoh at software-ag.de Thu Feb 4 17:31:17 1999
From: nwoh at software-ag.de (Nigel Hutchison)
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
In-Reply-To: <A26F84C9D8EDD111A102006097C4CD0D05495C@SOHOS002>
Message-ID: <3.0.6.32.19990204182918.00929ec0@daemsg01>

At 04:47 PM 2/4/99 -0000, Mark Birbeck wrote:
>Nigel Hutchison wrote:
>> If there was a set of C++ classes that interpreted it and
>> built up a GUI
>> with the same look and feel, that might be useful.
>> Then a server could project its own GUI api and leave it to
>> the client side
>> to choose C++, Java etc to build the interface at run time.
>> I suspect that
>> in most cases the bandwidth required to shift the XML GUI
>> description would
>> be quite a bit less than the
>> compiled java classes. If this is so than "Even Cooler" would
>> be XML GUI
>> interpreter built into a Internet Browser.
>>
>> But am I running too far with this one?
>
>Great idea. Can I suggest we call it HTML?

I knew you would say that. :-)

>
>(OK I know I've cracked that one already - it's the end of the week. But
>really, if you want an XML specification for a user interface, surely
>HTML 4.0 is the one to choose. Then you could use a 'cool browser' that
>has an "XML GUI Interpreter" built in - like, well, IE4 and Netscape 4.)
>

The trouble is that HTML 4.0 + Dynamic HTML + IE4 + Netscape (with back
buttons etc.) still doesn't quite cut the mustard as far as GUI interfaces
are concerned. You can get some good effects if you put a lot of work in,
but the result is not very portable or robust. I would also like my GUI
interface to have reasonable session control as well. Current browsers are
very dodgy when it comes to sessions. I do get SSL for free.

But if I use Java applets I get the other extreme. I have to code the GUI in
Java, and upload the GUI code, and the security and encryption classes etc.,
and hope it will work on Netscape and Microsoft browsers.

regards

Nigel Hutchison

Nigel W. O. Hutchison
Technical Consultant
Software AG Germany
mailto:nwoh@software-ag.de
Tel +49 (0)6151 92 1207

From david at megginson.com Thu Feb 4 17:35:52 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:35 2004
Subject: Namespaces does *not* formally introduce "global attributes"
In-Reply-To: <01BE5066.010F2A90@grappa.ito.tu-darmstadt.de>
References: <01BE5066.010F2A90@grappa.ito.tu-darmstadt.de>
Message-ID: <14009.55296.506692.837082@localhost.localdomain>

<summary>
<em>Please</em>, everyone, stop worrying about
<socalled>partitions</socalled>!
</summary>

Ronald Bourret writes:

 > 1) Tim Bray's statement that unprefixed attributes do not belong to
 > an XML namespace derives from (b). Since there is no prefix to
 > associate them with an XML namespace, and the default XML
 > namespace doesn't apply, there is simply no association.

XML attributes have an automatic association with the element on which
they appear, so you have that to fall back on (is that what you were
getting at in the text I snipped out?).

 > 2) As far as I can tell, there is nothing in the normative part of
 > the spec that would lead us to conclude the existence of a global
 > attribute partition that is separate from the per-element-type
 > partitions, as is described in A.2.

The reason that partitions are not mentioned in the normative part of
the spec is that they haven't really been thought out yet -- that's
really part of the schema work. Think of appendix A.2 as "here are
some of our preliminary speculations to show you what we were talking
about when we designed namespaces".
The Namespaces spec just lets you construct globally-unique names; it
does not tell you anything about how to interpret those names.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From nikita.ogievetsky at csfb.com Thu Feb 4 17:45:12 1999
From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita)
Date: Mon Jun 7 17:08:35 2004
Subject: Namespace clashes?
Message-ID: <9C998CDFE027D211B61300A0C9CF9AB44246E7@SNYC11309>

I thought of namespace prefixes as aliases.
If a parser is capable of resolving alias <=> alias definition, then it
should not matter which alias one uses to refer to the same namespace?

Nikita

Ronald Bourret wrote:

>> Imagine A, B, C, D and E are all users of some XML data. A and B exchange
>> data and agree to use A's namespace and that "<an:t>title</an:t>" is a
>> book title (among other things).
>>
>> Elsewhere, C and D also exchange data and agree to use D's namespace and
>> that "<dn:t>title</dn:t>" is a book title.
>>
>> 1. E comes along and wants to create data and exchange data with A, B, C
>> and D. What does E use? If E creates a new NS "<en:t>title</en:t>" then
>> A, B, C and D have to update their procedures to cater for this?

>E should use <an:t> to exchange data with A and B and <dn:t> to exchange
>data with C and D. If E creates their own DTD, then everybody who wants to
>exchange data with E needs to update their software, which is unlikely to
>make E very popular. A much better solution is to get everybody (A-E) to
>agree on a single set of tags in the first place.

From shecter at darmstadt.gmd.de Thu Feb 4 18:01:21 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:08:35 2004
Subject: CORBA's not boring yet. / XML in an OS?
Message-ID: <4EB4281B.662222A5@darmstadt.gmd.de>

Hi,

A little while ago on this list someone said they hoped that XML wasn't
going to have the fate of CORBA ... a standard that people asked too much
of, and that is now relegated to the world of boring, overhyped and
underused technologies.

Well, I was reminded of it when I read this month's Linux Journal - it
describes how -both- up and coming desktop environments are basing major
parts of their architectures on CORBA. KDE's so cool it makes me want to
learn C++. :)

Prediction: In 3 years, half the people on this list will be using a
CORBA-based desktop environment.

Anyhow, this naturally makes me wonder - could XML and related ideas like
XSL have a place in an operating system? Where would they fit in? KDE and
Gnome could be great playgrounds for trying something like this out.

- Robb

From paul at prescod.net Thu Feb 4 18:05:00 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:35 2004
Subject: A weaker XSL?
References: <19990204162131Z366980-4903+14@calum.csclub.uwaterloo.ca>
Message-ID: <36B9D8AA.7B2CFB4B@prescod.net>

Peter Seibel wrote:
>
> At 05:17 AM 2/4/99 , Michael.Kay@icl.com wrote:
> >> Is there a possibility for creating a sub-set of
> >> XSL that would work on a stream instead of
> >> requiring a complete document object?
> >
> >Funny you should ask that, I've been experimenting over the last few days to
> >see whether I could build such a thing on top of SAXON. Not actually easy to
> >do as a pure subset, but I think one can do something that feels quite
> >XSL-like.
>
> So I obviously lack imagination or understanding of all the intricacies of
> XSL -- what can you express in XSL that you couldn't implement on top of SAX?

How do I build a table of contents that is output BEFORE the document
without reading the whole input before I start to output?

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did,
but she did it backwards and in high heels."
--Faith Whittlesey

From tbray at textuality.com Thu Feb 4 18:34:34 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:35 2004
Subject: Restricted Namespaces for XML
Message-ID: <3.0.32.19990204095507.00bb99c0@pop.intergate.bc.ca>

At 12:16 AM 2/4/99 -0800, Don Park wrote:
>If I read James Clark's message correctly, I believe he is in favor of
>chopping out some features out of the "Namespaces for XML" as well as some
>sections so I believe we might be on a good track.
>
>Another crazy idea I had was to remove the use of URI while keeping the
>basic style.

Urgh. I think that loses 95% of the benefit. My idea had always been
that with namespaces, I'd be able to write a little SAX or DOM code that
could reliably pick "my" elements and attributes out of the middle of
anybody's document, anywhere, and do "my" stuff on them; simply by
keying off the namespace URI. -Tim

From tbray at textuality.com Thu Feb 4 18:34:33 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
Message-ID: <3.0.32.19990204102409.00b73e10@pop.intergate.bc.ca>

At 09:13 AM 2/4/99 -0800, Dave Winer wrote:
>>>In addition to XwingML, we have announced Bluestone's XML-Server, the first
>generally available Dynamic XML Server
>
>I would like to know what this means.

It means nothing, like most windy marketing BS. There have now been
half a dozen announcements of "The First XML Server/Repository/" or
permutations of that name. -Tim

From tyler at infinet.com Thu Feb 4 18:57:42 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:35 2004
Subject: A weaker XSL?
References: <199902041016.EAA06857@trinkpad.valleytel.net>
Message-ID: <36B9ECED.38BE05E9@infinet.com>

Nathan Kurz wrote:

> Also, while the entire stream has to have been read, does it have to
> have already been processed? The way I was interpreting the spec,
> the DOM model didn't exclude a lazy processing method. So long as an
> implementation provides a compliant interface, can't it do anything it
> wants with the data, even so far as to put off processing information
> until it is requested?

Actually this is legal, just that when you iterate over nodes, a Node needs
to return the expected data. If you are reading things from a stream, then
you obviously cannot just make random accesses throughout the stream to
lazily evaluate your data, because streams are inherently sequential. For a
DOM document that is merely an interface to a DBMS, this lazy data model
approach you suggest may indeed be the optimal solution for that particular
implementation.

> I had hoped for an extremely lazy DOM implementation that would
> maintain information about all but the root level nodes in a 'flat'
> unprocessed state until a request for that information is made. For many
> cases (well, at least the ones I'm envisioning) such an implementation
> would be much more efficient than an entirely pre-processed one. Is
> this sort of implementation just right out of the question?

No. Sorry if I confused you.

>
> As for the second statement (regarding XSL), could these constraints
> be more explicitly laid out? While I can see that arbitrary XSL might
> require a fully constructed tree, couldn't one come up with many cases
> where a partially constructed tree would be sufficient? For example,
> what if your style sheet had only the following template:
>
> <xsl:template match="/match">
> Found a match!
> </xsl:template>
>
> Would one still have to fully construct your tree ahead of time?
>
> Hoping I'm not too far off base but half-expecting that I must be,
>
> nathan kurz
> nate@valleytel.net

From cowan at locke.ccil.org Thu Feb 4 19:01:21 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:35 2004
Subject: Fw: Namespaces
References: <007f01be5023$1a9d33a0$5402a8c0@oren.capella.co.il>
Message-ID: <36B9EE58.E4CD419F@locke.ccil.org>

Oren Ben-Kiki scripsit:

> Is it possible to recast the namespace recommendation as a transformation
> from an XML tree with 'xmlns' attributes and '...:' prefixes into a tree
> which doesn't have them, but with modified element and attribute names, such
> that the semantics of the resulting tree under the rest of the relevant
> recommendations (ignoring namespaces) is preserved?

Of course. Furthermore, it is possible in one pass. Check my
namespace filter at http://www.ccil.org/~cowan/XML/NamespaceFilter.java .

> Note that this may require defining a textual form for the transformed tree
> (using "...^...", or "{..}..", or whatever).

And so I do.

> If so, then we'll have a clear definition of just how to add a namespace
> processor on top of a normal XML processor.

Proof by example.
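
A minimal sketch of such a one-pass rewrite, for illustration only: this is not the code in NamespaceFilter.java, and the function name expand_names and the "{uri}local" output form (one of the textual forms suggested in the quoted question) are assumptions. It keeps a stack of xmlns declarations while a non-namespace-aware parser streams start-tags, and rewrites element and attribute names as it goes.

```python
import xml.parsers.expat


def expand_names(xml_text):
    """Return [(element_name, {attr_name: value}), ...] in document order,
    with names rewritten to the hypothetical "{uri}local" form."""
    # Stack of prefix -> URI maps; the "xml" prefix is always bound.
    scopes = [{"xml": "http://www.w3.org/XML/1998/namespace"}]
    out = []

    def qualify(name, scope, is_attribute):
        prefix, colon, local = name.partition(":")
        if colon:
            # Prefixed name: look the prefix up in the current scope.
            # (An undeclared prefix raises KeyError in this sketch.)
            return "{%s}%s" % (scope[prefix], local)
        # Unprefixed name: the default namespace applies to elements only.
        uri = None if is_attribute else scope.get("")
        return "{%s}%s" % (uri, name) if uri else name

    def start(name, attrs):
        scope = dict(scopes[-1])  # inherit the enclosing declarations
        plain = {}
        for key, value in attrs.items():
            if key == "xmlns":
                scope[""] = value          # default namespace declaration
            elif key.startswith("xmlns:"):
                scope[key[6:]] = value     # prefix declaration
            else:
                plain[key] = value         # ordinary attribute
        scopes.append(scope)
        out.append((qualify(name, scope, False),
                    {qualify(k, scope, True): v for k, v in plain.items()}))

    def end(name):
        scopes.pop()  # declarations go out of scope with their element

    # ParserCreate() without a namespace separator does no namespace
    # processing, so names arrive exactly as written in the document.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return out
```

Note that unprefixed attributes are passed through unchanged, since the default namespace does not apply to them; only prefixed attributes and in-scope element names receive a URI-qualified form.
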
> These two statements seem contradictory to me - if "all" the namespaces do
> is prefix the names with a URI [...].

More accurately: it shows how to map names that contain a colon to URI-prefixed form; some colon-free element names are also given URI-prefixed equivalents, but some are not; attribute names without colons are not given URI-prefixed forms.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From CBenedet at Bluestone.com Thu Feb 4 19:04:16 1999
From: CBenedet at Bluestone.com (Benedetto, Christopher)
Date: Mon Jun 7 17:08:35 2004
Subject: Component Markup Language
Message-ID: <9A4DF69E3C5ED211B86400A0C9D17760847220@thor.operations.bluestone.com>

The Bluestone XML-Server can dynamically generate and receive XML documents. It can also enable legacy integration for the enterprise very cost-effectively.

The reason why we make the distinction of the dynamic XML server is that it's not like other products out there. There are a lot of XML servers that are focused on being repositories and content managers. Dynamic XML servers generate XML documents that are very short-lived - they get sent to a browser for display, or to another application to be used, or to another XML server to do enterprise application integration.
Specifically, the Bluestone XML-Server connects to any DB via JDBC, supports RMI, IIOP, SSL and HTTP protocols, can be exposed as an EJB, runs in any JVM, and in conjunction with Bluestone's Sapphire/Web application server, can dynamically scale to 100 million interactions and integrate with nearly a dozen back-end business data objects (SAP, Peoplesoft, CICS, MQSeries, etc).

-----Original Message-----
From: Tim Bray [mailto:tbray@textuality.com]
Sent: Thursday, February 04, 1999 1:34 PM
To: Dave Winer; xml-dev@ic.ac.uk
Subject: RE: Component Markup Language

At 09:13 AM 2/4/99 -0800, Dave Winer wrote:
>>>In addition to XwingML, we have announced Bluestone's XML-Server, the first
>generally available Dynamic XML Server
>
>I would like to know what this means.

It means nothing, like most windy marketing BS. There have now been half a dozen announcements of "The First XML Server/Repository/" or permutations of that name.
-Tim

From clark.evans at manhattanproject.com Thu Feb 4 19:10:31 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:36 2004
Subject: A weaker XSL?
References: <19990204162131Z366980-4903+14@calum.csclub.uwaterloo.ca> <36B9D8AA.7B2CFB4B@prescod.net>
Message-ID: <36B9EFD2.62A42FA@manhattanproject.com>

Paul Prescod wrote:
> Peter Seibel wrote:
> > >> Is there a possibility for creating a sub-set of
> > >> XSL that would work on a stream instead of
> > >> requiring a complete document object?
> > >Funny you should ask that, I've been experimenting over the last few days to
> > >see whether I could build such a thing on top of SAXON. Not actually easy to
> > >do as a pure subset, but I think one can do something that feels quite
> > >XSL-like.
> > So I obviously lack imagination or understanding of all the intricacies of
> > XSL -- what can you express in XSL that you couldn't implement on top of SAX?
>
> How do I build a table of contents that is output BEFORE the document
> without reading the whole input before I start to output?
>

Exactly :) With a weak-XSL you couldn't build the table of contents BEFORE the entire stream is read; the same goes for sorting, which can't be done before the entire stream is read.

However, it can do other things, like turning a <catalog> into a <table>, building an index or trailing table of contents, etc. Thus it still has use for translating XML into HTML and other output forms, while not requiring the memory and processing overhead.

Also, if you really needed the stream sorted or a table of contents first, there is nothing preventing the producer of the stream (a database?) from doing this.

Don Park wrote:
>
> The DOM spec does NOT require the entire stream to be read before the
> document object is returned. Some of the DOM implementations available
> today does indeed process the entire stream before returning but that is a
> quality of implementation issue.

Really?
DOM specification:
>
> interface Node {
>   readonly attribute NodeList childNodes;
>   readonly attribute Node lastChild;
> };
>
> interface NodeList {
>   Node item(in unsigned long index);
>   readonly attribute unsigned long length;
> };

It seems that these two in combination would make a recursion that would require the entire stream to be read before the object would be usable.

I guess you _could_ make these attributes smart methods that would BLOCK if the information required to satisfy the request was not yet available. But then, anyone using the DOM would have to use these attributes understanding that any one of them may block for a minute or more while the stream "catches up"...

I'm not sure that this is valuable, especially when SAX provides such a nice stream-oriented interface to XML documents.

:) Clark

From ti64877 at imcnam.sbi.com Thu Feb 4 19:12:19 1999
From: ti64877 at imcnam.sbi.com (Ingargiola, Tito)
Date: Mon Jun 7 17:08:36 2004
Subject: CORBA's not boring yet. / XML in an OS?
Message-ID: <3994C79D0211D211A99F00805FE6DEE249BF77@exchny15.corp.smb.com>

Hi,

I'm certainly not ready to condemn CORBA to the realm of "boring, overhyped and underused technologies" (though I do have my moments...), and I don't know too much about GNOME and KDE specifically, but I certainly think that there is a great deal of potential synergy between CORBA and XML/DOM in the "application server" space.

I believe that many of the people looking at XML (and the alphabet soup of related technologies...)
are looking to implement systems which: o deal with databases of various sorts o present UIs via the web o provide programmatic/non-web interfaces to external systems and XML (and friends) promise to help in each of these areas. Database vendors are (talking about) providing an XML layer on their DBMSs, web browser and server vendors are driving many of these standards efforts, and XML's applicability to workflow-type problems is possibly its key selling attraction. This, to me, all points to the fact that XML is going to become a key element in the "application server" space. I don't think there are any surprises here. The fact that the DOM provides IDL mappings (albeit barely workable ones :-/), *combined* with the fact that application servers have obvious scalability problems without some means of distribution (e.g., CORBA, DCOM, &tc) tells me that this space is the key area in which we'll see CORBA and XML (&tc) playing together. Another area (where these families of technologies meet) which has generated a lot of attention is as one of the mechanisms used to provide a "metadata" description of a software component (e.g., a javaBean or a (D)COM object), but this seems to me rather less generally interesting for people looking to implement systems in the short term with characteristics like those described above. (For component tools vendors, however ...) What is more interesting (to me, at least ;-) is trying to envision what our XML-friendly, distributed application server is going to look like. What kinds of services do we need to provide in our distributed environment to best leverage these technologies? I certainly imagine that we'll have some means of querying a DB and receiving a Document, DocumentFragment, or (shudder) NodeList as a result of this query. Further, we'll be able to "push" this object up onto our distribution mechanisms "bus" (to use a CORBAism). 
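The "query a DB and receive a Document" idea above might be sketched roughly as follows in Java. This is only an illustration under stated assumptions: the class `RowsToDocument`, its `toDocument` method, and the element names passed to it are hypothetical, the rows are assumed to arrive as plain maps of column name to value, and the column names are assumed to be legal XML element names. Only the standard `javax.xml.parsers` and `org.w3c.dom` interfaces are used.

```java
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical service: wraps already-fetched query rows in a DOM Document. */
final class RowsToDocument {

    /**
     * Builds <rootName><rowName><column>value</column>...</rowName>...</rootName>.
     * Column names are assumed to be valid XML element names.
     */
    static Document toDocument(String rootName, String rowName,
                               List<Map<String, String>> rows) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            // No usable DOM implementation is configured in this JVM.
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement(rootName);
        doc.appendChild(root);
        for (Map<String, String> row : rows) {
            Element rowElement = doc.createElement(rowName);
            for (Map.Entry<String, String> column : row.entrySet()) {
                // One child element per column, holding the value as text.
                Element cell = doc.createElement(column.getKey());
                cell.setTextContent(column.getValue());
                rowElement.appendChild(cell);
            }
            root.appendChild(rowElement);
        }
        return doc;
    }
}
```

The resulting Document could then be handed to whatever distribution mechanism is in use; how it would travel over a CORBA bus is left open here, as it is in the message.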
I also imagine that we'll have a set of services for manipulating these objects available on our bus: transformations (for sorting, searching, retargeting to a different DOM model (e.g., HTML), &tc), formatting (think the formatting side of XSL), reference resolution (e.g., get the appropriate stylesheet, DTD or XLink-ed element in this document), and undoubtedly others. Our application server will clearly have one or more gateways from/to (e.g., through) a Web Server, and may well provide tools to help automate the development of non web-oriented interfaces to the system. Naturally, there are a lot of question marks remaining here (performance looms large in my mind), but I feel pretty safe saying this is one of the key places we're going with XML, and CORBA and other distribution technologies share a pretty prominent role in this space. I'd be interested to hear how others envision this hypothetical (for now) application server. Regards, Tito. > ---------- > Subject: CORBA's not boring yet. / XML in an OS? > > Hi, > > A little while ago on this list someone said they hope that XML wasn't > going to have the fate of CORBA ... a standard that people asked too > much of, and that is now relegated to the world of boring, overhyped and > underused technologies. > > Well, I was reminded about it when I read this month's Linux Journal - > it describes how -both- up and coming desktop environments are basing > major parts of their architectures on CORBA. KDE's so cool it makes me > want to learn C++. :) > > Prediction: In 3 years, half the people on this list will be using a > corba-based desktop environment. > > Anyhow, this naturally makes me wonder - could XML and related ideas > like XSL have a place in an operating system? Where would they fit in? > KDE and Gnome could be great playgrounds for trying something like this > out. > > - Robb > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From dave at userland.com Thu Feb 4 19:16:37 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun 7 17:08:36 2004
Subject: Component Markup Language
In-Reply-To: <9A4DF69E3C5ED211B86400A0C9D17760847220@thor.operations.bluestone.com>
Message-ID: <3.0.6.32.19990204111937.00f439b0@scripting.com>

>Dynamic XML servers generate XML documents that are very short-lived - they get sent to a browser for display, or to another application to be used, or to another XML server to do enterprise application integration.

After Tim Bray's response, I'm not sure much more needs to be said, other than we've been doing that, starting in late 1997. We've not been shy about our accomplishments, a simple search of the XML-DEV list or AltaVista would have revealed that you were not first, not by a long shot.

One other thing, XML, in my view, is about compatibility, so being first is worse than meaningless, it says you have nothing to offer in the way of compatibility.

Dave xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Thu Feb 4 19:17:44 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:36 2004 Subject: MS patents style sheets Message-ID: <199902041916.OAA17422@hesketh.net> I just heard about this on another list; info came from the Seybold Report. Elliotte Rusty Harold's Cafe Con Leche (http://metalab.unc.edu/xml/) picked it up, and has a pointer to the patent: http://www.patents.ibm.com/details?pn=US05860073__&language=en Interesting times, interesting times... Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Thu Feb 4 19:20:28 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:36 2004 Subject: Restricted Namespaces for XML References: <01BE5040.5CB976A0@grappa.ito.tu-darmstadt.de> Message-ID: <36B9F233.8B193A44@infinet.com> Ronald Bourret wrote: > Don Park wrote: > > > Such a spec might dictate that all namespace declarations be at the root > > element (XML fragments are problematic but...). This restriction has the > > side effect of not allowing duplicate prefixes. 
>
> The major benefit of this proposal is that it reduces the number of checks
> for xmlns attributes. This savings is minimal in small documents or
> documents with few attributes, but it would be interesting to know how much
> xmlns attribute processing costs in a large, attribute-intensive document.
>
> I think readability is a wash, as you can't do anything more with this than
> you can with the current proposal and you lose the ability to have multiple
> default namespaces, which are useful in documents that have long sections
> alternating between two or more namespaces.
>
> I think the biggest problem is, as James Clark noted elsewhere,
> complication of fragmentation. Since I believe fragments to be a big part
> of the future, I don't like anything that will make them harder.
>
> That said, if anybody had some real numbers about what xmlns attribute
> processing costs in the worst case and this turned out to be significant,
> it might be useful to have a PI that tells the namespace processor whether
> it needs to look beyond the root element.

At the parser level, things are not very expensive if you check each attribute name you parse to see if it is the string "xmlns" or else starts with the string "xmlns:". However, directly on top of SAX (or done as a filter on top of an XML Parser) you in effect need to, for every element, search the entire attribute list for attributes with the name "xmlns" or else something that starts with "xmlns:".

You in effect need to do something like:

    public void startElement(String name, AttributeList attributes) throws SAXException {
        int length = attributes.getLength();
        String attributeName;
        for (int i = 0; i < length; i++) {
            attributeName = attributes.getName(i);
            if (attributeName.equals("xmlns")) {
                // Do default namespace processing
            } else if (attributeName.startsWith("xmlns:")) {
                // Do namespace processing
            }
        }
    }

If you have documents which have very few attributes in the elements, then this is not too expensive.
If you have documents with many attributes (like HTML literal result elements in XSL) then things can get pretty hairy.

If SAX were to make a simple requirement that all strings that represent symbols (like names) were to be interned then things would be a lot cheaper. The same can be said of the DOM as well. Even though there is a relatively small cost to interning all Names in the DOM, being able to test for identity instead of equality means that operations like namespace processing are not as big a performance problem anymore (plus it can help out in performance areas for a lot of applications in the general case).

Tyler

From tbray at textuality.com Thu Feb 4 19:22:08 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:36 2004
Subject: Fw: Namespaces
Message-ID: <3.0.32.19990204111758.00b71980@pop.intergate.bc.ca>

At 02:00 PM 2/4/99 -0500, John Cowan wrote:
>> These two statements seem contradictory to me - if "all" the namespaces do
>> is prefix the names with a URI [...].
>
>More accurately: it shows how to map names that contain a colon
>to URI-prefixed form; some colon-free element names are also
>given URI-prefixed equivalents, but some are not; attribute names
>without colons are not given URI-prefixed forms.

Thank you, John. Will those full of namespace angst please study John's paragraph closely; some pedants might want to say "qualified" rather than "prefixed", but prefix will do.

I guess I'm too close to the problem now to be able to explain how little the namespace spec does. Or at least be believed when I do.

-Tim xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From clark.evans at manhattanproject.com Thu Feb 4 19:30:32 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:36 2004
Subject: A weaker XSL?
References: <199902041016.EAA06857@trinkpad.valleytel.net> <36B9ECED.38BE05E9@infinet.com>
Message-ID: <36B9F44D.69B65B4@manhattanproject.com>

Tyler Baker wrote:
>
> Nathan Kurz wrote:
>
> > Also, while the entire stream has to have been read, does it have to
> > have already been processed? The way I was interpreting the spec,
> > the DOM model didn't exclude a lazy processing method. So long as an
> > implementation provides a compliant interface, can't it do anything it
> > wants with the data, even so far as to put off processing information
> > until it is requested?
>
> Actually this is legal, just that whenever you iterate over nodes,
> a Node needs to return the expected data.
> If you are reading things from a stream, then you obviously cannot
> just make random accesses throughout the stream to lazily evaluate
> your data because streams are inherently sequential.

Yep. Perhaps the weak-XSL could be based upon SAX instead, then you won't be surprised.

In this case, the XSL processor would contain both a DOM and SAX implementation. If the XSL sheet was "weak", then the processor could implement its processing from the SAX output. Otherwise, if it is the "strong" variant, with sorting and table of contents, etc, then it would use the DOM implementation.

Also, if the processor was built upon a memory image, then SAX becomes an Iterator over the DOM object.
If the processor was built upon a stream input, then the DOM object is constructed from the SAX output. This seems like it would be a really nice balance.

> For a DOM document that is merely an interface to a DBMS, this lazy
> data model approach you suggest may indeed be the optimal solution
> for that particular implementation.

Perhaps. However, many relational databases use a "stream" based approach internally, especially in cases with large merge/joins. The result set is definitely returned in a stream; having a bunch of small queries would bring performance to a crawl since set operations could not be used effectively. This behavior definitely depends upon the query and how it is optimized. Thus, you may find that a SAX-like stream based interface on top of a relational database may perform much better than a DOM-like object based interface!

Perhaps a hybrid object/stream solution would work the best when a relational database is a primary data source...

:) Clark Evans

From cowan at locke.ccil.org Thu Feb 4 19:39:08 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:36 2004
Subject: Fw: Namespaces
References: <002701be5050$1b09deb0$5402a8c0@oren.capella.co.il>
Message-ID: <36B9F6FE.124BCDE8@locke.ccil.org>

Oren Ben-Kiki wrote:
> Why wasn't the DTD spec extended to handle global attributes (regardless of
> namespace issues) instead of sneaking them in the namespace spec? Global
> attributes make the same amount of sense for non-prefixed attributes within
> a single XML language.
This is the root of the problems here - _what is the
> benefit gained from placing this in the namespace spec_? Is there one? Is it
> worth it? Can we revoke it?

I think that some of your problems come from misunderstanding the term "global attribute": unfortunately, the examples in REC-xml-names don't help.

A "global attribute" is *not* an attribute that can be placed on every element in a document, or at least it need not be. A global attribute is simply one which has by intention the same meaning in all elements in which it appears.

To reuse my earlier example, given an appropriate binding for the "iso4217" prefix, one might have a global attribute "iso4217:currency" which applies to only the PRICE element in a supply catalog. The "iso4217" prefix signals the application that it can use a *global* understanding of currency --- namely, that the value of this attribute is an ISO 4217 currency code (USD, GBP, EUR, etc.).

The example of html:class is unfortunately "global" in both the intended sense (its meaning is independent of the element on which it appears) and the unintended sense (it may appear on any element). So it is a bad example for explaining global attributes.

From a DTD viewpoint, global attributes are just ordinary attributes, valid only where an ATTLIST declaration allows them.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Thu Feb 4 19:52:21 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:36 2004 Subject: CORBA's not boring yet. / XML in an OS? References: <4EB4281B.662222A5@darmstadt.gmd.de> Message-ID: <36B9FA0C.11FBFBC@infinet.com> Robb Shecter wrote: > Hi, > > A little while ago on this list someone said they hope that XML wasn't > going to have the fate of CORBA ... a standard that people asked too > much of, and that is now relegated to the world of boring, overhyped and > underused technologies. That was me... > Well, I was reminded about it when I read this month's Linux Journal - > it describes how -both- up and coming desktop environments are basing > major parts of their architectures on CORBA. KDE's so cool it makes me > want to learn C++. :) I think this may end up being a major mistake. IIOP is a nice protocol, but other than that, CORBA is really only useful for integrating client/servers of differing operating environments and hardware. > Prediction: In 3 years, half the people on this list will be using a > corba-based desktop environment. Not likely. My biggest problem with CORBA was that it was too huge for the client and consumed too many resources. Maybe it was just the Visigenic implementation I was using at the time, but plain and simple CORBA is not lightweight. CORBA is one of those things which I feel could be made better by stripping a lot of the rarely used stuff out. This is not a CORBA list so I will pretty much end this here... > Anyhow, this naturally makes me wonder - could XML and related ideas > like XSL have a place in an operating system? 
Where would they fit in?
> KDE and Gnome could be great playgrounds for trying something like this
> out.

They already do if you consider Internet Explorer a fundamental part of the Windows Operating System (-:

Tyler

From Mark.Birbeck at iedigital.net Thu Feb 4 20:21:45 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:36 2004
Subject: Component Markup Language
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054960@SOHOS002>

Robb Shecter wrote:
> How about this: Have one XSL document per client side
> scripting language. That is, different
> XSL documents could implement:
>
> Component-XML->DHTML (This particular one would
> give you your "Even Cooler" idea.)
>
> Component-XML->Java BeanShell
> Component-XML->Smalltalk
> etc...
>
> ...then, a client side browser applies whatever XSL template
> is best for it, gets the
> resulting kind of script it understands, and on the fly
> generates a UI.

We do something similar for browser types, by generating all XSL documents through an ASP page. In our case, we detect the browser type, and change the rules in the stylesheet dynamically. Your scenario would benefit too, because you wouldn't need to send loads of stylesheets to the client - just one, but one that has been dynamically created.

(I don't know why I'm responding because I still disagree with the whole idea of a component XML! By the time anyone has come up with a useful standard that *does* cope with the minor inefficiencies that have been mentioned, HTML 5 will have been along and will have resolved it.
If it really is necessary then I would use HTML 4.0 as the base for the component-XML syntax, and then map *that* to Java interfaces, or whatever. In other words, write a Java (or whatever) interface that understands HTML, plus your few extensions.)

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From eliot at dns.isogen.com Thu Feb 4 20:24:26 1999
From: eliot at dns.isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:08:36 2004
Subject: Proposal: New Syntax for DSSSL (was Re: MS patents style sheets)
In-Reply-To: <199902041916.OAA17422@hesketh.net>
Message-ID: <3.0.5.32.19990204142337.00a40100@amati.techno.com>

At 02:19 PM 2/4/99 -0500, Simon St.Laurent wrote:
>I just heard about this on another list; info came from the Seybold Report.
>
>Elliotte Rusty Harold's Cafe Con Leche (http://metalab.unc.edu/xml/) picked
>it up, and has a pointer to the patent:

This sounds like the infamous "multimedia" patent that someone tried to get a few years back (don't remember the details).

This thread triggered me to propose something that I'd been tossing around in my head for a while:

Why not define a new syntax for DSSSL as a public, open-source activity? In particular, why not define a Python-based expression syntax?

Why:

1. DSSSL is very powerful. DSSSL is well thought out. DSSSL is an established and stable international standard. DSSSL is implemented (jade, <www.jclark.com/jade>, and HyBrick <http://www.fsc.fujitsu.com/hybrick/>).

2.
DSSSL's expression syntax, which is scheme-based, is a serious barrier to acceptance and use. While the use of scheme makes perfect sense from a "what language best fits list processing?" standpoint, it is a non-starter from a "who can I hire that can write this stuff?" standpoint. If you can program in Java or VB or Perl, you can learn Python in about an hour. Learning the scheme-based language is much more difficult, because it asks you to move from a procedural approach to a functional approach. This seems to be a fairly high barrier for a lot of programmers (it was for me).

3. Python combines a fundamental list awareness with a familiar and easy-to-learn syntax. It provides true, easy-to-use (and learn) object orientation. You can do functional or procedural programming as you prefer.

From a programming standpoint, it wouldn't be very hard to implement the DSSSL semantics in Python. I think it would require the following:

1. Development of a rule-firing layer (the part of a DSSSL engine that applies rules to grove nodes)

2. Implementation of at least the core DSSSL-defined functions

3. Generation of flow object trees

Of these, the last is probably the hardest, but should be able to re-use existing code (e.g., Jade).

A Python-based syntax for DSSSL, once implemented and proven, could be quickly standardized because it would simply be an alternative syntax for an existing standard--no need to define new semantics.

The only barrier to doing this is resources. We already have Python-accessible grove constructors for SGML and XML (Jade itself and TechnoTeacher's GroveMinder). In addition, any DOM implementation can be made to emulate a grove, so all the Java-based DOMs are immediately available.

[Note that I don't think using XML syntax for DSSSL expressions is very interesting given that the Python syntax is already well defined, stable, and has public and free parsers.]
In any case, having a style language that has no vendor intellectual property encumbrances suddenly seems a lot more compelling than perhaps it did yesterday.

Cheers,

E.
--
<Address HyTime=bibloc> W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com </Address>

From Mark.Birbeck at iedigital.net Thu Feb 4 20:54:02 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:36 2004
Subject: Component Markup Language
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054961@SOHOS002>

> One cannot rely on "HTML 5" as a solution to this problem. The next
> generation of HTML will not provide any such facilities. Information
> about the nature of the next generation HTML can be obtained
> by reading
> current Working Draft:
> http://www.w3.org/TR/1998/WD-html-in-xml-19981205/

Can people please be specific, so I can see if I have lost the plot here:

If I was to propose a component-XML DTD that began with a tag called HTML, then others called HEAD and BODY, and it had definitions for text fields, buttons, drop-boxes, images, tables and more, as well as being able to call up external programs and all the rest of it; what ADDITIONAL features is everyone saying they would want to be present in *their* component-XML that are not in 'mine'? (And I really think the lack of graphics on a button is insufficient justification for spending the next year writing an entire XHTML!)

I'm not saying that the user-interface has to then be a web browser - make it Java if you want.
All I'm saying is there is a very good syntax in place for defining user interfaces - why not use it?

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From Ed at dega.com Thu Feb 4 21:11:20 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun 7 17:08:36 2004
Subject: CORBA's not boring yet. / XML in an OS?
Message-ID: <30649320C177D111ADEC00A024E9F297169F33@exchange-server.dega.com>

This brings up a related question regarding XML and interchanging data between web servers. Is there something like IIOP for this? I mean an overlaying protocol to establish, manage and break bidirectional XML traffic.

In my scenario, I want to develop a published specification (read: DTD) to provide all the details of accomplishing an order in an e-commerce environment. In this case, business #1 (B1) wants to order material from business #2 (B2). The web server at B1 uses some middle-tier logic (Java) to determine how to get to the right B2. B2 has a web site for individual orders but can also take a standard order in XML (again, in my published DTD). But how does it do that? In other words, how do I push XML down the server's throat?

I know that a form could be used on B2 that had a field that could be filled in with a hyperlink back to the XML file on B1. Or the URL of B1's XML file could be placed in some CGI parameter.
Or some combination of FTP and CGI could be used (in cases where the requirements of B1's server prohibit retrieval of files this way, perhaps because of security). But it seems to me that this approach just utilizes normal HTTP, which seems to be ill-designed for bidirectional data flow. Is someone working on a more elaborate mechanism for this? Do we need an XIOP?

Just rambling....

Ed

Ed Howland
ed@dega.com
http://www.dega.com
"As your attorney, I advise you to take some adrenalchrome"

From oren at capella.co.il Thu Feb 4 22:10:14 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:08:37 2004
Subject: Fw: Namespaces does *not* formally introduce "global attributes"
Message-ID: <000e01be508a$67213110$5402a8c0@oren.capella.co.il>

>Oren Ben-Kiki writes:
> > AHA! So the namespaces proposal _DOES_ go beyond simply qualifying
> > names with URIs!

David Megginson <david@megginson.com> wrote:
>No it doesn't.

Well, it certainly got a lot of people confused about it. Luckily we are only wasting bandwidth as a result, instead of rain forest products :-)

>The comment to which Oren is replying contains a common (but
>understandable) mistake.
...
> The source of the misreading is the unfortunately obfuscatory appendix
>A.2, ...
>I will recommend dropping A.2 in future drafts.

That would be a relief :-)

>In any case, what does all this have to do with transformation?

Simple - if the XML namespaces recommendation had been defined by an equivalence to a well-defined textual transformation, then much confusion would have been avoided.
Take, for example, the question of how namespaces interact with the other XML standards: just expand the names first, then apply the other standards (with a very small number of exceptions, such as namespace patterns in XSL).

Additionally, implementers would have been able to easily add a namespace processing module on top of their current XML parsers (a SAX namespace expansion filter, for example, is trivial when implemented this way), _without changing the interfaces_. Future implementations might use better interfaces - such as APIs for accessing just the "namespace part" or the "local part" of an expanded name - but the point is that every XML application would go on working as it is, without any changes.

James mentioned in his paper that the WG deliberately decided not to go this way. Could you tell us why this decision was made?

An obvious consequence of this decision is that implementers were hesitant to implement namespaces this way. I hope this wasn't the reason :-) This happened because, first, it wasn't clear that this approach is in fact conformant. Well, we got this out of the way - James says it is, and you explained why (ignore the non-normative Appendix A). Second, because competing textual forms were suggested (the '^' notation, the '{}' notation). Since the WG wasn't kind enough to give us a standard notation, why can't we just pick one on our own and go with it? I'm partial to the '^' notation myself, since it is (i) shorter, (ii) easier to read, (iii) slightly easier to process, and (iv) more in tune with the conventional hierarchical naming schemes (using a separator instead of parentheses).

Actually, has anyone already written a SAX filter which implements namespaces this way? Is anyone interested? Let's settle this once and for all - "proof by implementation" :-)

Have fun,

Oren Ben-Kiki

From b.laforge at jxml.com Thu Feb 4 22:26:10 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:08:37 2004
Subject: Namespaces does *not* formally introduce "global attributes"
Message-ID: <007501be508c$b728cae0$c9a8a8c0@thing2>

>Actually, has anyone already written a SAX filter which implements
>namespaces this way? Is anyone interested? Let's settle this once and for
>all - "proof by implementation" :-)

Well, there's John Cowan's filter:

http://www.ccil.org/~cowan/XML/NamespaceFilter.java

Bill

From peter at weblogic.com Thu Feb 4 22:39:12 1999
From: peter at weblogic.com (Peter Seibel)
Date: Mon Jun 7 17:08:37 2004
Subject: A weaker XSL?
In-Reply-To: <36B9D8AA.7B2CFB4B@prescod.net>
References: <19990204162131Z366980-4903+14@calum.csclub.uwaterloo.ca>
Message-ID: <19990204224703222.AAA277@ashbury.weblogic.com@lawton>

At 09:28 AM 2/4/99 , Paul Prescod wrote:
>Peter Seibel wrote:
>>
>> At 05:17 AM 2/4/99 , Michael.Kay@icl.com wrote:
>> >> Is there a possibility for creating a sub-set of
>> >> XSL that would work on a stream instead of
>> >> requiring a complete document object?
>> >
>> >Funny you should ask that, I've been experimenting over the last few days to
>> >see whether I could build such a thing on top of SAXON. Not actually easy to
>> >do as a pure subset, but I think one can do something that feels quite
>> >XSL-like.
>>
>> So I obviously lack imagination or understanding of all the intricacies of
>> XSL -- what can you express in XSL that you couldn't implement on top of SAX?
>
>How do I build a table of contents that is output BEFORE the document
>without reading the whole input before I start to output?

Buffer the rest of the document. Or make two passes. Presumably your XSL processor can figure out what things need that sort of buffering or multi-pass treatment. Of course I'm building some stuff in memory, but it could be a lot less than the whole document as a DOM tree. For example, if my document is the Encyclopaedia Britannica and I want to output a (strange) document consisting of a list of all the articles followed by the full text of all the articles containing the word "pink", I only have to buffer the articles with the word "pink", which is presumably a lot less than the whole encyclopedia. Another way to look at things is to implement your XSL engine so it builds a complete tree of the *output* before it writes anything out.

I'm not arguing that you can implement XSL so it uses no heap, just that you could implement it on top of SAX rather than DOM. Or am I still missing something?

-Peter

--
Peter Seibel Perl/Java/English Hacker peter@weblogic.com

Is Windows98 Y2K compliant?

From clark.evans at manhattanproject.com Thu Feb 4 22:57:23 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:37 2004
Subject: A weaker XSL?
References: <19990204162131Z366980-4903+14@calum.csclub.uwaterloo.ca> <19990204224703222.AAA277@ashbury.weblogic.com@lawton>
Message-ID: <36BA2516.204F50F9@manhattanproject.com>

Peter Seibel wrote:
>
> I'm not arguing that you can implement XSL so it uses no heap, just that
> you could implement it on top of SAX rather than DOM. Or am I still missing
> something?

I think you are echoing my belief.

Question: Is XSL defining "style" instructions or "composition" instructions?

Things like sorting, re-arranging, table-of-contents generation, etc. are really large processing instructions that are more along the line of *what* to process, rather than *how* the information should be presented. Things like this could be moved into XQL or some other transformation language, leaving XSL a more pure "style"-oriented specification.

Thus XSL wouldn't be *generating* a table of contents; it would only let you choose whether you want to display it and, if it is displayed, how it is displayed: in green ink or red, bold or italic, Aa1i style or 1.1.1.1 style, etc.

By doing this, a weaker XSL would have a much more clearly defined role, would be less subject to "feature creep", and could be implemented on top of SAX instead of assuming (and requiring) a full DOM implementation.

Just $.02

;) Clark Evans

From paul at prescod.net Thu Feb 4 23:35:26 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:37 2004
Subject: A weaker XSL?
References: <19990204162131Z366980-4903+14@calum.csclub.uwaterloo.ca> <19990204224703222.AAA277@ashbury.weblogic.com@lawton> <36BA2516.204F50F9@manhattanproject.com>
Message-ID: <36BA282D.5545637F@prescod.net>

Clark Evans wrote:
>
> Things like sorting, re-arranging, table-of-contents
> generation, etc. are really large processing instructions
> that are more along the line of *what* to process, rather
> than *how* the information should be presented. Things like
> this could be moved into XQL or some other transformation
> language, leaving XSL a more pure "style" oriented
> specification.

XSL is the dominant transformation language for XML content. As far as standardization goes, XQL doesn't even exist.

> Thus XSL wouldn't be *generating* a table of contents,
> it would only let you choose if you want to display it,
> and if it is displayed, how it is displayed, in green
> ink or red, bold or italic, Aa1i style or 1.1.1.1
> style, etc.

You are describing CSS. The Web community decided that we needed XSL because CSS does NOT do sorting, re-arranging, TOC generation, cross referencing, etc. Your "weaker XSL" already exists and is called CSS.

Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey

From paul at prescod.net Thu Feb 4 23:37:54 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:37 2004
Subject: A weaker XSL?
References: <19990204162131Z366980-4903+14@calum.csclub.uwaterloo.ca> <19990204223858Z367814-4906+21@calum.csclub.uwaterloo.ca>
Message-ID: <36BA2770.9B7A1BFF@prescod.net>

Peter Seibel wrote:
>
> Buffer the rest of the document. Or make two passes.

You would have to make n passes, where n depends on the XSL stylesheet. As an existence proof of a non-tree implementation, you could avoid implementing a DOM if you re-parsed the document every time you needed to find some query result. But to get any kind of efficiency you need to do some darn sophisticated static analysis.

> Presumably your XSL
> processor can figure out what things need that sort of buffering or
> multi-pass treatment.

I have some ideas about how you would do it, but the market does not seem to be willing to pay as much for XSL implementations as it does for optimizing C++ compilers, so I'm not going to try an implementation. Furthermore, you would need to statically analyze the document AND the stylesheet, because links will affect what you need to keep in memory. Some people want to allow arbitrary JavaScript code inline, so it would be interesting to see you try static analysis if THAT gets added.

> Of course I'm building some stuff in memory but it
> could be a lot less than the whole document as a DOM tree. For example if
> my document is the Encyclopaedia Britannica and I want to output a (strange)
> document consisting of a list of all the articles followed by the full text
> of all the articles containing the word pink I only have to buffer the
> articles with the word pink which is presumably a lot less than the whole
> encyclopedia.

Certainly. But statically analyzing a stylesheet to figure out what to buffer is very tricky. It probably isn't as hard as the halting problem, but it is on the level of an optimizing compiler.

> Another way to look at things is to implement your XSL engine
> so it builds a complete tree of the *output* before it writes anything out.

Most of the time XSL's output is larger than the input.

> I'm not arguing that you can implement XSL so it uses no heap, just that
> you could implement it on top of SAX rather than DOM. Or am I still missing
> something?

Sure, you can implement it on top of SAX. It will probably run too slowly to be useful for most common kinds of documents, but you could do it.

Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey
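
[Editorial illustration] The "articles containing the word pink" case debated in this thread can be sketched directly on top of SAX, for this one hand-written transformation only. The sketch below is an assumption-laden example, not anything posted to the list: the element names `article` and `title`, the class name `PinkArticleFilter`, and the use of the later SAX2 `DefaultHandler` class from the JDK are all choices made here for illustration. It buffers one article at a time and keeps only the matching ones, which is the memory behaviour Peter describes; it does nothing to address Paul's point that deriving such a plan automatically from an arbitrary stylesheet needs static analysis.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects every title, but keeps full text only for "pink" articles. */
final class PinkArticleFilter extends DefaultHandler {
    final List<String> titles = new ArrayList<>();   // small: one entry per article
    final List<String> kept = new ArrayList<>();     // only articles mentioning "pink"
    private final StringBuilder article = new StringBuilder(); // text of the current article only
    private final StringBuilder title = new StringBuilder();
    private boolean inTitle;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (qName.equals("article")) {
            article.setLength(0);            // discard the previous article's text
        } else if (qName.equals("title")) {
            inTitle = true;
            title.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        article.append(ch, start, length);
        if (inTitle) title.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (qName.equals("title")) {
            inTitle = false;
            titles.add(title.toString());
        } else if (qName.equals("article") && article.indexOf("pink") >= 0) {
            kept.add(article.toString());    // buffer only the matching article
        }
    }

    /** Parses the given XML text in a single pass and returns the filled handler. */
    static PinkArticleFilter run(String xml) {
        PinkArticleFilter handler = new PinkArticleFilter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler;
    }

    public static void main(String[] args) {
        PinkArticleFilter result = run(
                "<enc><article><title>Rose</title><body>often pink</body></article>"
              + "<article><title>Sky</title><body>blue</body></article></enc>");
        System.out.println(result.titles);       // list of all articles comes first
        System.out.println(result.kept.size());  // number of articles whose text was buffered
    }
}
```

The output order (all titles, then the buffered articles) is only known to be achievable in one pass because the transformation was fixed in advance; that is the part a general XSL processor would have to infer.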

From david at megginson.com Thu Feb 4 23:56:16 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:37 2004
Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces for XML)
In-Reply-To: <36B9F233.8B193A44@infinet.com>
References: <01BE5040.5CB976A0@grappa.ito.tu-darmstadt.de> <36B9F233.8B193A44@infinet.com>
Message-ID: <14010.12971.863429.822758@localhost.localdomain>

Tyler Baker writes:

 > If SAX were to make a simple requirement that all strings that
 > represent symbols (like names) were to be interned then things
 > would be a lot cheaper. The same can be said of the DOM as well.

The problem is that Java's own intern is so terribly inefficient that no serious parser writer will use it (most of them have their own, custom interns).

Even then, you wouldn't get any help with the "xmlns:" prefix matching, which is the costliest part. The most efficient way to do namespace processing is directly in the parser (which has to look at every attribute name anyway), but my own tests have shown that a filter layer on top of SAX isn't too bad.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Fri Feb 5 00:18:07 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:37 2004
Subject: Fw: Namespaces does *not* formally introduce "global attributes"
In-Reply-To: <000e01be508a$67213110$5402a8c0@oren.capella.co.il>
References: <000e01be508a$67213110$5402a8c0@oren.capella.co.il>
Message-ID: <14010.14227.857639.249199@localhost.localdomain>

Oren Ben-Kiki writes:

[snip]

 > Additionally, implementers would have been able to easily add a
 > namespace processing module on top of their current XML parsers (a
 > SAX namespace expansion filter, for example, is trivial when
 > implemented this way), _without changing the interfaces_. Future
 > implementations might use better interfaces - such as APIs for
 > accessing just the "namespace part" or the "local part" of an
 > expanded name - but the point is every XML application would go on
 > working as it is, without any changes.

[snip]

 > An obvious consequence of this decision is that implementers were
 > hesitant to implement namespaces this way.

Actually, so far, pretty much everyone seems to have implemented namespaces this way, and it's working like a charm: it's standard in the very popular Perl XML::Parser module (which uses Expat), I wrote my own SAX filter (unreleased) which works this way, John Cowan has released a SAX filter that works this way, and I assume that all RDF and XSL tools built on top of SAX work this way.

I rely quite heavily on the namespace processing in XML::Parser for some of my production-grade deliverables to customers.
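
[Editorial illustration] The "expand the names first" approach discussed in this thread can be sketched as a small scope stack. What follows is a minimal sketch under stated assumptions, not John Cowan's NamespaceFilter, not David Megginson's unreleased filter, and not any SAX API: the class `NamespaceExpander`, its method names, and the exact `uri^local` spelling of the '^' notation are all hypothetical choices made here. A SAX filter would call `push` with the attributes of each start-tag and `pop` at each end-tag.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: rewrites prefixed names as "uri^local" using in-scope declarations. */
final class NamespaceExpander {
    // One map of prefix -> URI per open element; "" is the default namespace.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    /** Enter an element: inherit the parent's bindings, then apply xmlns / xmlns:p attributes. */
    void push(Map<String, String> rawAttributes) {
        Map<String, String> scope = new HashMap<>();
        if (!scopes.isEmpty()) scope.putAll(scopes.peek());
        for (Map.Entry<String, String> a : rawAttributes.entrySet()) {
            String name = a.getKey();
            if (name.equals("xmlns")) {
                scope.put("", a.getValue());
            } else if (name.startsWith("xmlns:")) {
                scope.put(name.substring("xmlns:".length()), a.getValue());
            }
        }
        scopes.push(scope);
    }

    /** Leave the current element, dropping its declarations. */
    void pop() {
        scopes.pop();
    }

    /** Element names use the default namespace when they have no prefix. */
    String expandElement(String rawName) {
        return expand(rawName, true);
    }

    /** Unprefixed attribute names are left alone: the default namespace does not apply to them. */
    String expandAttribute(String rawName) {
        return expand(rawName, false);
    }

    private String expand(String rawName, boolean useDefault) {
        int colon = rawName.indexOf(':');
        String prefix = colon < 0 ? "" : rawName.substring(0, colon);
        String local = rawName.substring(colon + 1);
        if (colon < 0 && !useDefault) return local;
        Map<String, String> scope = scopes.peek();
        String uri = scope == null ? null : scope.get(prefix);
        if (uri == null || uri.isEmpty()) {
            if (colon < 0) return local;   // no default namespace in scope
            throw new IllegalArgumentException("unbound prefix: " + prefix);
        }
        return uri + "^" + local;
    }

    public static void main(String[] args) {
        // Simulates: <z xmlns='http://one.org' xmlns:two='http://two.org'><two:y2 id='1'/></z>
        NamespaceExpander ns = new NamespaceExpander();
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("xmlns", "http://one.org");
        attrs.put("xmlns:two", "http://two.org");
        ns.push(attrs);
        System.out.println(ns.expandElement("z"));
        ns.push(new HashMap<>());
        System.out.println(ns.expandElement("two:y2"));
        System.out.println(ns.expandAttribute("id"));
        ns.pop();
        ns.pop();
    }
}
```

Because the result is still a single string per name, an application written against a name-based interface keeps working unchanged, which is the property argued for above. The scan of every attribute for `xmlns` declarations is the per-element cost the thread refers to.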

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Fri Feb 5 00:21:24 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:37 2004
Subject: A weaker XSL?
In-Reply-To: <19990204224703222.AAA277@ashbury.weblogic.com@lawton>
References: <19990204162131Z366980-4903+14@calum.csclub.uwaterloo.ca> <36B9D8AA.7B2CFB4B@prescod.net> <19990204224703222.AAA277@ashbury.weblogic.com@lawton>
Message-ID: <14010.14447.929939.629443@localhost.localdomain>

Peter Seibel writes:

 > >How do I build a table of contents that is output BEFORE the
 > >document without reading the whole input before I start to output?
 >
 > Buffer the rest of the document. Or make two passes.

In general, however, you cannot know how many passes you need to make, since one query might depend on the results of another.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From spreitze at parc.xerox.com Fri Feb 5 00:21:30 1999
From: spreitze at parc.xerox.com (spreitze@parc.xerox.com)
Date: Mon Jun 7 17:08:37 2004
Subject: CORBA's not boring yet. / XML in an OS?
In-Reply-To: <30649320C177D111ADEC00A024E9F297169F33@exchange-server.dega.com>
Message-ID: <99Feb4.161915pst."834439"@idea.parc.xerox.com>

> This brings up a related question regarding XML and interchanging data
> between web servers. Is there something like IIOP for this? I mean an
> overlaying protocol to establish, manage and break bi-direction XML traffic.
> ...
> I know that ...
> But it seems to me that this approach just utilizes normal HTTP, which seems
> to be ill-designed in the way of bi-directional data flow. Is someone
> working on a more elaborate mechanism for this? Do we need a XIOP?

The HTTP-NG effort (http://www.w3.org/Protocols/HTTP-NG/) is working on factoring HTTP-NG into three layers, the lower two of which seem to be what you're asking for. I think you should be asking for something less elaborate, rather than more elaborate, than HTTP/1 --- and the lower two layers of HTTP-NG should give you that.

Mike

From pvelikho at cs.ucsd.edu Fri Feb 5 00:21:34 1999
From: pvelikho at cs.ucsd.edu (Pavel Velikhov)
Date: Mon Jun 7 17:08:37 2004
Subject: A weaker XSL?
References: <19990204162131Z366980-4903+14@calum.csclub.uwaterloo.ca> <19990204224703222.AAA277@ashbury.weblogic.com@lawton> <36BA2516.204F50F9@manhattanproject.com>
Message-ID: <36BA32FD.85F2D000@cs.ucsd.edu>

Clark Evans wrote:
>
> Peter Seibel wrote:
> >
> > I'm not arguing that you can implement XSL so it uses no heap, just that
> > you could implement it on top of SAX rather than DOM. Or am I still missing
> > something?
>
> I think you are echoing my belief.
>
> Question: Is XSL defining "style" instructions
> or "composition" instructions.
>
> Things like sorting, re-arranging, table-of-contents
> generation, etc. are really large processing instructions
> that are more along the line of *what* to process, rather
> than *how* the information should be presented. Things like
> this could be moved into XQL or some other transformation
> language, leaving XSL a more pure "style" oriented
> specification.

But XSL is really turning into a query language for XML, not just a stylesheet language. I completely agree that some lightweight language is needed to do the basic presentation-oriented things. Then the implementation could be made efficient: such things as intelligent buffering of the input, lazy evaluation of the output document, and optimization are really hard to do for a full-blown query language, but are doable for a simple one.

> Thus XSL wouldn't be *generating* a table of contents,
> it would only let you choose if you want to display it,
> and if it is displayed, how it is displayed, in green
> ink or red, bold or italic, Aa1i style or 1.1.1.1
> style, etc.
>
> By doing this, a weaker XSL would have a much more
> clearly defined role, would be less subject to
> "feature creep", and could be implemented on top
> of SAX instead of assuming (and requiring) a full
> DOM implementation.

It will probably be orders of magnitude more efficient too. And it will have a clearer and more understandable role. IMHO nobody wants to learn a bunch of highly expressive and pretty much general-purpose languages for XML processing.

> Just $.02
> ;) Clark Evans

Pavel Velikhov

From tyler at infinet.com Fri Feb 5 00:25:04 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:37 2004
Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces for XML)
References: <01BE5040.5CB976A0@grappa.ito.tu-darmstadt.de> <36B9F233.8B193A44@infinet.com> <14010.12971.863429.822758@localhost.localdomain>
Message-ID: <36BA39DE.761F3281@infinet.com>

David Megginson wrote:

> Tyler Baker writes:
>
> > If SAX were to make a simple requirement that all strings that
> > represent symbols (like names) were to be interned then things
> > would be a lot cheaper. The same can be said of the DOM as well.
>
> The problem is that Java's own intern is so terribly inefficient that
> no serious parser writer will use it (most of them have their own,
> custom interns).

As of JDK 1.1.6 things are not so bad, and Java 2 is a bit better, as interned Strings are managed under the hood using weak references. It could be made better in the JDK, though. I suspect that if they made a real effort in the Java 2 JVM they could make string interns at least twice as fast as they currently are. Nevertheless, string interning is a one-time cost, so let's put that in perspective here.

> Even then, you wouldn't get any help with the "xmlns:" prefix
> matching, which is the costliest part. The most efficient way to do

Very true (ouch, ouch, ouch)...

> namespace processing is directly in the parser (which has to look at
> every attribute name anyway), but my own tests have shown that filter
> layer on top of SAX isn't too bad.

Unfortunately, as is the case with all XML or XSL benchmarks, the test data can vary enormously. If you have documents that have few elements with attributes (except of course namespace attributes), then things probably will not be so bad. However, if you have lots of attributes in elements, then you need to check every single attribute to see if it starts with "xmlns:" (ouch, ouch, ouch).

So I suppose we should now encourage document designers to model data only as character content in elements and only use attributes for IDs and namespace declarations. For types like a rectangle, I think using attributes makes a lot more sense in the general case, but in the presence of "Namespaces in XML" I would change things from:

    <Rectangle x="0" y="1" width="59" height="23"/>

to:

    <myprefix:Rectangle xmlns:myprefix="YabbaDabbaDoo">
      <myprefix:x> 0 </myprefix:x>
      <myprefix:y> 1 </myprefix:y>
      <myprefix:width> 59 </myprefix:width>
      <myprefix:height> 23 </myprefix:height>
    </myprefix:Rectangle>

The really sad thing about this is that there tends to be a feeling among a lot of people that meaningful prefixes do not matter at all. If XML is ever going to be editable by an average internet user for some common tasks, meaningful prefixes do matter.

Tyler

From tbray at textuality.com Fri Feb 5 00:34:44 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:37 2004
Subject: Fw: Namespaces does *not* formally introduce "global attributes"
Message-ID: <3.0.32.19990204162458.00b5d650@pop.intergate.bc.ca>

At 07:16 PM 2/4/99 -0500, David Megginson wrote:
>Actually, so far, pretty much everyone seems to have implemented
>namespaces this way, and it's working like a charm: it's standard in
>the very popular Perl XML::Parser module (which uses Expat)

Seconded. Here's the example I use to teach namespaces to Perl programmers; they get it instantly:

    use XML::Parser;

    $xml = "<z xmlns='http://one.org' xmlns:two='http://two.org'><y1 /><two:y2 /></z>";
    $p = new XML::Parser(Namespaces => 1, Handlers => { Start => \&STag });
    $p->parse($xml);

    sub STag {
        my ($expat, $type) = @_;
        my $ns = $expat->namespace($type);
        print "Element type $type is from namespace $ns\n";
    }

=========output==========================================
Element type z is from namespace http://one.org
Element type y1 is from namespace http://one.org
Element type y2 is from namespace http://two.org

From dave at userland.com Fri Feb 5 00:34:45 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun 7 17:08:37 2004
Subject: CORBA's not boring yet. / XML in an OS?
In-Reply-To: <99Feb4.161915pst."834439"@idea.parc.xerox.com>
References: <30649320C177D111ADEC00A024E9F297169F33@exchange-server.dega.com>
Message-ID: <3.0.6.32.19990204163649.00f40e20@scripting.com>

Gotta check it out! XML-RPC.. What CORBA wants to be. ;->

http://www.xmlrpc.com/

Dave

From david at megginson.com Fri Feb 5 00:38:10 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:37 2004
Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces for XML)
In-Reply-To: <36BA39DE.761F3281@infinet.com>
References: <01BE5040.5CB976A0@grappa.ito.tu-darmstadt.de> <36B9F233.8B193A44@infinet.com> <14010.12971.863429.822758@localhost.localdomain> <36BA39DE.761F3281@infinet.com>
Message-ID: <14010.15436.198339.22480@localhost.localdomain>

Tyler Baker writes:

 > As of JDK 1.1.6 things are not so bad and Java 2 is a bit better as
 > interned Strings are under the hood managed using Weak References.
 > It could be made better in the JDK though. I suspect if they made
 > a real effort in the Java 2 JVM they could make string interns at
 > least twice as fast as things currently are. Nevertheless, string
 > interning is a one time cost so let's put that in perspective here.

No, unfortunately it's much worse -- the parser has to re-intern *every* element name and attribute name during the parse, so if I have 6,000 attributes named 'foo', I have to invoke intern 6,000 times (just for those attributes). When I was writing AElfred, I got something between a 10-30% speedup (I cannot remember the number) by dropping Java's intern and writing my own. A bad intern is a killer for an XML parser.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From tbray at textuality.com Fri Feb 5 00:59:40 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:38 2004
Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces for XML)
Message-ID: <3.0.32.19990204165540.00b57da0@pop.intergate.bc.ca>

At 07:35 PM 2/4/99 -0500, David Megginson wrote:
>Tyler Baker writes:
>
> > As of JDK 1.1.6 things are not so bad and Java 2 is a bit better
>
>No, unfortunately it's much worse -- the parser has to re-intern
>*every* element name and attribute name during the parse, so if I have
>6,000 attributes named 'foo', I have to invoke intern 6,000 times
>(just for those attributes). When I was writing AElfred, I got
>something between a 10-30% speedup

Lark too; and I just do lookup with binary search, nothing fancy, and the difference was huge. What is Java actually doing I wonder? -T.
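
[Editorial illustration] A parser-side symbol table of the kind alluded to above can be sketched as follows. This is an assumption-labelled example and not the actual code of AElfred or Lark (Tim Bray says Lark uses binary search; this sketch uses a hash table with linear probing instead). The class name `SymbolTable` and its `intern(char[], int, int)` method are hypothetical. The idea it shows is looking a name up directly in the parser's character buffer, so that a `String` is allocated only the first time a given name is seen.

```java
/** Hypothetical symbol table: one shared String per distinct name, looked up from a char buffer. */
final class SymbolTable {
    private String[] slots = new String[64];  // open addressing; length is always a power of two
    private int size;

    /** Returns the canonical String for buf[start .. start+length), creating it on first sight. */
    String intern(char[] buf, int start, int length) {
        int mask = slots.length - 1;
        for (int i = hash(buf, start, length) & mask; ; i = (i + 1) & mask) {
            String s = slots[i];
            if (s == null) {
                s = new String(buf, start, length);   // the only allocation, once per distinct name
                slots[i] = s;
                if (++size * 2 > slots.length) grow(); // keep the table at most half full
                return s;
            }
            if (matches(s, buf, start, length)) return s;
        }
    }

    // Same formula as String.hashCode(), so grow() can rehash with s.hashCode().
    private static int hash(char[] buf, int start, int length) {
        int h = 0;
        for (int i = 0; i < length; i++) h = 31 * h + buf[start + i];
        return h;
    }

    private static boolean matches(String s, char[] buf, int start, int length) {
        if (s.length() != length) return false;
        for (int i = 0; i < length; i++) {
            if (s.charAt(i) != buf[start + i]) return false;
        }
        return true;
    }

    private void grow() {
        String[] old = slots;
        slots = new String[old.length * 2];
        int mask = slots.length - 1;
        for (String s : old) {
            if (s == null) continue;
            int i = s.hashCode() & mask;
            while (slots[i] != null) i = (i + 1) & mask;
            slots[i] = s;
        }
    }

    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();
        char[] buf = "foo bar foo".toCharArray();
        String first = table.intern(buf, 0, 3);
        String again = table.intern(buf, 8, 3);
        System.out.println(first == again);                  // true: the same object is returned
        System.out.println(first == table.intern(buf, 4, 3)); // false: "bar" is a different name
    }
}
```

With a table like this, 6,000 occurrences of an attribute named 'foo' cost 6,000 hash-and-compare lookups but only one allocation, and names can afterwards be compared by reference. Whether that beats `String.intern()` on a given JVM is something only measurement can settle.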
From tyler at infinet.com Fri Feb 5 01:39:10 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:38 2004 Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces for XML) References: <01BE5040.5CB976A0@grappa.ito.tu-darmstadt.de> <36B9F233.8B193A44@infinet.com> <14010.12971.863429.822758@localhost.localdomain> <36BA39DE.761F3281@infinet.com> <14010.15436.198339.22480@localhost.localdomain> Message-ID: <36BA4B75.2034265D@infinet.com> David Megginson wrote: > Tyler Baker writes: > > > As of JDK 1.1.6 things are not so bad and Java 2 is a bit better as > > interned Strings are under the hood managed using Weak References. > > It could be made better in the JDK though. I suspect if they made > > a real effort in the Java 2 JVM they could make string interns at > > least twice as fast as things currently are. Nevertheless, string > > interning is a one time cost so lets put that in perspective here. > > No, unfortunately it's much worse -- the parser has to re-intern > *every* element name and attribute name during the parse, so if I have > 6,000 attributes named 'foo', I have to invoke intern 6,000 times The implementation I have in an XML Parser I wrote only calls intern on strings which are not cached. I do this mostly to reduce calls to new String().intern(). It sure would have been nice if there were a static intern method in the String class that accepted a character array and used the contents to fetch an interned string (or else create a new one if the string is not present in the VM). > (just for those attributes).
When I was writing AElfred, I got > something between a 10-30% speedup (I cannot remember the number) by > dropping Java's intern and writing my own. A bad intern is a killer > for an XML parser. Yah, it is really bad if you don't do any sort of String caching as well. Sorry to confuse you here with what I meant by interning strings... Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Feb 5 01:46:56 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:38 2004 Subject: SAX, Java, and Namespaces (was Re: Restricted Namespacesfor XML) References: <3.0.32.19990204165540.00b57da0@pop.intergate.bc.ca> Message-ID: <36BA4D4E.1E8B1A10@infinet.com> Tim Bray wrote: > At 07:35 PM 2/4/99 -0500, David Megginson wrote: > >Tyler Baker writes: > > > > > As of JDK 1.1.6 things are not so bad and Java 2 is a bit better > > > >No, unfortunately it's much worse -- the parser has to re-intern > >*every* element name and attribute name during the parse, so if I have > >6,000 attributes named 'foo', I have to invoke intern 6,000 times > >(just for those attributes). When I was writing AElfred, I got > >something between a 10-30% speedup > > Lark too; and I just do lookup with binary search, nothing fancy, > and the difference was huge. What is Java actually doing I wonder? -T. As I said before things have improved. intern() is now native so there is really no excuse that I can think of why it should still be slow (it is not as slow as it used to be but calling it has roughly half the cost of calling new() now). 
Nevertheless, the String class should have had a static intern() method a long time ago that accepts a character array. Boy would it have been convenient... Tyler From bckman at ix.netcom.com Fri Feb 5 02:07:15 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:08:38 2004 Subject: Component Markup Language Message-ID: <000f01be50ac$0c544980$04addccf@ix.netcom.com> >> One cannot rely on "HTML 5" as a solution to this problem >> by reading >> current Working Draft: >> http://www.w3.org/TR/1998/WD-html-in-xml-19981205/ The current working draft, as it makes clear, is just part of a series of documents. This WD will lead to a simple document providing XML namespaces and XML DTDs for XHTML. I believe the current working group will also be looking at the whole of forms and interfaces, so if people have any requirements for an HTML interface now is the time to make them known. I know that many members of the current W3C HTML WG monitor this list, and as XHTML is XML I would hope that such discussion would be proper on this list. Discussion of HTML can also be carried out on the W3C public mailing lists. Frank ----- Original Message ----- From: Mark Birbeck <Mark.Birbeck@iedigital.net> To: <master@rosethorns.com>; Mark Birbeck <Mark.Birbeck@iedigital.net>; <xml-dev@ic.ac.uk> Sent: Thursday, February 04, 1999 4:01 PM Subject: RE: Component Markup Language . The next >> generation of HTML will not provide any such facilities.
Information >> about the nature of the next generation HTML can be obtained >> by reading >> current Working Draft: >> http://www.w3.org/TR/1998/WD-html-in-xml-19981205/ > >Can people please be specific, so I can see if I have lost the plot >here: > >If I was to propose a component-XML DTD that began with a tag called >HTML, then others called HEAD and BODY, and it had definitions for text >fields, buttons, drop-boxes, images, tables and more, as well as being >able to call up external programs and all the rest of it; what >ADDITIONAL features is everyone saying they would want to be present in >*their* component-XML that are not in 'mine' (and I really think the >lack of graphics on a button is insufficient justification for spending >the next year writing an entire XHTML!) > >I'm not saying that the user-interface has to then be a web browser - >make it Java if you want. All I'm saying is there is a very good syntax >in place for defining user interfaces - why not use it? > >Mark Birbeck >Managing Director >Intra Extra Digital Ltd. >39 Whitfield Street >London >W1P 5RE >w: http://www.iedigital.net/ >t: 0171 681 4135 >e: Mark.Birbeck@iedigital.net > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Fri Feb 5 02:11:04 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:08:38 2004 Subject: Component Markup Language Message-ID: <000d01be50ac$07f13d80$04addccf@ix.netcom.com> >HTML 4 isn't quite up to providing a full GUI, even with the DOM. For >example you can't do menu's. That's news to me! >You can't do buttons with images on them (I'm >not talking about images that are buttons), What about the <button> element? <button> <img src="stop.gif"> <br>Stop!! </button> will give as good a button with an image as VB or C++. >you can't do tabbed dialogs >(well, you can, but it's non-trivial), It's almost trivial using CSS. Use the z-index property. >I'm sure there are other things. Of >course you could add these things into HTML 5, but I don't think it's worth >going down that road. XHTML (I don't think that I am selling any state secrets here) will include a complete rewrite of 'forms', including new interfaces. I for one, and I'm sure other members of the HTML WG, would be very interested in learning what extra needs people have for the GUI, or for information handling on the client side. Frank (speaking on my own behalf) ----- Original Message ----- From: Matthew Sergeant (EML) <Matthew.Sergeant@eml.ericsson.se> To: <xml-dev@ic.ac.uk> Sent: Thursday, February 04, 1999 12:18 PM Subject: RE: Component Markup Language >> -----Original Message----- >> From: Mark Birbeck [SMTP:Mark.Birbeck@iedigital.net] >> >> Great idea. Can I suggest we call it HTML? >> >> (OK I know I've cracked that one already - it's the end of the week.
But >> really, if you want an XML specification for a user interface, surely >> HTML 4.0 is the one to choose. Then you could use a 'cool browser' that >> has an "XML GUI Interpreter" built in - like, well, IE4 and Netscape 4.) >> >HTML 4 isn't quite up to providing a full GUI, even with the DOM. For >example you can't do menu's. You can't do buttons with images on them (I'm >not talking about images that are buttons), you can't do tabbed dialogs >(well, you can, but it's non-trivial), I'm sure there are other things. Of >course you could add these things into HTML 5, but I don't think it's worth >going down that road. > >Matt. >-- >http://come.to/fastnet >Perl on Win32, PerlScript, ASP, Database, XML >GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V >!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ > > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Fri Feb 5 02:43:40 1999 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:08:38 2004 Subject: Component Markup Language References: <A26F84C9D8EDD111A102006097C4CD0D054961@SOHOS002> Message-ID: <36BA5A3F.180A@hiwaay.net> Mark Birbeck wrote: > > If I was to propose a component-XML DTD that began with a tag called > HTML, then others called HEAD and BODY, and it had definitions for text > fields, buttons, drop-boxes, images, tables and more, as well as being > able to call up external programs and all the rest of it; what > ADDITIONAL features is everyone saying they would want to be present in > *their* component-XML that are not in 'mine' (and I really think the > lack of graphics on a button is insufficient justification for spending > the next year writing an entire XHTML!) Been there. Done that. Got the T-shirt (says, "More Meta Than Thou!") See the US Navy MID (Metafile for Interactive Documents) project. There are probably references to it from the OASIS page. I don't know if the original designs are still out there. I believe parts of it mutated into the ISMID, which is an ISO project. Check with Dave Cooper and Mike Anderson. When I last looked, there wasn't much left of the original MID, but as a design project, it occupied some serious people for a few years. It was implemented several times by different teams, worked fine, and basically was condemned as clunky and "not the way I would do it". OTOH, it met the requirements and did the job. IOW, what you want to do works fine but you have to find a market for it.
len bullard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Fri Feb 5 02:46:04 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:38 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <36B92536.D5BF3AA0@prescod.net> Message-ID: <003701be50b0$f80fe510$d3228018@jabr.ne.mediaone.net> What I am talking about is an interface to a database. The question is: should we use an interface such as ODBC, JDBC or a proprietary interface (each vendor has one), and/or the DOM. It has been suggested that when the database is to represent an XML repository of some sort, the DOM may be one appropriate interface onto this repository. In particular object database vendors such as ODI and Poet who are offering XML databases may choose to offer a DOM interface onto the database. Relational database vendors such as Oracle may also choose to offer DOM interfaces onto the database. It has been stated that this is a misguided approach, and that what is needed is object orientation. My response is that this is comparing apples to motorcycles. Modern systems are described at different levels and this is a useful way to organize this discussion which is in danger of becoming incoherent. Many current systems are described in terms of three broad tiers: a client tier, a middleware tier and a server/database tier. When these systems are designed, there exists a need to interface the tiers. 
When business rules exist on a middleware tier, and these rules are designed as a set of components or objects, these 'high level' objects which contain domain specific knowledge, have a need to communicate with the database tier. Frequently a set of data access objects is defined to facilitate interface with the database. We are talking about different levels. Not for a moment am I arguing with the need for business rules. The question is how these object rules should access the database. In modern middleware/transaction processing systems, the idea is to 'dumb down' the database and place the knowledge in the business rule layer (which is above the data access layer). Two tier systems mix the business rules with the data, this leads to inefficiency. A three tier system separates the business rules from the data. Yet low level API's such as ODBC or vendor specific API's tie the business rules to the particular database engine. With proper object orientation, and use of encapsulation, it is best to uncouple the business rules from the particular database. In other cases, a data access object layer is used to couple a 'high level' or scripting language to a low level binary wire protocol. For example, Microsoft offers ADO with its Recordset. I suggest that the DOM is no less capable of modelling data than a Recordset, where a Recordset is optimized to deliver tabular data, the DOM is optimized to deliver tree data. Perhaps you don't like the DOM and have a better tree API ... great! let's see it and then if everyone agrees that it is indeed better, it can be DOM 2. The concept of the 'object repository' or 'enterprise repository' includes the idea that code and data can be mixed together in a large bucket. These systems tend to have difficulties in high transaction volume applications. They have tended to be overly complex and hence overly expensive for this reason. 
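As a rough illustration of the Recordset/DOM comparison made a paragraph earlier, here is a minimal sketch of a "business rule" that reads its data through the DOM, much as it would loop over the rows of a recordset. The account/entry element names, the cents attribute and the class name are invented for the example, and parsing a string stands in for whatever a repository would actually hand back.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical business rule that uses the DOM as its data access layer. */
final class AccountRules {
    /** Sums the "cents" attribute of every <entry> element in the document. */
    static int balanceInCents(String accountXml) {
        try {
            // Stand-in for "fetch a tree from the repository".
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(accountXml)));
            // Elements and attributes play the part of rows and columns.
            NodeList entries = doc.getElementsByTagName("entry");
            int total = 0;
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                total += Integer.parseInt(entry.getAttribute("cents"));
            }
            return total;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a usable account document", e);
        }
    }
}
```

The rule knows about accounts and entries but nothing about which engine stored them, which is the decoupling argued for here.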
At the same time the proper 3 tier web application (browser client, HTTP server/middleware, back-end database) has really taken off in the last couple of years (ask Dell :-)). By keeping the business rules layer distinct from the data access layer, the system can be more easily organized and more easily modified. This leads to higher reliability and lower cost. These are good things. So let's compare apples to apples. Which data access API do you wish to use? This API is specifically required to be used by business rules to access data stores. Paul Prescod wrote: > > Right. XML is a serialization. The DOM is an abstraction of a > serialization. Not an abstraction over your data. With objects, the data and code are tied together; we have the idea of a 'live' object and a persisted object. The idea of activating a persisted object means that the persistent object data (e.g. a file) is parsed to restore object state. A DOM object isn't so much an abstraction of a serialization as it is a live 'bucket' of XML data (it is a simple data access object). Business rules operate on data objects in the same way that a JavaScript program can use a DOM object to process some information. > If your "problem" is > representing debit card bank accounts the proper abstraction over that is > "bank account" or "currency account". The *wrong* abstraction is "elements > and attributes." Comparing apples to apples, the DOM has elements and attributes, and a Recordset has rows and columns. Most bank accounts in the world today are well represented as rows and columns. There are times when the slightly more complex concept of elements and attributes has a better impedance match to the data being modelled than rows and columns. > > XML is very simple. "All the world's data" is very complex. That's why we > need XLink, RDF, HyTime and a bunch of other stuff.
If your API to "your > data" is simply the DOM then you are temporarily hiding its complexity > behind a simple layer that can *NOT* express its "linkedness", its complex > class relationships, is geographical 2D-ness etc. I don't know your data. > I don't know what makes it complex but if your job is interesting it IS > complex and the DOM does not help you to manage that complexity. That's right, that's why *I* need to model my domain using *my* business rules. And assuming I don't want to code my own database, I need an API that allows me to store my stuff. > > > It is a big mistake to assume that XSL is only to be > used for 'styling for > > the Web or print'. DSSSL as we know is based upon or employs > Scheme. Scheme > > is a full fledged programming language, a dialect of LISP. XSL > does not have > > the Scheme counterpart to DSSSL, rather it is itself its own programming > > language (albeit currently simplistic). > > XSL is not a programming language according to the Turing/Church > definition. Perhaps not yet, but if I want to automate transforms, XSL or the transformation language subset 'XTL' is a leap in the right direction. A large part of computer programming consists of interfacing one API to another. I'm not saying that XSL helps with this at all but pointing out that transformations and impedance matching is an important task. If we have the ability to express transforms directly this greatly reduces the need to do traditional coding and bit twiddling. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Fri Feb 5 03:27:26 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:38 2004 Subject: HDBMS vs RDBMS (Was: Re: Storing Lots of Fiddly Bits (was Re: What is XML for?)) References: <003701be50b0$f80fe510$d3228018@jabr.ne.mediaone.net> Message-ID: <36BA643D.BC1AA23A@manhattanproject.com> Jonathan, First, I like what you wrote. It makes sense to me. :) "Borden, Jonathan" wrote: > > Comparing apples to apples, the DOM has elements and attributes, and a > Recordset has rows and columns. Most bank accounts in the world today are > well represented as rows and columns. There are times when the slightly more > complex concept of elements and attributes has a better impedance match to > the data being modelled than rows and columns. This is, in essence, the debate of the 70's between the hierarchical model (HDBMS) and the relational model (RDBMS). The relational people "won", in part, because they had a mathematical theory which formally defined how their database works. I see MURATA Makoto's work as being the mathematical formalism required to explain how a hierarchical database would work. This to me is exciting. If anything, I would say that any *reasonable* database in the future must handle both, and what would be wonderful to see is a mathematical formalism that allowed both perspectives to work in a complementary fashion. Already, relational databases are adding hierarchical features; witness the "CONNECT BY" clause in Oracle. And, the hierarchical people are busily adding relational features (XML Link).
I think the problem is that the data needs to be both viewed as a set of relations _and_ as a hierarchy. I feel that it will be tempting to "toss out" relational theory in favor of hierarchical databases. I think the true solution will involve some sort of "DUAL" which allows for a gateway between the "world of sets" and the "world of trees". Perhaps objects provide the language necessary to unify these two different pictures of information. > Perhaps not yet, but if I want to automate transforms, XSL or the > transformation language subset 'XTL' is a leap in the right direction. A > large part of computer programming consists of interfacing one API to > another. I'm not saying that XSL helps with this at all but pointing out > that transformations and impedance matching is an important task. If we have > the ability to express transforms directly this greatly reduces the need to > do traditional coding and bit twiddling. The XTL is, in effect, the equivalent of SQL for a relational database. An SQL statement takes one or more relations and produces another relation. So, too, XTL will do a similar thing to trees. This is indeed very exciting. After XTL is worked out, then we only have two more transformations left, RDBMS->HDBMS and HDBMS->RDBMS. And neither of these is trivial. Thoughts? Clark Evans From jborden at mediaone.net Fri Feb 5 04:09:53 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:38 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
In-Reply-To: <36B93A9A.EEC43FFB@manhattanproject.com> Message-ID: <000101be50bc$bc0d8930$d3228018@jabr.ne.mediaone.net> Clark Evans wrote: > Therefore, I picture a DOM implementation > using thousands of nested queries to generate > the same tree that a few large queries would have > handled nicely. In this case, the database > engine would not be able to take advantage > of aggregate indexing and elimination > algorithems. In effect, negating the benifits > of having corporate information in a relational > database. *smile* This is a good point, the problem with the relational model is that the few large queries flatten the hierarchy into a table. Another approach is to use data shaping to create hierarchical recordsets. A third approach is to let the database vendors optimize the queries internally when XML is to be returned. This would seem to be the optimal situation and data shaping/OLAP techniques fit here ... remember "Arbor Software" is an OLAP vendor. > > Anyway, I just can't picture using DOM in this > context as an interface to a relational database. > For this case I feel using a stream-oriented solution > on the server with an object-oriented event processing > system on the client seems the better approach. > > Again, the db/OLAP vendors might be able to squeeze some performance out of the interface that we couldn't once the data has left the engine. just a thought. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Fri Feb 5 05:02:36 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:38 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: <000101be50bc$bc0d8930$d3228018@jabr.ne.mediaone.net> Message-ID: <36BA7AAB.160E9065@manhattanproject.com> Jonathan Borden wrote: > Another approach is to use data shaping to create hierarchical recordsets. This is what I did at Ford Motor Co.; it gets messy quickly, even when the table is a snapshot. Nothing like a big hierarchical table with 400 columns. *evil grin* > A third approach is to let the database vendors optimize the queries > internally when XML is to be returned. This would seem to be the > optimal situation and data shaping/OLAP techniques fit here ... > remember "Arbor Software" is an OLAP vendor. Nice. This could be the beginnings of a HDBMS/RDBMS blend. That'd be cool. The danger is thinking that OLAP (HDBMS) is the total answer. It is almost impossible to avoid duplication in a hierarchical database. For transaction processing, I bet that a relational database will still be the best solution. Balance, Clark ----------------------------------------------------- software n : 1. written programs or procedures or rules and associated documentation pertaining to the operation of a computer system. 2. applied philosophy xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Feb 5 05:22:06 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:38 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: <003701be50b0$f80fe510$d3228018@jabr.ne.mediaone.net> Message-ID: <36BA7E38.D2B7E33B@prescod.net> "Borden, Jonathan" wrote: > ...[lots and lots] ... and then ... > So lets compare apples to apples. Which data access API do you wish to use? I don't want an API. I want layers of objects. At the bottom level I have either storage objects or records in a table. At the higher layers I have abstractions over those objects. Using objects I can build a 1-tier, 2-tier, 3-tier or n-tier system. I can have as many levels of business rules and clients and servers as I need. I can also query objects and build object schemas using standardized, multiply-implemented languages. Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matt at veosystems.com Fri Feb 5 08:05:24 1999 From: matt at veosystems.com (matt@veosystems.com) Date: Mon Jun 7 17:08:38 2004 Subject: Component Markup Language In-Reply-To: <4EB3E7E7.1B5AD737@darmstadt.gmd.de> from "Robb Shecter" at Nov 4, 11 02:25:59 pm Message-ID: <19990205080503.30544.qmail@veosystems.com> Robb, You might find an old paper of mine - The User Interface as Document - interesting. It's still available from my old NYU home page at http://cs.nyu.edu/phd_students/fuchs Matthew Fuchs matt@veosystems.com now with CommerceOne > > Hi, > > Has anyone thought about or worked on an markup language to describe a > User Interface in a platform independent way? > > I'd think that this would language would describe: instantiating > components, adding them to containers, and configuring interactions > between them. > > Thanks, > - Robb > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From hpyle at agora.co.uk Fri Feb 5 08:34:22 1999 From: hpyle at agora.co.uk (hpyle@agora.co.uk) Date: Mon Jun 7 17:08:39 2004 Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces for XML) Message-ID: <8025670F.002EF9AD.00@mailhost.agora.co.uk> Tyler wrote, > If XML is ever going to be editable by an average > internet user for some common tasks, meaningful prefixes do matter. Can I ask for some "average user" stuff here, now that the namespace arguments have been rounded up a few times? Although James Clark's page does explain the state of affairs very clearly, it's still talking to the wrong people. What I & my other average-developer colleagues need is more like a "cookbook". Not the intricacies of why flour plus butter behave so strangely; we need to know how to roll the pastry out. Current interest in XML - in the absence of many standardised industry DTDs - means there's a pressing need to explain how to design sensible XML structures. Whilst there are plenty of XML examples appearing on the Web, few of them are well-designed (my pet peeve: people coding a date as "2/21/99"). Namespaces seem to be an essential solution to two problems encountered when designing XML structures:
- how can I distinguish my tags from everyone else's, to avoid confusion (eg: "<my:pastry/>");
- how can I use a common repository of meaningful tags at the same time (eg: "<frozen:pastry><my:sauce iso:litres_quantity="0.5"/></frozen:pastry>")
With a few examples. Finally, we need directions to the local pizza house... -Hugh agora hpyle@agora.co.uk xml-dev: A list for W3C XML Developers.
From Matthew.Sergeant at eml.ericsson.se Fri Feb 5 09:01:27 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:39 2004 Subject: CORBA's not boring yet. / XML in an OS? Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AF6@eukbant101.ericsson.se> > -----Original Message----- > From: Robb Shecter [SMTP:shecter@darmstadt.gmd.de] > > Anyhow, this naturally makes me wonder - could XML and related ideas > like XSL have a place in an operating system? Where would they fit in? > KDE and Gnome could be great playgrounds for trying something like this > out. > Gnome already does. It ships with a libxml if you look at it. I'm not sure how much it's used yet, but the Linux people are really grabbing XML by the horns. Both AbiWord and KWord (2 new word processors) have their native file format as XML. Matt. From Matthew.Sergeant at eml.ericsson.se Fri Feb 5 09:05:10 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:39 2004 Subject: A weaker XSL?
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AF7@eukbant101.ericsson.se> > -----Original Message----- > From: Clark Evans [SMTP:clark.evans@manhattanproject.com] > > Paul Prescod wrote: > > Exactly :) With a weak-XSL you couldn't build the table of > contents BEFORE the entire stream is read, the same with > sorting, you can't do it before the entire stream is read. > > However, it can do other things, like turning a <catalog> > into a <table>, building an index or trailing table of > contents, etc. Thus it still has use for translating > XML into HTML and other output forms, while not requiring > the memory and processing overhead. > I might have misunderstood, but what's the point of XSL in that case then? Why not just use perl to do s/<(\/?)catalog>/<$1table>/ ? (obviously you can be more complete than this, I just wanted a simple example). Matt. From oren at capella.co.il Fri Feb 5 09:32:59 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:08:39 2004 Subject: Fw: Namespaces does *not* formally introduce "global attributes" Message-ID: <007701be50e9$c95670c0$5402a8c0@oren.capella.co.il> Ooops, sent this to the wrong list. Sorry. -----Original Message----- >I asked: >>>Actually, has anyone already written a SAX filter which implements >>>namespaces this way? Is anyone interested? Let's settle this once and for >>>all - "proof by implementation" :-) > >Bill la Forge <b.laforge@jxml.com> wrote: >>Well, there's John Cowan's filter: >> >> http://www.ccil.org/~cowan/XML/NamespaceFilter.java > > >Thanks - and thanks, John. Just the thing.
I hope this lays the issue to >rest... > >Share & Enjoy, > > Oren Ben-Kiki From Matthew.Sergeant at eml.ericsson.se Fri Feb 5 09:36:03 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:39 2004 Subject: CORBA's not boring yet. / XML in an OS? Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AF9@eukbant101.ericsson.se> > -----Original Message----- > From: Dave Winer [SMTP:dave@userland.com] > > Gotta check it out! XML-RPC.. What CORBA wants to be. ;-> > > http://www.xmlrpc.com/ > Isn't RPC using IIOP and DCOM already slow enough without using XML? Why don't MS just support IIOP? Matt. From Matthew.Sergeant at eml.ericsson.se Fri Feb 5 09:48:58 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:39 2004 Subject: Component Markup Language Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136AFA@eukbant101.ericsson.se> > -----Original Message----- > From: Frank Boumphrey [SMTP:bckman@ix.netcom.com] > > >HTML 4 isn't quite up to providing a full GUI, even with the DOM. For > >example you can't do menus. > > That's news to me!
> The HTML WG seem to have a different concept of what menus are to normal application developers. I mean an application that has its own window with its own menus. No back button - the window should be totally defined by the XML. Without using Javascript tricks. I didn't mean the <select> tag - or even the deprecated <menu> tag. I mean drop down menus that can have submenus. > You can't do buttons with images on them (I'm > >not talking about images that are buttons), > > What about the <button> element? > > <button> > <img src="stop.gif"> > <br>Stop!! > </button> > > will give as good a button with an image as VB or C++ > :-) Missed that I guess. > you can't do tabbed dialogs > >(well, you can, but it's non-trivial), > > It's almost trivial using CSS. Use the z layer property. > Yes - almost trivial, but the design of the tabs is up to the designer - it's not provided by the OS. > I'm sure there are other things. Of > >course you could add these things into HTML 5, but I don't think it's > worth > >going down that road. > > X HTML, (I don't think that I am selling any state secrets here) will > include a complete rewrite of 'forms', including new interfaces. > > I for one, and I'm sure that other members of the HTML WG would be very > interested in learning what extra needs people have for the GUI, or for > information handling on the client side. > Well, I think we're really talking apples and oranges here. I want to be able to write applications in XML, with some embedded Javascript, or Perl. I don't want to write forms because then I'm stuck with the whole of the browser (i.e. its buttons and menus, or I can open a new window with no buttons and menus, but then I can't add menus, and the buttons don't integrate as a toolbar, and I can't have a nicely integrated status bar...). Basically I want XUIL - and I can't wait until it's ready for prime time. Matt.
From rbourret at ito.tu-darmstadt.de Fri Feb 5 10:03:20 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:08:39 2004 Subject: Namespace clashes? Message-ID: <01BE50F6.34DD5550@grappa.ito.tu-darmstadt.de> Nikita Ogievetsky wrote: > I thought of namespace prefixes as aliases. If parser is capable > to resolve alias <=> alias definition then it should not matter which > alias one uses to define the same namespace? Correct. But in the question asked, the prefixes resolved to different namespaces. -- Ron Bourret From dave at userland.com Fri Feb 5 12:58:35 1999 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:08:39 2004 Subject: CORBA's not boring yet. / XML in an OS? In-Reply-To: <5F052F2A01FBD11184F00008C7A4A80001136AF9@eukbant101.ericsson.se> Message-ID: <3.0.6.32.19990205050217.00f49460@scripting.com> >>Isn't RPC using IIOP and DCOM already slow enough without using XML? Why don't MS just support IIOP? I don't know why MS does anything, I don't work there. I know why I like XML-RPC. It's simple. You can write a client in a few hours, and a server in a couple of days.
A JavaScript programmer can learn how to do it, today, in less than 24 hours, and in a few weeks, they'll be able to do it in minutes. I have some theories about why this is true: 1. HTTP is everywhere. Does IIOP run on my machine? I'm pretty sure it doesn't. And it's not just about Microsoft, I have Macs too. 2. XML looks like HTML. To someone who has mastered HTML the transition is easy. Performance matters, for sure. But sometimes people overlook that people performance is probably the single most important limiting factor. I can tell you this, CORBA wasn't designed for my mind. It's so complicated, so many new concepts to understand. I even had trouble understanding HTML way back when. Wire protocols can be optimized, but people's brains move at their own rate, rejecting things that appear too complicated, waiting for something that makes sense to them. We're all busy! The problem with COM, CORBA and Apple Events is that each of them were invented before the web exploded and are quite platform specific, and are not understandable to people who do web development. Dave From Daniel.Veillard at w3.org Fri Feb 5 13:37:02 1999 From: Daniel.Veillard at w3.org (Daniel Veillard) Date: Mon Jun 7 17:08:39 2004 Subject: CORBA's not boring yet. / XML in an OS?
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A80001136AF6@eukbant101.ericsson.se>; from Matthew Sergeant (EML) on Fri, Feb 05, 1999 at 10:01:00AM +0100 References: <5F052F2A01FBD11184F00008C7A4A80001136AF6@eukbant101.ericsson.se> Message-ID: <19990205083605.Q25569@w3.org> Quoting Matthew Sergeant (EML) (Matthew.Sergeant@eml.ericsson.se): > > -----Original Message----- > > From: Robb Shecter [SMTP:shecter@darmstadt.gmd.de] > > > > Anyhow, this naturally makes me wonder - could XML and related ideas > > like XSL have a place in an operating system? Where would they fit in? > > KDE and Gnome could be great playgrounds for trying something like this > > out. > > > Gnome already does. It ships with a libxml if you look at it. I'm > not sure how much it's used yet, but the Linux people are really grabbing > XML by the horns. Both AbiWord and KWord (2 new word processors) have their > native file format as XML. I happen to be the author of that library. It's in use at least in the Gnumeric and Gwp, i.e. the spreadsheet and word processor, (gzipped) XML is the native format for those. I know that a couple of other Gnome projects use that library too. XML is very much in the spirit of Linux, and this community will certainly be very active as soon as XML is seen as the best platform and tool independent exchange format. Daniel (not speaking for W3C in that case) -- [Yes, I have moved back to France !] Daniel.Veillard@w3.org | W3C INRIA Rhone-Alpes | Today's Bookmarks : Tel : +33 476 615 257 | 655, avenue de l'Europe | Linux, WWW, rpmfind, Fax : +33 476 615 207 | 38330 Montbonnot FRANCE | rpm2html, Kaffe, http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Amaya.
From CBenedet at Bluestone.com Fri Feb 5 13:44:54 1999 From: CBenedet at Bluestone.com (Benedetto, Christopher) Date: Mon Jun 7 17:08:39 2004 Subject: No subject Message-ID: <9A4DF69E3C5ED211B86400A0C9D17760847234@thor.operations.bluestone.com> unsubscribe xml-dev ============================================ Christopher Benedetto Product Manager Phone: (609) 727-4600 ext. 3024 Fax: (609) 727-5077 Bluestone Software, Inc. 1000 Briggs Road Mt. Laurel, NJ 08054 mailto:cbenedet@bluestone.com http://www.bluestone.com ============================================ From simonstl at simonstl.com Fri Feb 5 13:55:29 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:39 2004 Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces for XML) In-Reply-To: <8025670F.002EF9AD.00@mailhost.agora.co.uk> Message-ID: <199902051355.IAA32750@hesketh.net> At 07:37 AM 2/5/99 +0000, hpyle@agora.co.uk wrote: >Can I ask for some "average user" stuff here? Now that the namespace >arguments have been rounded up a few times. > >Although James Clark's page does explain the state of affairs very clearly, >it's still talking to the wrong people.
What I & my other >average-developer colleagues need is more like a "cookbook". Not the >intricacies of why flour plus butter behave so strangely; we need to know >how to roll the pastry out. I've been attempting to cover namespaces intelligibly in my latest books, though none of them have appeared yet. Given the controversy, it's not surprising that most 'explanations' are directed at the weirdness and not the simplicity. I may give it a crack and attempt to post a tutorial when I finish these last two books... >Current interest in XML - in the absence of many standardised industry DTDs >- means there's a pressing need to explain how to design sensible XML >structures. Whilst there are plenty of XML examples appearing on the Web, >few of them are well-designed (my pet peeve: people coding a date as >"2/21/99"). Namespaces seem to be an essential solution to two problems >encountered when designing XML structures: >- how can I distinguish my tags from everyone else's, to avoid confusion >(eg: "<my:pastry/>"); >- how can I use a common repository of meaningful tags at the same time >(eg: "<frozen:pastry><my:sauce >iso:litres_quantity="0.5"/></frozen:pastry>") >With a few examples. Finally, we need directions to the local pizza >house... I'd love to see a 'good examples and best practices'-oriented site for XML, but haven't found one yet, though there are pieces everywhere. David Megginson and Rick Jelliffe's books are close, but both are pre-namespaces as well. Now, about that pizza house... Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com
From Mark.Birbeck at iedigital.net Fri Feb 5 13:57:07 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:39 2004 Subject: Namespaces does *not* formally introduce "global attributes" Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054968@SOHOS002> > Additionally, implementers would have been able to easily add > a namespace > processing module on top of their current XML parsers (a SAX namespace > expansion filter, for example, is trivial when implemented this way), > _without changing the interfaces_. Future implementations > might use better > interfaces - such as APIs for accessing just the "namespace > part" or the > "local part" of an expanded name - but the point is every XML > application > would go on working as it is, without any changes. Reminds me of a question I had a while back: what happens to a perfectly acceptable XML 1.0 document run through an XML parser which has a namespace processing module? This is, after all, valid XML 1.0: <this:is:my:good this:is:an:attribute:called:a1="1" /> (As is: <:::: :::="1" /> ) In terms of the old document run through the new parser, as far as namespaces go this should be no different to: <good a1="1" /> But in the new parsers it will be an error, because, as the spec says, "The namespace prefix, unless it is xml or xmlns, must have been declared ..." It seems that XML namespaces are not backwards compatible with 'old' documents. If this is true, is it explicitly justified anywhere? I haven't come across it. Perhaps it is the intention of the spec that a 'non-conformant' document (i.e., more than one colon in names, etc.)
simply 'drops back' to XML 1.0, rather than being 'failed' by the namespace processor. But this then means you couldn't merge two DTDs in a document - one built with namespaces in mind, and one not. Maybe another approach is that if a prefix is *not* declared using an xmlns tag then the prefix and its local name are to all intents and purposes just one name. This would allow you to mix DTDs, as I mention, but would then mean you don't have 'conforming and non-conforming' documents, you have 'conforming and non-conforming' elements - but it would work. And maybe none of this matters and we're just assuming that XML is 'young' enough for us to catch the problem early, everyone is already starting to conform, and in which case on completion of the namespaces spec we would have to declare XML 1.1, and say that it is *not* backwards compatible. Anyway, I would be interested to know if anyone has seen an explicit reference to this - either saying I'm wrong and namespaces *are* backwards compatible, or justifying why it is OK to not be backwards compatible. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net From paul at prescod.net Fri Feb 5 13:58:19 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:39 2004 Subject: A weaker XSL?
References: <5F052F2A01FBD11184F00008C7A4A80001136AF7@eukbant101.ericsson.se> Message-ID: <36BAF65C.4FFD371B@prescod.net> "Matthew Sergeant (EML)" wrote: > > I might have misunderstood, but what's the point of XSL in that case > then? Why not just use perl to do s/<(\/?)catalog>/<$1table>/ ? > > (obviously you can be more complete than this, I just wanted a > simple example). That's an odd question. I don't see that there are some problems that are "too simple" for XSL. The simpler the problem, the more sense it makes to use XSL. We use XSL because it is: * standardized * declarative * optimizable * can be implemented in the heart of a repository * has many competing implementations We use Python because it is: * flexible * highly extensible * has a powerful standard library I'm told that these are reasons also to use Perl if you can stand it. Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey
From david at megginson.com Fri Feb 5 14:08:49 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:39 2004 Subject: Namespaces does *not* formally introduce "global attributes" In-Reply-To: <A26F84C9D8EDD111A102006097C4CD0D054968@SOHOS002> References: <A26F84C9D8EDD111A102006097C4CD0D054968@SOHOS002> Message-ID: <14010.64038.298058.609899@localhost.localdomain> Mark Birbeck writes: > Reminds me of a question I had a while back: what happens to a perfectly > acceptable XML 1.0 document run through an XML parser which has a > namespace processing module? This is, after all, valid XML 1.0: > > <this:is:my:good this:is:an:attribute:called:a1="1" /> > > (As is: > > <:::: :::="1" /> > > ) Exactly. XML 1.0 remains in force, so for general-purpose XML parsers, namespace processing needs to be an optional feature, enabled at user discretion (as it is in Expat). Dedicated tools for RDF, XSL, etc. will, of course, need namespace processing on. Further to this topic, the following note does appear in the XML 1.0 spec: Note: The colon character within XML names is reserved for experimentation with name spaces. Its meaning is expected to be standardized at some future point, at which point those documents using the colon for experimental purposes may need to be updated. (There is no guarantee that any name-space mechanism adopted for XML will in fact use the colon as a name-space delimiter.) In practice, this means that authors should not use the colon in XML names except as part of name-space experiments, but that XML processors should accept the colon as a name character.
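The "optional feature, enabled at user discretion" point about Expat can be tried out with a short sketch. It assumes Python's standard xml.parsers.expat binding, which is not mentioned in the thread: namespace processing is switched on only when a namespace_separator is passed to ParserCreate. The second test document and its URI are invented for the example.

```python
import xml.parsers.expat


def accepted(document, namespace_processing):
    """Report whether expat parses the document without raising an error."""
    # Passing a separator turns namespace processing on; None leaves it off.
    separator = " " if namespace_processing else None
    parser = xml.parsers.expat.ParserCreate(namespace_separator=separator)
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Mark Birbeck's multi-colon example, plus a document with a declared prefix
# (the URI is a made-up placeholder).
colons = '<this:is:my:good this:is:an:attribute:called:a1="1" />'
declared = '<my:pastry xmlns:my="urn:example:my-kitchen" />'

for doc in (colons, declared):
    print(accepted(doc, False), accepted(doc, True), doc)
```

With namespace processing off, both documents are accepted as plain XML 1.0; with it on, only the document whose prefix is declared gets through.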
In other words, you're allowed to use ':' for non-namespace purposes but you're playing with fire if you do. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From paul at prescod.net Fri Feb 5 14:10:26 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:39 2004 Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces forXML) References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> Message-ID: <36BAF974.6F87E679@prescod.net> hpyle@agora.co.uk wrote: > > Although James Clark's page does explain the state of affairs very clearly, > it's still talking to the wrong people. I don't think that "average developers" need to worry about namespaces. It is quite simple to build powerful, useful applications without them. I mean if you are implementing RDF or XSL then you need them, but short of that, I wouldn't bother. > Namespaces seem to be an essential solution to two problems > encountered when designing XML structures: > - how can I distinguish my tags from everyone else's, to avoid confusion > (eg: "<my:pastry/>"); Actually, this problem is very RARELY encountered. If you are building a typical one-organization application then what are you doing with "other people's tags" in your documents? I mean, if you are writing typical software, it will choke and die when it comes upon tags it does not know about. > - how can I use a common repository of meaningful tags at the same time > (eg: "<frozen:pastry><my:sauce > iso:litres_quantity="0.5"/></frozen:pastry>") You don't need namespaces for that.
Just create a "liters_quantity" element type and include it in your various DTDs with parameter entities. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey From Matthew.Sergeant at eml.ericsson.se Fri Feb 5 14:12:39 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:08:39 2004 Subject: A weaker XSL? Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136B01@eukbant101.ericsson.se> > -----Original Message----- > From: Paul Prescod [SMTP:paul@prescod.net] > > "Matthew Sergeant (EML)" wrote: > > > > I might have misunderstood, but what's the point of XSL in that > case > > then? Why not just use perl to do s/<(\/?)catalog>/<$1table>/ ? > > > > (obviously you can be more complete than this, I just wanted a > > simple example). > > That's an odd question. I don't see that there are some problems that are > "too simple" for XSL. The simpler the problem, the more sense it makes to > use XSL. > I guess what I should have said was "Why not use CSS then". If we're talking about an XSL that doesn't do transformations then it's CSS you should use. The perl example I guess was a bad idea, but I just meant what we seemed to be talking about was tag matching/replacing with programmability. CSS2 covers that. > I'm told that these are reasons also to use Perl if you can stand it. > You missed a smiley there I assume. Matt.
From skshirsa at nortelnetworks.com Fri Feb 5 14:26:58 1999 From: skshirsa at nortelnetworks.com (Shekhar Kshirsagar) Date: Mon Jun 7 17:08:40 2004 Subject: 'How to unsubscribe?' information is attached to every mail. Message-ID: <3.0.32.19990205092431.0072499c@bl-mail2.corpeast.baynetworks.com> In spite of the fact that "how to unsubscribe" information is attached to each and every mail, I don't know why people broadcast their message to unsubscribe. At 08:39 AM 2/5/99 -0500, Benedetto, Christopher wrote: >unsubscribe xml-dev > >============================================ >Christopher Benedetto >Product Manager >Phone: (609) 727-4600 ext. 3024 >Fax: (609) 727-5077 >Bluestone Software, Inc. >1000 Briggs Road >Mt. Laurel, NJ 08054 >mailto:cbenedet@bluestone.com >http://www.bluestone.com >============================================
From dave at userland.com Fri Feb 5 14:31:56 1999 From: dave at userland.com (Dave Winer) Date: Mon Jun 7 17:08:40 2004 Subject: HTML, XML, XML-RPC in one net app Message-ID: <3.0.6.32.19990205063526.00f64140@scripting.com> We're getting close to releasing Frontier 6, and as part of that process we created a demo app that's accessible thru HTML, XML and XML-RPC. The app is now on the air, ready for you to check out and think about and possibly learn from. ***HTML interface First, here's the HTML interface. http://www.mailtothefuture.com/ Please log on, get a password, create a message or two, become familiar with how it works from a user's point of view. You'll definitely want to have a couple of messages in your queue to try out the other examples. ***XML interface Now, thru your web browser, visit these two pages: http://www.mailtothefuture.com/msgcounter.xml http://www.mailtothefuture.com/msgreader.xml$1 There you go, dynamic XML. Now, what's it good for? Not much, because you also want to be able to add a message or delete a message, and for that you need to make a procedure call.
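For readers who have not seen one, an XML-RPC procedure call is just a small XML document sent over HTTP. The sketch below uses Python's standard xmlrpc.client module to build the request body for one of the server's documented procedures, mailToTheFuture.getMessageCount (username, password). The credentials are placeholders, and nothing is sent over the network, since the message does not give the server's XML-RPC endpoint.

```python
import xmlrpc.client

# Placeholder credentials for illustration only.
request_body = xmlrpc.client.dumps(
    ("alice", "secret"),
    methodname="mailToTheFuture.getMessageCount",
)
print(request_body)

# A server does the reverse: decode the body into parameters and method name.
params, method = xmlrpc.client.loads(request_body)
print(method, params)
```

The printed body is a <methodCall> element holding the method name and one <param> per argument, which is why a client can be written in any language that can produce XML and speak HTTP.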
***XML-RPC interface The XML-RPC interface is documented on this page: http://www.mailtothefuture.com/public/techInfo There are five procedures: mailToTheFuture.addMessage (username, password, msgstruct) mailToTheFuture.deleteMessage (username, password, n) mailToTheFuture.getAllMessages (username, password) mailToTheFuture.getMessage (username, password, msgnum) mailToTheFuture.getMessageCount (username, password) With these five procedures you can access all the functionality of the server without coming in thru the HTML interface. ***Next steps We've already got XML-RPC clients running in the following environments: Python, Perl, Java, Frontier, and are close to having a clean interface to JavaScript running in the popular web browsers. Thru this interface, applications can use the W3C DOM or other XML APIs to walk structures on the MTTF server. We're working with a talented UI development team led by Marc Canter, the lead developer of Macromedia Director. To us it's a black-box, we've provided wires into our server, and the designers have already figured out how to hook in. We'll go back over their work when they're ready and create a simple browser-based API for calling into our server, and optimize their interface in the CMS running on the server. ***Beyond Hello World I know that many of you on this list weren't involved in the evolution of system scripting on the Mac, but this is feeling a lot like that, this is a simple demo app that's functional and interesting enough to motivate applications, but simple enough so that it isn't a huge chore. We're beyond the Hello World stage in the XML-RPC world. ***Crossing boundaries And XML-RPC is now meeting one of its other objectives, it crosses platform boundaries. We have XML-RPC servers running in Java, Python, Apache, Perl and Frontier. We want XML-RPC to go everywhere, crossing not just technical boundaries, but connecting the open source communities with commercial developers and system integrators.
The MTTF server can be cloned in all those environments, and as long as it supports the same XML-RPC interfaces, you can swap a Windows NT server for a Linux server, or vice versa. That may be the most revolutionary feature of XML-RPC.

From Michael.Kay at icl.com Fri Feb 5 14:48:47 1999
From: Michael.Kay at icl.com (Michael.Kay@icl.com)
Date: Mon Jun 7 17:08:40 2004
Subject: A weaker XSL?
Message-ID: <93CB64052F94D211BC5D0010A80013310EB2DA@WWMESS3>

>
> How do I build a table of contents that is output BEFORE the document
> without reading the whole input before I start to output?
>

You can't, and that isn't my objective. What you can do, and what I currently do with SAXON, is to perform a single serial pass of the input document that simultaneously generates a TOC and the main rendered document on two separate output streams, then merge these together for presentation. This is far more efficient with a large document than building the entire document tree in memory or scanning it twice.

Mike Kay

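[Editorial sketch, not part of the original thread.] The single-pass approach Mike Kay describes can be illustrated roughly as follows. This is not SAXON code. It is a minimal Python sketch using the standard xml.sax module, and the element names `title` and `p`, the class `TocAndBodyHandler`, and the function `render` are all assumptions invented for illustration. One serial pass writes headings to a TOC stream and rendered text to a body stream; the two streams are merged afterwards, so no document tree is built.

```python
import io
import xml.sax


class TocAndBodyHandler(xml.sax.ContentHandler):
    """Fill a TOC stream and a body stream during one serial pass."""

    def __init__(self, toc_out, body_out):
        super().__init__()
        self.toc_out = toc_out
        self.body_out = body_out
        self.buffer = []        # text chunks of the element being read
        self.section_count = 0

    def startElement(self, name, attrs):
        # Hypothetical vocabulary: <title> and <p> contain only text.
        if name in ("title", "p"):
            self.buffer = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so collect them.
        self.buffer.append(content)

    def endElement(self, name):
        text = "".join(self.buffer).strip()
        if name == "title":
            self.section_count += 1
            self.toc_out.write(f"{self.section_count}. {text}\n")
            self.body_out.write(f"== {text} ==\n")
        elif name == "p":
            self.body_out.write(text + "\n")
        self.buffer = []


def render(xml_text):
    """Parse once, then merge the two streams with the TOC first."""
    toc, body = io.StringIO(), io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), TocAndBodyHandler(toc, body))
    return toc.getvalue() + "\n" + body.getvalue()
```

Calling `render()` on a document with two titled sections yields the numbered TOC lines first, followed by the body. In a real system the two streams would more likely be temporary files than in-memory buffers, which is what keeps memory use flat for large inputs.
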
From hpyle at agora.co.uk Fri Feb 5 14:50:40 1999
From: hpyle at agora.co.uk (hpyle@agora.co.uk)
Date: Mon Jun 7 17:08:40 2004
Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces forXML)
Message-ID: <8025670F.005165A0.00@mailhost.agora.co.uk>

Paul Prescod wrote,
> I don't think that "average developers" need to worry about namespaces.
...
> If you are building a
> typical one-organization application then what are you doing with "other
> people's tags" in your documents?

Maybe my perspective is a little warped. I'm working on healthcare applications in the UK - interoperability will (sometime) become a big deal. :-)

-Hugh
agora
hpyle@agora.co.uk

From paul at prescod.net Fri Feb 5 14:56:51 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:40 2004
Subject: A weaker XSL?
References: <5F052F2A01FBD11184F00008C7A4A80001136B01@eukbant101.ericsson.se>
Message-ID: <36BB012C.9F11672C@prescod.net>

"Matthew Sergeant (EML)" wrote:
>
> I guess what I should have said was "Why not use CSS then". If we're
> talking about an XSL that doesn't do transformations then it's CSS you
> should use.
> The perl example I guess was a bad idea, but I just meant what
> we seemed to be talking about was tag matching/replacing with
> programmability. CSS2 covers that.

I could be wrong, but I don't believe that CSS2 can take XML conforming to one DTD and transform it into XML conforming to another DTD. There is a transformation language called "STTS3" based on CSS syntax but I must admit that I would prefer to see a simple transformation language use a subset of XSL syntax.

> > I'm told that these are reasons also to use Perl if you can stand it.
> >
> You missed a smiley there I assume.

From the Simpsons:

Teenager 1: "Man, that cannonball guy is so cool."
Teenager 2: "Er. Are you being sarcastic?"
Teenager 1: "Uh. Hmm. I can't even tell anymore."

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did,
but she did it backwards and in high heels."
--Faith Whittlesey

From ricko at allette.com.au Fri Feb 5 15:00:21 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:08:40 2004
Subject: New XML Resources page (Re: Using XSL as a Validation Language)
Message-ID: <007801be5118$752645a0$5ef96d8c@NT.JELLIFFE.COM.AU>

From: Robin Cover <robin@isogen.com>

>Hello Rick. Re below: I posted a blurb on my news page.

Thanks Robin.

>I'd also like to have the URL (did I miss it??) so as to track revisions.

The "Chinese XML Now!" site is a project to help developers of Chinese XML Software.
The URL is http://www.ascc.net/xml/

We are pleased to announce a page "XML and SGML Resources" in which we are putting some of the results of our research and development: software, declarations, documentation, tutorials, and technology notes. The URL is

http://www.ascc.net/xml/en/utf-8/resource_index.html

In the initial offering of this page are:

1) Three SGML declarations for Big5
2) The article "Using XSL as a Structure Validation Language"
3) A new article "lineDataWrap: An Element Set for Line-Delimited Records", which describes an element set for handling database "dumps", e.g. comma-delimited lines (CDL).
4) The accompanying "lineDataWrap" DTD.

In the next week we plan to augment this with:

5) A lines->XML transformation software based on the lineDataWrap DTD
6) Updated versions of the "XMLized" ISO public entity sets
7) Misc. XML utilities

Rick Jelliffe

From bckman at ix.netcom.com Fri Feb 5 15:06:58 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:08:40 2004
Subject: Component Markup Language
Message-ID: <005801be5118$e59a1d20$96acdccf@ix.netcom.com>

Matthew wrote:

>>Well, I think we're really talking apples and oranges here. I want to be able to write applications in XML, with some embedded Javascript, or Perl. I don't want to write forms because then I'm stuck with the whole of the browser (i.e. it's buttons and menus, or I can open a new window with no buttons and menus, but then I can't add menus, and the buttons don't integrate as a toolbar, and I can't have a nicely integrated status bar...).
Basically I want XUIL - and I can't wait until it's ready for prime time.<<

The problem here is that to be 'cross platform' one needs a layer between the XML+script, and the platform's API. That essentially means either using Java or a preexisting interface such as a browser or a word processor.

I myself use XML and a VB interface to create what you are trying to achieve. I simply pass my instructions in XML to my VB application. This, of course, is not cross platform.

Frank

----- Original Message -----
From: Matthew Sergeant (EML) <Matthew.Sergeant@eml.ericsson.se>
To: <xml-dev@ic.ac.uk>
Sent: Friday, February 05, 1999 4:48 AM
Subject: RE: Component Markup Language

>> -----Original Message-----
>> From: Frank Boumphrey [SMTP:bckman@ix.netcom.com]
>>
>> >HTML 4 isn't quite up to providing a full GUI, even with the DOM. For
>> >example you can't do menu's.
>>
>> Thats news to me!
>>
> The HTML WG seem to have a different concept of what menus are to
>normal application developers. I mean an application that has it's own
>window with it's own menus. No back button - the window should be totally
>defined by the XML. Without using Javascript tricks. I didn't mean the
><select> tag - or even the deprecated <menu> tag. I mean drop down menus
>that can have submenus.
>
>> You can't do buttons with images on them (I'm
>> >not talking about images that are buttons),
>>
>> What about the <button> element?
>>
>> <button>
>> <img src="stop.gif">
>> <br>Stop!!
>> </button>
>>
>> will give as good a button with an image as VB or C++
>>
> :-) Missed that I guess.
>
>> you can't do tabbed dialogs
>> >(well, you can, but it's non-trivial),
>>
>> It's almost trivial using CSS. Use the z layer property.
>>
> Yes - almost trivial, but the design of the tabs is up to the
>designer - it's not provided by the OS.
>
>> I'm sure there are other things. Of
>> >course you could add these things into HTML 5, but I don't think it's
>> worth
>> >going down that road.
>>
>> X HTML, (I don't think that I am selling any state secrets here) will
>> include a complete rewrite of 'forms', including new interfaces.
>>
>> I for one, and I'm sure that other members of the HTML WG would be very
>> interested in learning what extra needs people have for the GUI, or for
>> information handling on the client side.
>>
> Well, I think we're really talking apples and oranges here. I want
>to be able to write applications in XML, with some embedded Javascript, or
>Perl. I don't want to write forms because then I'm stuck with the whole of
>the browser (i.e. it's buttons and menus, or I can open a new window with no
>buttons and menus, but then I can't add menus, and the buttons don't
>integrate as a toolbar, and I can't have a nicely integrated status bar...).
>Basically I want XUIL - and I can't wait until it's ready for prime time.
>
> Matt.

From simonstl at simonstl.com Fri Feb 5 15:13:25 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:40 2004
Subject: Colonialism, SAX, Java, and Namespaces
In-Reply-To: <36BAF974.6F87E679@prescod.net>
References: <8025670F.002EF9AD.00@mailhost.agora.co.uk>
Message-ID: <199902051513.KAA01453@hesketh.net>

<vent temp="2000">

At 08:00 AM 2/5/99 -0600, Paul Prescod wrote:
>I don't think that "average developers" need to worry about namespaces. It
>is quite simple to build powerful, useful applications without them. I
>mean if you are implementing RDF or XSL then you need them, but short of
>that, I wouldn't bother.

I would really appreciate if someday the people building W3C specs would acknowledge that 'average developers' actually do have to worry about namespaces, notations, parameter entities, include/ignore sections, and trying to read the specs themselves. If they would then take that knowledge and apply it to the specification-writing process, from start to finish, we might be able to move forward with a lot less back-and-forth about what these things are really supposed to mean.

I get the feeling that some of the writers on this list - though _certainly_ not all - view the 'average developer' as some kind of primitive creature that should be shunted aside in the name of progress. This colonialist view (I don't know what else to compare it to - simple elitism seems inadequate) has contributed to the development of a lot of tools that people talk about but very few people understand.
If, instead of treating average developers as know-nothings to be conquered (they don't need to know the details, they just need to use it), we treat average developers as potential contributors, we might move farther along with XML.

Namespaces themselves, in most cases, aren't that hard to use. They do, however, contain the potential for disaster if applied in certain circumstances in certain ways, and understanding that potential (and how to avoid it) is critical information that's needed for anyone building a namespace-aware application or using those sets of tools. Not all of those people are in the core XML community, and not all of them give a damn about XML, but they may need to know what it takes to use namespaces effectively and safely.

</vent>

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From James.Anderson at mecomnet.de Fri Feb 5 15:33:11 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:40 2004
Subject: Fw: Namespaces does *not* formally introduce "global attributes"
References: <3.0.32.19990204162458.00b5d650@pop.intergate.bc.ca>
Message-ID: <36BB105A.8416AD93@mecomnet.de>

Tim Bray wrote:
>
> At 07:16 PM 2/4/99 -0500, David Megginson wrote:
> >Actually, so far, pretty much everyone seems to have implemented
> >namespaces this way, and it's working like a charm: it's standard in
> >the very popular Perl XML:Parser module (which uses Expat)
>
> Seconded.
> Here's the example I use to teach namespaces to perl programmers;
> they get it instantly
>

That they get it instantly is no surprise: the example does not handle the case which the recommendation leaves open to various interpretations. If the example is to be pertinent, please extend it to illustrate the "unqualified attribute name" case, ideally in the presence of both default and explicitly qualified namespaces for the respective element "type". Then please make the recommendation say the same thing (by assertion rather than negation).

> use XML::Parser;
>
> $xml = "<z xmlns='http://one.org'
>            xmlns:two='http://two.org'><y1 /><two:y2 /></z>";
> $p = new XML::Parser(Namespaces => 1,
>                      Handlers => { Start => \&STag });
> $p->parse($xml);
>
> sub STag {
>   my ($expat, $type) = @_;
>   my $ns = $expat->namespace($type);
>   print "Element type $type is from namespace $ns\n";
> }
> =========output==========================================
> Element type z is from namespace http://one.org
> Element type y1 is from namespace http://one.org
> Element type y2 is from namespace http://two.org

From paul at prescod.net Fri Feb 5 15:39:07 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:40 2004
Subject: SAX, Java, and Namespaces (was Re: Restricted NamespacesforXML)
References: <8025670F.005165A0.00@mailhost.agora.co.uk>
Message-ID: <36BB0934.E09CAEF2@prescod.net>

hpyle@agora.co.uk wrote:
>
> Paul Prescod wrote,
> > I don't think that "average developers" need to worry about namespaces.
> ...
> > If you are building a
> > typical one-organization application then what are you doing with "other
> > people's tags" in your documents?
>
> Maybe my perspective is a little warped. I'm working on healthcare
> applications in the UK - interoperability will (sometime) become a big
> deal. :-)

Even so, people have been solving these sorts of problems without namespaces for years (e.g. the HL7 initiative). In order to exchange documents in healthcare it is probably NOT necessary to mix elements from different document types *blindly*. That's all namespaces help with. If you know in advance what element types must be mixed in a document of a particular type then you can just choose names that do not clash (or use a simple prefixing convention).

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did,
but she did it backwards and in high heels."
--Faith Whittlesey

From paul at prescod.net Fri Feb 5 15:43:33 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:40 2004
Subject: Colonialism, SAX, Java, and Namespaces
References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net>
Message-ID: <36BB0D56.EDF427B0@prescod.net>

"Simon St.Laurent" wrote:
>
> I would really appreciate if someday the people building W3C specs would
> acknowledge that 'average developers' actually do have to worry about
> namespaces, notations, parameter entities, include/ignore sections, and
> trying to read the specs themselves.

In my mind, this is just silly. I've never read the "SQL Specs" (have you?) and I certainly don't know everything there is to know about SQL or relational database technology. That doesn't make me a knuckle-dragging neanderthal. It makes me someone who recognizes that he doesn't need to know EVERYTHING about EVERY technology he uses. That's a losing proposition for a working programmer.

I'm also someone who recognizes that it is not appropriate to dumb down everything so that I *CAN* understand it in its entirety. All I ask is that the actual interfaces that I am asked to work with be simplified so that I can follow them. The "interface" to namespaces for a developer is SAX or RDF engine and eventually DOM. *That's* what should be simple.

Namespaces are an infrastructure technology. You use them THROUGH something, like RDF. Without something like RDF they are essentially useless. So unless you think that "RDF" is for the "average developer", namespaces are not for the average developer. When/if something like RDF takes off that situation may change.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did,
but she did it backwards and in high heels."
--Faith Whittlesey

From simonstl at simonstl.com Fri Feb 5 15:55:35 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:40 2004
Subject: Colonialism, SAX, Java, and Namespaces
In-Reply-To: <36BB0D56.EDF427B0@prescod.net>
References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net>
Message-ID: <199902051555.KAA02409@hesketh.net>

At 09:25 AM 2/5/99 -0600, Paul Prescod wrote:
>I'm also someone who recognizes that it is not appropriate to dumb down
>everything so that I *CAN* understand it in its entirety. All I ask is
>that the actual interfaces that I am asked to work with be simplified so
>that I can follow them. The "interface" to namespaces for a developer is
>SAX or RDF engine and eventually DOM. *That's* what should be simple.
>
>Namespaces are an infrastructure technology. You use them THROUGH
>something, like RDF. Without something like RDF they are essentially
>useless. So unless you think that "RDF" is for the "average developer",
>namespaces are not for the average developer. When/if something like RDF
>takes off that situation may change.

Fine, Paul. Keep using the 'dumb down' vocabulary to show everyone what you really think of the 'average developer'. Ignore the fact that (unlike SQL) there are very few resources that do explain these structures in readable English (or other languages, to the best of my knowledge.) Assume that interfaces - themselves barely explained - can cover all this up neatly, though the discussion on this list has indicated repeatedly that those interfaces are not yet up to that task.

Namespaces are all over Internet Explorer.
Millions of people will be encountering them in some form, and for better or worse, that often leads to contagion. Like it or not, these things are going to take off, and we'd do a lot better to accommodate those millions than keep calling them 'dumb.'

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From jborden at mediaone.net Fri Feb 5 15:57:51 1999
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:08:40 2004
Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces forXML)
In-Reply-To: <8025670F.005165A0.00@mailhost.agora.co.uk>
Message-ID: <000201be511f$7ab08f60$d3228018@jabr.ne.mediaone.net>

Healthcare applications have the same issues as other applications, with the added 'benefit' of multiple standards issuers and third party 'interested' organizations. For example elements from the

hl7: namespace
dicom: namespace
hicfa: namespace

might be combined in a single document. Namespaces aren't that difficult to deal with for the developer, but need to be dealt with by organizations which are developing DTDs, Schemas etc. which span standards.

Now, getting the U.S. and U.K. to agree on a set of standards (and namespaces) is its own issue :-) This is where XML and XTL have potential, e.g.

U.S. HL7 2.3 <-> HL7/XML transport <- XTL -> U.K. EDIFACT XML transport <-> EDIFACT

Transformations expressed in XSL/XTL can assist with interface of different messaging systems and integration of different 'flavors' of HL7.
>
> Paul Prescod wrote,
> > I don't think that "average developers" need to worry about namespaces.
> ...
> > If you are building a
> > typical one-organization application then what are you doing with "other
> > people's tags" in your documents?
>
> Maybe my perspective is a little warped. I'm working on healthcare
> applications in the UK - interoperability will (sometime) become a big
> deal. :-)
>

Jonathan Borden
http://jabr.ne.mediaone.net

From clark.evans at manhattanproject.com Fri Feb 5 16:01:50 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:40 2004
Subject: A weaker XSL?
References: <5F052F2A01FBD11184F00008C7A4A80001136B01@eukbant101.ericsson.se>
Message-ID: <36BB1440.661A32C1@manhattanproject.com>

"Matthew Sergeant (EML)" wrote:
> I guess what I should have said was "Why not use CSS then". If we're
> talking about an XSL that doesn't do transformations then it's CSS you
> should use.

I'm not suggesting that the weakened XSL wouldn't do any transformations, only that the transformations it does be based upon a stream rather than upon an object. If this doesn't make sense, then I'd like to hear more.

What I'd rather not see is a "single" language which defines XML->HTML mappings where an intermediate form could increase reusability. Thus,

                    |   -> (XSL) -> HTML
                    |  /
XML -> (XTL) -> XML -> (XSL) -> XML
                    |  \
                    |   -> (XSL) -> PDF?
                    |
DOM, Server         | SAX Client(s)
Side Processing     | Side Processing
                    |
* Ordering          | * Filtering
* Table of          | * Formatting
  Contents          | * Contextual Linking?
* Other "shared     | * Other "individual and
  information       |   preference-oriented
  generating        |   stylistic operations"
  operations"

Mathematically speaking, I'd like to see SAX as a sufficient condition for XSL processing, where I'd like to see a full-blown DOM implementation used when it is a necessary condition for XTL. This way items like a table of contents, sorting, and other commonly used transformations can be separated from the customized, style oriented transformations.

:) Clark

From ricko at allette.com.au Fri Feb 5 16:02:28 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:08:41 2004
Subject: A New Hope (was Re: Storing Lots of Fiddly Bits (was Re: What is XML for?))
Message-ID: <00c201be5120$a6d3e320$5ef96d8c@NT.JELLIFFE.COM.AU>

From: spreitze@parc.xerox.com <spreitze@parc.xerox.com>

>Right! I think a significant part of the problem here is that people are realizing that XML's data model is not as expressive as they'd like. For example, XML's entity structure looks like a semi-labelled graph (vertices are labelled (with entity tags) but edges are not labelled), whereas many other data models (e.g., RDF) let you label both the edges and the vertices.

HyTime lets you label edges too: anyone can use that (to those people who whinge "but it is too hard" I say "CORBA CORBA CORBA": some things are big and difficult, it doesn't mean they are therefore bad).

The question is, should such labelling be part of the language at the lexical level (which XML deals with) or a further layer.
It is the old tradeoff that a general purpose system will (probably) be worse at any specific task than a specific system.

> It seems to me that one plausible way out of this conundrum is for the XML Schema WG to recognize (1) the need for schemas to be written in terms of more expressive data modelling systems, (2) the need to support a variety of encodings of those data models into XML, and (3) the need for a schema to describe a particular encoding (the one desired for the schema at hand) of the schema's data model into XML.

Sounds good. But Mike's comments do betray a wish that XML operated on some other level than the strictly lexical: but it doesn't, except by chance.

Rick

From clark.evans at manhattanproject.com Fri Feb 5 16:07:06 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:41 2004
Subject: A weaker XSL?
References: <93CB64052F94D211BC5D0010A80013310EB2DA@WWMESS3>
Message-ID: <36BB1520.6271C6A@manhattanproject.com>

Michael.Kay@icl.com wrote:
> > How do I build a table of contents that is output BEFORE the document
> > without reading the whole input before I start to output?
> You can't, and that isn't my objective. What you can do, and what I
> currently do with SAXON, is to perform a single serial pass of the input
> document that simultaneously generates a TOC and the main rendered document
> on two separate output streams, then merge these together for presentation.
> This is far more efficient with a large document than building the entire
> document tree in memory or scanning it twice.

I agree. Nice. Very nice.
:) Clark

--
software n : 1. written programs or procedures or rules and associated documentation pertaining to the operation of a computer system. 2. applied philosophy

From peter at weblogic.com Fri Feb 5 16:12:09 1999
From: peter at weblogic.com (Peter Seibel)
Date: Mon Jun 7 17:08:41 2004
Subject: When to use attributes vs. elements
Message-ID: <19990205161816668.AAA169@ashbury.weblogic.com@lawton>

Is there an XML philosophy (or an SGML philosophy for that matter) about when to use attributes vs when to use elements when designing a document type? For example if I was writing a document type for representing (part) java classes I could have something like:

<class>
  <package>x.y.z</package>
  <name>Foo</name>
  <!-- other more complicated stuff: definitely elements -->
</class>

or I could have:

<class package="x.y.z" name="Foo">
  <!-- other more complicated stuff: definitely elements -->
</class>

Obviously I wouldn't use attributes if the value had any structure to it. But in a case like this is there some principle that would give some guidance?

-Peter

P.S. Actually in the attribute case I think I'd do something more like:

<class name="x.y.z.Foo">

and let the application split apart the package part from the base name; that way I could declare name to be ID since it should be unique.

--
Peter Seibel
Perl/Java/English Hacker
peter@weblogic.com

Is Windows98 Y2K compliant?

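[Editorial sketch, not part of the original thread.] From the application's side, the three encodings Peter Seibel lists carry the same information, which can be shown with a short Python example using the standard xml.etree.ElementTree module. The function name `class_info` and the three constant names are assumptions made up for illustration, and the sample documents are trimmed versions of the ones in his message. The sketch does not answer the design question; it only shows what each choice costs the reading code.

```python
import xml.etree.ElementTree as ET

# The three encodings from the message, reduced to the parts under discussion.
ELEMENT_FORM = "<class><package>x.y.z</package><name>Foo</name></class>"
ATTRIBUTE_FORM = '<class package="x.y.z" name="Foo"/>'
QUALIFIED_FORM = '<class name="x.y.z.Foo"/>'


def class_info(xml_text):
    """Return (package, name) regardless of which encoding was used."""
    root = ET.fromstring(xml_text)
    if root.find("name") is not None:
        # Element form: the values are child elements.
        return root.findtext("package"), root.findtext("name")
    name = root.get("name")
    package = root.get("package")
    if package is None:
        # Qualified form: the application splits at the last dot itself.
        package, _, name = name.rpartition(".")
    return package, name
```

All three inputs give `("x.y.z", "Foo")`. The qualified form pushes a small parsing step into the application, which is the trade-off the P.S. accepts in exchange for being able to declare the attribute as an ID.
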
From dave at userland.com Fri Feb 5 16:16:32 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun 7 17:08:41 2004
Subject: Colonialism, SAX, Java, and Namespaces
In-Reply-To: <199902051555.KAA02409@hesketh.net>
References: <36BB0D56.EDF427B0@prescod.net> <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net>
Message-ID: <3.0.6.32.19990205081814.00d50ec0@scripting.com>

>>Namespaces are all over Internet Explorer. Millions of people will be encountering them in some form, and for better or worse, that often leads to contagion. Like it or not, these things are going to take off, and we'd do a lot better to accommodate those millions than keep calling them 'dumb.'

Then Web Review and O'Reilly and PC Mag will run tutorials. And there will be a Visual Quickstart for Namespaces, and of course Namespaces for Dummies. There's a system in place for handling this stuff. Namespaces in a Nutshell?

I think that what you're saying is that XML is quickly becoming a lot bigger than the XML-DEV list. But there's no need to panic.

Dave

To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Charles.Gamble at singularity.co.uk Fri Feb 5 16:21:44 1999 From: Charles.Gamble at singularity.co.uk (Charles Gamble) Date: Mon Jun 7 17:08:41 2004 Subject: MSXML 1.0 Message-ID: <1731A8D895B4D2119F94006097E59FEDD27C@singular_s1_nt4.singularity.co.uk> Can MSXML 1.0 be used to read in strings containing XML content? If so, how is this done? An example in script would be useful, showing the XML object being created , the string being declared with XML content and the string then being loaded into the XML object. All the examples I have seen use an XML file to load into the XML object. An example in C++ would also come in handy. Please tell me this is possible. Thanks in advance, Charles Gamble. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Feb 5 16:44:39 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:41 2004 Subject: Colonialism, SAX, Java, and Namespaces References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <199902051555.KAA02409@hesketh.net> Message-ID: <36BB1B53.3B90A754@prescod.net> "Simon St.Laurent" wrote: > > At 09:25 AM 2/5/99 -0600, Paul Prescod wrote: > >I'm also someone who recognizes that it is not appropriate to dumb down > >everything so that I *CAN* understand it in its entirety. > > Fine, Paul. Keep using the 'dumb down' vocabulary to show everyone what > you really think of the 'average developer'. I used the term *dumb down* in reference to *myself*. The average developer is a dummy in all but a few fields, just like me. We cannot restrict all fields until they are simple enough for everyone to understand. As someone pointed out recently, we wouldn't have a technologically advancing civilization if we did that. If we cannot agree on that much then further discussion is not going to be productive. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Fri Feb 5 16:46:09 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:41 2004 Subject: SAXON (Was: Re: A weaker XSL?) References: <93CB64052F94D211BC5D0010A80013310EB2DA@WWMESS3> Message-ID: <36BB1F4C.396D2974@manhattanproject.com> <cut> Discussion about a weaker version of XSL where sequential access is a sufficient condition for processing and random access is not the necessary condition. </cut> Michael.Kay@icl.com wrote: > > How do I build a table of contents that is output BEFORE the document > > without reading the whole input before I start to output? > > You can't, and that isn't my objective. What you can do, and what I > currently do with SAXON, is to perform a single serial pass of the input > document that simultaneously generates a TOC and the main rendered document > on two separate output streams, then merge these together for presentation. > This is far more efficient with a large document than building the entire > document tree in memory or scanning it twice. Stream processing! This is what I initially assumed XSL did. It's so nice to hear that I'm not on another planet. It's about time we considered the case when an XML document is a sequentially accessed stream. Wonderful! Michael, how do I get SAXON? I'm excited. :) Clark -- flyweight n : 1. the most confusing part of the otherwise wonderful Design Patterns book 2. when everything looks like a nail xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Charles.Gamble at singularity.co.uk Fri Feb 5 16:49:36 1999 From: Charles.Gamble at singularity.co.uk (Charles Gamble) Date: Mon Jun 7 17:08:41 2004 Subject: FW: MSXML 1.0 Message-ID: <1731A8D895B4D2119F94006097E59FEDD27F@singular_s1_nt4.singularity.co.uk> By the way, I only have IE 4.0 installed. > -----Original Message----- > From: Charles Gamble > Sent: 05 February 1999 16:16 > To: 'xml-dev@ic.ac.uk' > Subject: MSXML 1.0 > > Can MSXML 1.0 be used to read in strings containing XML content? > If so, how is this done? An example in script would be useful, showing > the XML object being created , the string being declared with XML content > and the string then being loaded into the XML object. > All the examples I have seen use an XML file to load into the XML object. > An example in C++ would also come in handy. > Please tell me this is possible. > Thanks in advance, > Charles Gamble. > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Fri Feb 5 16:50:25 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:41 2004 Subject: Namespaces does *not* formally introduce "global attributes" Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054969@SOHOS002> David Megginson wrote: > Mark Birbeck writes: > > Reminds me of a question I had a while back: ... > > Further to this topic, the following note does appear in the XML 1.0 > spec: > > Note: The colon character within XML names is reserved for > experimentation with name spaces. ... > > In other words, you're allowed to use ':' for non-namespace purposes > but you're playing with fire if you do. > Thanks David. I was trying to find it in all of the namespace documentation/comment, but didn't realise that the topic was anticipated in XML 1.0. (With all that that implies! Won't open that hornets' nest now, though.) Thanks again. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Feb 5 17:06:06 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:41 2004 Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces forXML) In-Reply-To: <36BAF974.6F87E679@prescod.net> References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <36BAF974.6F87E679@prescod.net> Message-ID: <14010.65394.42870.480866@localhost.localdomain> Paul Prescod writes: > > Namespaces seem to be an essential solution to two problems > > encountered when designing XML structures: - how can I distinguish > > my tags from everyone else's, to avoid confusion (eg: > > "<my:pastry/>"); > > Actually, this problem is very RARELY encountered. If you are > building a typical one-organization application then what are you > doing with "other people's tags" in your documents? I mean, if you > are writing typical software, it will choke and die when it comes > upon tags it does not know about. This depends on whether the information is documentation or fielded/tabular content. So far, well over half the work that Megginson Technologies is doing with XML (rather than SGML) is for data exchange rather than document production.
For example, let's say that I design a record structure for information about a member of a mailing list:

<member>
  <name>Paul Prescod</name>
  <email>paul@prescod.net</email>
  <company>ISOGEN</company>
</member>

Now, let's say that I get records in from other mailing lists whose maintainers include extra information that is not part of my original spec:

<member>
  <name>Paul Prescod</name>
  <email>paul@prescod.net</email>
  <company>ISOGEN</company>
  <origin>Canada</origin>
</member>

These records are still using <member>, <email>, and <company> in the same way, but they've added something else. Now someone else might take a different approach to <origin>, since it wasn't part of my original spec:

<member>
  <name>Paul Prescod</name>
  <email>paul@prescod.net</email>
  <company>ISOGEN</company>
  <origin>University of Waterloo</origin>
</member>

The advantages of being able to come up with globally-unique names should be obvious:

<member xmlns="http://foo.com/ns/" xmlns:a="http://hack.com/ns/">
  <name>Paul Prescod</name>
  <email>paul@prescod.net</email>
  <company>ISOGEN</company>
  <a:origin>Canada</a:origin>
</member>

or

<member xmlns="http://foo.com/ns/" xmlns:b="http://bar.com/ns/">
  <name>Paul Prescod</name>
  <email>paul@prescod.net</email>
  <company>ISOGEN</company>
  <b:origin>University of Waterloo</b:origin>
</member>

or even

<member xmlns="http://foo.com/ns/"
        xmlns:a="http://hack.com/ns/"
        xmlns:b="http://bar.com/ns/">
  <name>Paul Prescod</name>
  <email>paul@prescod.net</email>
  <company>ISOGEN</company>
  <a:origin>Canada</a:origin>
  <b:origin>University of Waterloo</b:origin>
</member>

A second major advantage of namespaces is the ability to reuse processing code. If I have written an event-handler/subroutine/method to do something useful with an HTML <table> element, then I'd like to reuse that for *every* document type that happens to use the HTML table model, even if I don't know about the document type in advance.
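[Editorial sketch, not part of the original message.] The point of the <origin> examples above is that a processor can tell the two meanings apart by namespace URI, whatever prefix the sender chose. A minimal illustration in Python, assuming the standard-library xml.etree.ElementTree parser (which reports names as "{uri}local"); the helper names split_name and origins_by_namespace are invented for this sketch, and the namespace URIs are the placeholder ones from the message:

```python
import xml.etree.ElementTree as ET

# The last record from the message above, with both extensions present.
RECORD = """\
<member xmlns="http://foo.com/ns/"
        xmlns:a="http://hack.com/ns/"
        xmlns:b="http://bar.com/ns/">
  <name>Paul Prescod</name>
  <email>paul@prescod.net</email>
  <company>ISOGEN</company>
  <a:origin>Canada</a:origin>
  <b:origin>University of Waterloo</b:origin>
</member>
"""


def split_name(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag  # element is in no namespace


def origins_by_namespace(xml_text):
    """Map each namespace URI to the text of its <origin> child."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        uri, local = split_name(child.tag)
        if local == "origin":
            result[uri] = child.text
    return result


if __name__ == "__main__":
    for uri, text in origins_by_namespace(RECORD).items():
        print(uri, "->", text)
```

The same lookup keeps working if another sender binds http://hack.com/ns/ to a different prefix, since only the URI takes part in the comparison.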
Of course, I know that I could do all of this with architectural forms as well. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/

From falk at icon.at Fri Feb 5 17:06:06 1999 From: falk at icon.at (Falk, Alexander) Date: Mon Jun 7 17:08:41 2004 Subject: XML Spy v1.4 released Message-ID: <A01C76E644CAD111B83A0000E8D8890E057AB6@melange.icon.co.at>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed... Name: Falk, Alexander.vcf Type: application/octet-stream Size: 1062 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990205/90325ba8/FalkAlexander.obj

From paul at prescod.net Fri Feb 5 17:15:22 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:41 2004 Subject: A weaker XSL? References: <5F052F2A01FBD11184F00008C7A4A80001136B01@eukbant101.ericsson.se> <36BB1440.661A32C1@manhattanproject.com> Message-ID: <36BB2173.9CC476D7@prescod.net>

Clark Evans wrote: > > I'm not suggesting that the weakened XSL wouldn't do any > transformations, only that the transformations it does > be based upon a stream rather than upon an object.

I have no problem with a weakened, transformational XSL for certain stream-based applications. But you seem to think that such a thing could be a replacement for XSL "as we know it." I don't believe that. I have the following problems with your two-tier model:
* in general the HTML and PDF versions of a document will order and organize things differently.
Online navigation is quite different from print navigation. So the part of the language that is "media specific" must be essentially as powerful as the part that is not.
* if the data has ALREADY been transformed then why isn't CSS "good enough" for the second step?
* people want the transformation part to take place on the client side in order to distribute the processing load.

In applications where your model is sufficient, I think we already have the languages in place to MAKE it work: XSL does the transformation and CSS does simple style annotation.

Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey

From jmcdonou at library.berkeley.edu Fri Feb 5 17:21:02 1999 From: jmcdonou at library.berkeley.edu (Jerome McDonough) Date: Mon Jun 7 17:08:41 2004 Subject: Colonialism, SAX, Java, and Namespaces In-Reply-To: <36BB0D56.EDF427B0@prescod.net> References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> Message-ID: <3.0.5.32.19990205085933.009b1a30@library.berkeley.edu>

At 09:25 AM 2/5/1999 -0600, Paul Prescod wrote: >Namespaces are an infrastructure technology. You use them THROUGH >something, like RDF. Without something like RDF they are essentially >useless. So unless you think that "RDF" is for the "average developer", >namespaces are not for the average developer. When/if something like RDF >takes off that situation may change.
>

I would argue, actually, that most average developers are going to have to deal with RDF. With the proliferation of information that all organizations are having to cope with, metadata to keep track of the information will be more and more essential. And RDF seems to be gaining significant mindshare as the way to store metadata among the SGML/XML crowd. If that trend continues, any developer working in medium-to-large organizations is going to be dealing with RDF, and hence, namespaces. This doesn't necessarily argue for any changes in the namespace spec, but assumptions that the average developer isn't going to have to deal with RDF (or any other metadata encoding standard) may be a bit rash.

Jerome McDonough -- jmcdonou@library.Berkeley.EDU  |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ * * /
Berkeley, CA 94720-6000 (510) 642-5168             |   \ <> /
"Well, it looks easy enough...."                   |    \ -- /
SGNORMPF!!! -- From the Famous Last Words file     |     ||||

From rbourret at ito.tu-darmstadt.de Fri Feb 5 17:22:34 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:08:41 2004 Subject: Colonialism, SAX, Java, and Namespaces Message-ID: <01BE5133.156BC980@grappa.ito.tu-darmstadt.de>

Paul Prescod wrote: > I used the term *dumb down* in reference to *myself*. The average > developer is a dummy in all but a few fields, just like me. We cannot > restrict all fields until they are simple enough for everyone to > understand. As someone pointed out recently, we wouldn't have a > technologically advancing civilization if we did that.
If we cannot agree > on that much then further discussion is not going to be productive.

I don't think Simon is asking for simpler technology. I think he's asking for more explanatory writing in the specs. Precision is almost impossible to achieve in spoken languages -- there is always somebody clever or foolish enough to "misinterpret" the most basic words -- and so the question is whether you write a short, highly formal spec, interpret it afterward, and hope that everybody hears/understands you, or write a longer, perhaps less formal spec, interpret it in place, and hope you don't introduce inconsistencies and ambiguities, or go somewhere in between. Personally, I vote for the longer, slightly less formal route, as I believe it leads to wider acceptance and, in the long run, less misinterpretation. That said, I've written enough specs in my lifetime that I sympathize with anyone who writes one at all, no matter what style they choose. -- Ron Bourret

From simpson at polaris.net Fri Feb 5 17:22:47 1999 From: simpson at polaris.net (John E. Simpson) Date: Mon Jun 7 17:08:41 2004 Subject: When to use attributes vs. elements Message-ID: <3.0.32.19990205121500.007d2ec0@polaris.net>

At 08:17 AM 2/5/99 -0800, Peter Seibel wrote: >Is there an XML philosophy (or an SGML philosophy for that matter) about >when to use attributes vs when to use elements when designing a document >type....

For an SGML view, here's a comment (dated 28 Apr 1992) from C.M.
Sperberg-McQueen: http://www.oasis-open.org/cover/attrSperberg92.html

Last April this was the topic of a thread here on XML-DEV. You can check the archives, or see it all in one package here: http://www.oasis-open.org/cover/elementAttr9804.html

Both references are from Robin Cover's excellent (and frequently updated) SGML/XML site, located at http://www.oasis-open.org/cover/. In general, one thing to remember is that (at least for simple XML applications) attribute values are "invisible" when the document is viewed.

Best, JES ==================== John E. Simpson Just XML (ISBN 0-13-943417-8) Available now from Prentice Hall PTR

From andrewl at microsoft.com Fri Feb 5 17:23:12 1999 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:08:41 2004 Subject: When to use attributes vs. elements Message-ID: <5BF896CAFE8DD111812400805F1991F708AAEF37@RED-MSG-08>

Peter asked whether there are rules or guidelines in XML for when to use attributes versus elements. You will find a wealth of opinions on this topic. Partly this reflects the wealth of options that XML gives you and the fact that XML can be employed for many purposes. You may want to take a look at "XML Syntax Recommendation for Serializing Graphs of Data" (http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html) for a suggestion on how to use XML for serializing programming objects and data from databases. -Andrew Layman xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Feb 5 17:27:23 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:41 2004 Subject: Colonialism, SAX, Java, and Namespaces In-Reply-To: <36BB0D56.EDF427B0@prescod.net> References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <36BB0D56.EDF427B0@prescod.net> Message-ID: <14011.9910.776880.149972@localhost.localdomain> Paul Prescod writes: > Namespaces are an infrastructure technology. You use them THROUGH > something, like RDF. Without something like RDF they are > essentially useless. So unless you think that "RDF" is for the > "average developer", namespaces are not for the average > developer. When/if something like RDF takes off that situation may > change. I (politely) disagree again -- there are many applications that can take advantages of namespaces without an RDF-like infrastructure. Here are some examples: 1. Search engines (i.e. find every mention of "Megginson" in an element named {http://www.software.com/ns/}developer). 2. Browsers (i.e. set any occurrence of {http://www.software.com/ns/}keyword in monospaced type for all document types, unless overridden by a more specific rule). 3. Localization transformations (i.e. attempt to read the contents of every element with an attribute named {http://finance.com/ns/}currency as a number and convert it to the local currency if possible). All of these (and many more) can be applied to typical human-readable documents with mixed content; they're not limited to RDF-like documents. 
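[Editorial sketch, not part of the original message.] The first of the three examples above (find every mention of "Megginson" in an element named {http://www.software.com/ns/}developer) maps fairly directly onto a namespace-aware parser. A rough illustration in Python, assuming the standard-library xml.etree.ElementTree, which accepts the same {uri}local notation; the sample document, the example.org namespace, and the function name find_mentions are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Invented sample document: two vocabularies both define a "developer" element.
DOC = """\
<doc xmlns:sw="http://www.software.com/ns/"
     xmlns:other="http://example.org/other/">
  <para>Patch from <sw:developer>David Megginson</sw:developer>.</para>
  <para>Unrelated: <other:developer>Megginson Technologies</other:developer>.</para>
  <para>Review by <sw:developer>James Clark</sw:developer>.</para>
</doc>
"""


def find_mentions(xml_text, qualified_name, needle):
    """Return the text of every element with the given {uri}local name
    whose content contains `needle`."""
    root = ET.fromstring(xml_text)
    hits = []
    for element in root.iter(qualified_name):
        text = "".join(element.itertext())
        if needle in text:
            hits.append(text)
    return hits


if __name__ == "__main__":
    name = "{http://www.software.com/ns/}developer"
    print(find_mentions(DOC, name, "Megginson"))
```

The element in the other namespace is skipped even though its local name and content both match, which is the behaviour the search-engine example relies on.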
All the best, David -- David Megginson david@megginson.com http://www.megginson.com/

From larsga at ifi.uio.no Fri Feb 5 17:34:28 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:08:41 2004 Subject: COBOL XML parser? In-Reply-To: <199902041412.JAA10888@hesketh.net> References: <199902041412.JAA10888@hesketh.net> Message-ID: <wkpv7ou55z.fsf@ifi.uio.no>

* Simon St.Laurent | | * Lisp

james anderson has written one that is distributed with the CL-HTTP web server. See <URL:http://www.ai.mit.edu/projects/iiip/doc/cl-http/home-page.html> in the contrib-directory of the distribution. I've tinkered a bit with a DTD parser in Common Lisp in my spare time, but since that is close to non-existent, so are the results. --Lars M.

From tbray at textuality.com Fri Feb 5 17:38:26 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:42 2004 Subject: SAX, Java, and Namespaces (was Re: Restricted Namespaces forXML) Message-ID: <3.0.32.19990205093621.00bd7d60@pop.intergate.bc.ca>

At 08:00 AM 2/5/99 -0600, Paul Prescod wrote: >I don't think that "average developers" need to worry about namespaces.
It >is quite simple to build powerful, useful applications without them.

Yes, it's possible, but it seems crystal clear to me that a year or so from now, the "average XML document", were such a thing to exist, would have namespaces. Office 2000 is full of 'em. RDF is full of 'em. And if nothing else, old-fashioned document wrangling is, I predict, going to be dipping regularly into namespaces like, for example, HTML. Which is why it really is a problem that something that we tried so hard to make simple is being perceived (rightly or wrongly) as complex and intimidating. -Tim

From clark.evans at manhattanproject.com Fri Feb 5 17:50:05 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:42 2004 Subject: Architectural Forms and Namespaces (Was: Re: SAX, Java, and Namespaces ) References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <36BAF974.6F87E679@prescod.net> <14010.65394.42870.480866@localhost.localdomain> Message-ID: <36BB2E4F.6524A575@manhattanproject.com>

David Megginson wrote: > <cut> a very nice discussion of namespaces </cut> > Of course, I know that I could do all of this with > architectural forms as well.

You can do _all_ of it with both? I had pictured a combination punch to solve the problem. I see namespaces and architectural forms as yet another complementary system within XML. Namespaces uniquely identify an element's structure, and architectural forms describe the mappings between these various structures. I don't see one or the other being used alone. I see them being used in combination. What am I missing?
Clark Evans -- software n : 1. written programs or procedures or rules and associated documentation pertaining to the operation of a computer system. 2. applied philosophy

From Daniel.Brickley at bristol.ac.uk Fri Feb 5 17:52:45 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:08:42 2004 Subject: When to use attributes vs. elements In-Reply-To: <5BF896CAFE8DD111812400805F1991F708AAEF37@RED-MSG-08> Message-ID: <Pine.GHP.4.02A.9902051740360.27539-100000@mail.ilrt.bris.ac.uk>

On Fri, 5 Feb 1999, Andrew Layman wrote: > Peter asked whether there are rules or guidelines in XML for when to use > attributes versus elements. > > You will find a wealth of opinions on this topic. Partly this reflects the > wealth of options that XML gives you and the fact that XML can be employed > for many purposes. > > You may want to take a look at "XML Syntax Recommendation for Serializing > Graphs of Data" > (http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html) for a > suggestion on how to use XML for serializing programming objects and data > from databases. > > -Andrew Layman

Is this paper suggesting that the entire XML (and SGML?) community might be persuaded to serialise directed labelled graphs into XML always using this proposed canonicalised serialisation algorithm? If not, how can we tell from looking at a chunk of XML data whether they've followed this approach or followed one of the other various (explicit or implicit) graph serialisation patterns?
Should we be able to tell whether this algorithm has been used without consulting/dereferencing schema declarations, i.e. is there a need to propose an enclosing 'GraphSerialisation' tag of some sort so we can tell whether these rules have been used? Or some other sort of aid to interpretation...? If not, doesn't this amount to assuming (a) that we _know_ what others were thinking when they designed their serialisation algorithms, or (b) that the world can be persuaded to adopt this approach for 100% of data and document interchange? Neither fits well with the "wealth of opinions on this topic"... Dan

From simonstl at simonstl.com Fri Feb 5 17:57:02 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:42 2004 Subject: Colonialism, SAX, Java, and Namespaces In-Reply-To: <01BE5133.156BC980@grappa.ito.tu-darmstadt.de> Message-ID: <199902051753.MAA05051@hesketh.net>

At 06:12 PM 2/5/99 +0100, Ronald Bourret wrote: >I don't think Simon is asking for simpler technology. I think he's asking >for more explanatory writing in the specs. Precision is almost impossible >to achieve in spoken languages -- there is always somebody clever or >foolish enough to "misinterpret" the most basic words -- and so the >question is whether you write a short, highly formal spec, interpret it >afterward, and hope that everybody hears/understands you, or write a >longer, perhaps less formal spec, interpret it in place, and hope you don't >introduce inconsistencies and ambiguities, or go somewhere in between.

Precisely.
Taking the time and making the effort to ensure that specifications are clear - and not just to a small community of experts - means a lot less need for repeated explanation afterward. It makes a specification more inclusive, avoiding the need for debates over who is worthy of reading the spec. That inclusiveness can encourage more people to join the implementation process, and produce richer yields of new ideas and real source code - and less debate about non-normative sections and the impossibility of figuring out formal specifications. Extending that inclusive approach to the larger discussions also promises to have significantly more benefits than raining down comments telling developers that the specifications (and by implication, XML) really aren't meant for them, that they shouldn't be reading those things, and they certainly shouldn't be complaining about them. Being inclusive takes extra effort, but it hardly stands in the way of clear or useful standards.

Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com

From gkholman at CraneSoftwrights.com Fri Feb 5 17:57:01 1999 From: gkholman at CraneSoftwrights.com (G. Ken Holman) Date: Mon Jun 7 17:08:42 2004 Subject: When to use attributes vs.
elements In-Reply-To: <19990205161816668.AAA169@ashbury.weblogic.com@lawton> Message-ID: <Version.32.19990205124315.00ea8690@CraneSoftwrights.com> At 99/02/05 08:17 -0800, Peter Seibel wrote: >Is there an XML philosophy (or an SGML philosophy for that matter) about >when to use attributes vs when to use elements when designing a document >type. >... >But in a case like this is there some principle that would give some guidance? Philosophical? No. Religious? Yes. :{)} Some people fervently *believe* in one way while others *believe* in the other way. Here is a public repository of opinion: http://www.oasis-open.org/cover/elementsAndAttrs.html .......... Ken -- G. Ken Holman mailto:gkholman@CraneSoftwrights.com Crane Softwrights Ltd. http://www.CraneSoftwrights.com/x/ Box 266, V: +1(613)489-0999 Kars, Ontario CANADA K0A-2E0 F: +1(613)489-0995 Training: http://www.CraneSoftwrights.com/x/schedule.htm Resources: http://www.CraneSoftwrights.com/x/resources.htm Shareware: http://www.CraneSoftwrights.com/x/shareware.htm Next XSL Training: X-Tech:1999-03-07 WWW8:1999-05-11 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Fri Feb 5 18:02:16 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:42 2004 Subject: Colonialism, SAX, Java, and Namespaces Message-ID: <A26F84C9D8EDD111A102006097C4CD0D05496B@SOHOS002> Simon St.Laurent wrote: > I get the feeling that some of the writers on this list - though > _certainly_ not all - view the 'average developer' as some kind of > primitive creature that should be shunted aside in the name > of progress.
Why would anyone shunt aside 'average developers'? Where else do the truly great developers come from, but the large pool of average developers? However, if all we ever talk about is what can be understood by the greatest number of people, then we do *not* progress. Do you really want the useful discussion about how and whether to store XML files in a database to be constantly concerned with how many people can understand and contribute? Sure, you want to be as helpful as you can to others who are learning, but also you sometimes want to have high-level discussions with your peers (not to say we're all 'truly great' either - just 'above average'). > This colonialist view (I don't know what else to compare it > to - simple > elitism seems inadequate) has contributed to the development > of a lot of > tools that people talk about but very few people understand. I'm not sure which colonialism you're thinking of, but the one I read about involved dominating foreign lands, taking their cash and selling the occupants as slaves. Next you'll be talking about holocausts - rounding up 'average developers' and the like. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Feb 5 18:04:47 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:42 2004 Subject: Namespaces for "average programmers" Message-ID: <36BB2E8E.61D4902F@prescod.net> Let me try again with this namespaces characterization.
You can't do anything useful with markup unless you have two things: a schema language to enforce constraints and a processing model to *do something*. If you are truly an average developer, then you don't want to invent a schema language or processing model. So you must use one that exists. There are only two standards-track schema languages: DTDs and RDF schemas. DTDs do not know anything about namespaces. Therefore you do not need to know anything about namespaces to use DTDs. RDF supports namespaces. But RDF's use of namespaces is mostly documented in RDF itself. "Best practices" for namespace usage in RDF schemas are specific to RDF. Anyhow, I think that it is debatable whether average programmers use RDF. On the processing side, XSL uses namespaces, but anybody can figure out how to use namespaces in XSL just by reading the XSL specification (or a book/article on XSL). Once again, you don't have to become an expert on the namespaces specification to use XSL. Plus XSL has essentially no support for namespaces in the documents it processes. One day it might, but again the best practices for namespace usage relative to XSL will arise at that point. In other words, it will be at least a year before the infrastructure for using arbitrary namespaces will be available. And using the fixed namespaces we have today (xml:, xsl:, fo:) is pretty much a no-brainer. People on the xsl-list seem to have no problem with it. In other words, don't worry, be happy. The sky is not falling. (It is interesting to note that XSL gets away with using namespaces because there is no schema language that defines it. RDF gets away with using namespaces because there is no RDF processing model. Presumably every RDF application defines a processing model from scratch.) Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jmg at trivida.com Fri Feb 5 18:15:20 1999 From: jmg at trivida.com (Jeff Greif) Date: Mon Jun 7 17:08:42 2004 Subject: Slowness of JDK 1.1.x String.intern() [was Re: SAX, Java, and Namespaces ] Message-ID: <030601be5133$166851f0$a24630d1@greif.trivida.com> JDK 1.1.7 intern is native, but is slow because it first converts the characters in the string from Unicode to UTF-8 in a freshly malloc'ed buffer. The buffer is later freed if the string is already interned. It is stored in the string table along with the String if not already present. It would be much better if a fixed-size buffer for small strings and an alloca()'ed or malloc'ed buffer for larger ones were used, since the lookup operation has the option of copying the string if it needs to be inserted. I've sent Sun a revised version, but don't know whether or when it will be used. Jeff -----Original Message----- From: Tyler Baker <tyler@infinet.com> To: Tim Bray <tbray@textuality.com> Cc: David Megginson <david@megginson.com>; xml-dev@ic.ac.uk <xml-dev@ic.ac.uk> Date: Thursday, February 04, 1999 5:48 PM Subject: Re: SAX, Java, and Namespaces (was Re: Restricted Namespacesfor XML) ... earlier stuff snipped ... > >As I said before things have improved. intern() is now native so there is really no excuse >that I can think of why it should still be slow (it is not as slow as it used to be but >calling it has roughly half the cost of calling new() now). Nevertheless, the String class >should have had a static intern() method a long time ago that accepts a character array.
Boy >would it have been convenient... > >Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Fri Feb 5 18:21:13 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:42 2004 Subject: Slowness of JDK 1.1.x String.intern() [was Re: SAX, Java, and Namespaces ] Message-ID: <3.0.32.19990205101910.00bdf210@pop.intergate.bc.ca> At 10:12 AM 2/5/99 -0800, Jeff Greif wrote: >JDK 1.1.7 intern is native, but is slow because it first converts the >characters in the string Actually, the real reason that most XML parsers will *never* use built-in intern is because they probably have the name available in a character array, and can go look things up in the handcrafted table without String-i-fying it - thus skipping several steps of work that a built-in intern is going to have to do. E.g. Lark's symbol table is a double array, storing both the character-array and String version of each name - you lookup based on the character array and return the string if it's already there. The point is that you call new String() only once per unique name. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Fri Feb 5 18:32:24 1999 From: jes at kuantech.com (Jeffrey E. 
Sussna) Date: Mon Jun 7 17:08:42 2004 Subject: CORBA's not boring yet. / XML in an OS? In-Reply-To: <3.0.6.32.19990205050217.00f49460@scripting.com> Message-ID: <000301be5135$7c9af840$5118a8c0@kuantech1.quokka.com> If you have Netscape Navigator 4 on your machine, then IIOP runs on your machine. -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Dave Winer Sent: Friday, February 05, 1999 5:02 AM To: xml-dev@ic.ac.uk Subject: RE: CORBA's not boring yet. / XML in an OS? >>Isn't RPC using IIOP and DCOM already slow enough without using XML? Why don't MS just support IIOP? I don't know why MS does anything, I don't work there. I know why I like XML-RPC. It's simple. You can write a client in a few hours, and a server in a couple of days. A JavaScript programmer can learn how to do it, today, in less than 24 hours, and in a few weeks, they'll be able to do it in minutes. I have some theories about why this is true: 1. HTTP is everywhere. Does IIOP run on my machine? I'm pretty sure it doesn't. And it's not just about Microsoft, I have Macs too. 2. XML looks like HTML. To someone who has mastered HTML the transition is easy. Performance matters, for sure. But sometimes people overlook that people performance is probably the single most important limiting factor. I can tell you this, CORBA wasn't designed for my mind. It's so complicated, so many new concepts to understand. I even had trouble understanding HTML way back when. Wire protocols can be optimized, but people's brains move at their own rate, rejecting things that appear too complicated, waiting for something that makes sense to them. We're all busy! The problem with COM, CORBA and Apple Events is that each of them were invented before the web exploded and are quite platform specific, and are not understandable to people who do web development. Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From avirr at LanMinds.Com Fri Feb 5 18:36:14 1999 From: avirr at LanMinds.Com (Avi Rappoport) Date: Mon Jun 7 17:08:42 2004 Subject: Colonialism, SAX, Java, and Namespaces In-Reply-To: <199902051753.MAA05051@hesketh.net> References: <01BE5133.156BC980@grappa.ito.tu-darmstadt.de> Message-ID: <v04104404b2e0e79f3670@[207.33.50.55]> At 12:56 PM -0500 2/5/99, Simon St.Laurent wrote: > Precisely. Taking the time and making the effort to ensure that > specifications are clear - and not just to a small community of experts - > means a lot less need for repeated explanation afterward. It makes a > specification more inclusive, avoiding the need for debates over who is > worthy of reading the spec. I agree, and have found that the best way to be truly clear is to provide as many examples as possible. The one caveat is that almost every beginner will follow those examples slavishly, so they'd better be well-designed and applicable to a wide variety of situations. There's a nice opportunity for an "XML Namespaces By Example" book and/or site, which could provide an excellent supplement to the spec. If, in fact, the examples really do follow the spec...
Perhaps the best way to get started is to encourage people to post their early designs and drafts to the list for constructive comments. That way, those of us who are having a hard time grokking the process can learn from the public analysis. Avi ________________________________________________________________ Avi Rappoport, Web Site Search Tools Maven: <mailto:avirr@lanminds.com> Guide to Site Indexing and Local Search Engines: <http://www.searchtools.com> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eliot at dns.isogen.com Fri Feb 5 18:51:12 1999 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:08:42 2004 Subject: Architectural Forms and Namespaces (Was: Re: SAX, Java, and Namespaces ) In-Reply-To: <36BB2E4F.6524A575@manhattanproject.com> References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <36BAF974.6F87E679@prescod.net> <14010.65394.42870.480866@localhost.localdomain> Message-ID: <3.0.5.32.19990205125207.00ab6940@amati.techno.com> At 05:45 PM 2/5/99 +0000, Clark Evans wrote: >I don't see one or the other used. I see them >being used in combination. What am I missing? It was true of the original namespace approach (but may not be with the latest), that anything you could say with namespaces you could also say with architectural mappings (that is, binding an element or attribute instance to a globally-unique name). The reverse was not (and is not) true: there are many things you can say with architectures that you cannot say with name spaces. 
Name spaces and architectures were (and hopefully still are) complementary at least to the degree that the use of one did not interfere with use of the other and the two could be used in combination in various clever ways. Note that one of the key things that architectures provide that name spaces do not is an explicit definition of how to validate a document against the syntactic requirements of the architecture--this is because the architecture mechanism uses a DTD as its formal definition; thus any document can be validated against its architectural DTD. In addition, the architecture is, by definition of the Architecture standard, a definition of the *semantics* of the architecture (whatever that might mean and however they might be defined), not just the definition of a vocabulary of element types and attributes. This is a subtle but important distinction. Note that any vocabulary of names used as a name space can also be used as an architecture. Cheers, E. -- <Address HyTime=bibloc> W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com </Address> xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Feb 5 18:56:56 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:42 2004 Subject: Colonialism, SAX, Java, and Namespaces References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <36BB0D56.EDF427B0@prescod.net> <14011.9910.776880.149972@localhost.localdomain> Message-ID: <36BB3BC8.F1476D09@prescod.net> David Megginson wrote: > > > When/if something like RDF takes off that situation may > > change. > > I (politely) disagree again -- there are many applications that can > take advantage of namespaces without an RDF-like infrastructure. I didn't mean a metadata infrastructure like RDF. I meant that namespaces need to be put in a lot more *context* before average programmers should worry about them. In the abstract they will be complicated because questions like "what does an unqualified attribute mean" and "what are best practices" are best answered *in the context of concrete applications*. XSL uses namespaces and nobody finds them confusing, because XSL provides an intuitive context and interface to the namespace concept. But take that concept out of context and people will get confused. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Feb 5 18:58:37 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:42 2004 Subject: SAX, Java, and Namespaces (was Re: Restricted NamespacesforXML) References: <3.0.32.19990205093621.00bd7d60@pop.intergate.bc.ca> Message-ID: <36BB3AED.8836CF36@prescod.net> Tim Bray wrote: > > Yes, it's possible, but it seems crystal clear to me that a year or > so from now, the "average XML document", were such a thing to exist, > would have namespaces. Office 2000 is full of 'em. I understand that but I do not think that the consumers of those documents (whether end-users or "average programmers") will need to go back to the XML Names specification to understand them. > RDF is full of > 'em. And if nothing else, old-fashioned document wrangling is, I > predict, going to be dipping regularly into namespaces like, for > example, HTML. The average HTML user is not going to go to the namespaces specification any more than they went to the SGML specification. Nor should they. > Which is why it really is a problem that something that we tried so > hard to make simple is being perceived (rightly or wrongly) as > complex and intimidating. -Tim Namespaces are intimidating for exactly the same reason that things like XLink and architectural forms are intimidating. Namespaces are abstract. But when you apply them to a spec, like Office 2000, or HTML, they become simple to understand. In a particular context, they are simple. That's why "average developers" should sit back and wait for the context to develop. 
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Feb 5 19:51:55 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:42 2004 Subject: Namespaces does *not* formally introduce "global attributes" References: <A26F84C9D8EDD111A102006097C4CD0D054968@SOHOS002> Message-ID: <36BB4B8B.2C33BC9F@infinet.com> Mark Birbeck wrote: > > Additionally, implementers would have been able to easily add > > a namespace > > processing module on top of their current XML parsers (a SAX namespace > > expansion filter, for example, is trivial when implemented this way), > > _without changing the interfaces_. Future implementations > > might use better > > interfaces - such as APIs for accessing just the "namespace > > part" or the > > "local part" of an expanded name - but the point is every XML > > application > > would go on working as it is, without any changes. > > Reminds me of a question I had a while back: what happens to a perfectly > acceptable XML 1.0 document run through an XML parser which has a > namespace processing module? This is, after all, valid XML 1.0: > > <this:is:my:good this:is:an:attribute:called:a1="1" /> > > (As is: > > <:::: :::="1" /> > > ) Good point. 
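[Illustration] The question raised in the quoted passage can be tried out with a namespace switch on a parser. The sketch below is only an illustration under assumptions: it uses the JAXP API (javax.xml.parsers) that later shipped with the JDK and postdates this thread, and the class name ColonNamesDemo and the helper parses() are hypothetical. It feeds the multi-colon document from the message to a parser with namespace processing off and then on, and reports whether each parse was accepted.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of any parser discussed in the thread.
class ColonNamesDemo {

    // Returns true if the document is accepted, false if the parser rejects it.
    static boolean parses(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser reported the document as an error
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The example document from the message: legal XML 1.0 names with many colons.
        String doc = "<this:is:my:good this:is:an:attribute:called:a1=\"1\" />";
        System.out.println("namespace processing off: " + parses(doc, false));
        System.out.println("namespace processing on:  " + parses(doc, true));
    }
}
```

With the parser bundled in current JDKs, the first call is expected to accept the document and the second to reject it, which is the incompatibility being asked about; other parsers may report the problem differently.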
> In terms of the old document run through the new parser, as far as > namespaces go this should be no different to: > > <good a1="1" /> > > But in the new parsers it will be an error, because, as the spec says, > "The namespace prefix, unless it is xml or xmlns, must have been > declared ..." > > It seems that XML namespaces are not backwards compatible with 'old' > documents. If this is true, is it explicitly justified anywhere? I > haven't come across it. Perhaps it is the intention of the spec that a > 'non-conformant' document (i.e., more than one colon in names, etc.) > simply 'drops back' to XML 1.0, rather than being 'failed' by the > namespace processor. But this then means you couldn't merge two DTDs in > a document - one built with namespaces in mind, and one not. One idea would be to have a processing instruction in the prolog of the document which tells the XML Parser whether namespaces processing should be turned on or not before parsing of the body begins. <?xml:namespaces status="on"?> XML Parser which cannot process XML namespaces would then either throw an error or at least give a warning. Tyler xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Fri Feb 5 20:13:22 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:43 2004 Subject: Namespaces does *not* formally introduce "global attributes" Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054972@SOHOS002> Tyler Baker wrote: > One idea would be to have a processing instruction in the > prolog of the document which tells > the XML Parser whether namespaces processing should be turned > on or not before parsing of the > body begins. > > <?xml:namespaces status="on"?> > > XML Parser which cannot process XML namespaces would then > either throw an error or at least > give a warning. Except that the problem may be that a document uses two DTDs, one ns compliant and one not. That's why I came to the conclusion that nodes should conform, rather than entire documents. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Feb 5 20:55:09 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:43 2004 Subject: Slowness of JDK 1.1.x String.intern() [was Re: SAX, Java,and Namespaces ] References: <3.0.32.19990205101910.00bdf210@pop.intergate.bc.ca> Message-ID: <36BB59FA.FF612D99@infinet.com>

Tim Bray wrote:

> At 10:12 AM 2/5/99 -0800, Jeff Greif wrote:
> >JDK 1.1.7 intern is native, but is slow because it first converts the
> >characters in the string
>
> Actually, the real reason that most XML parsers will *never* use
> built-in intern is because they probably have the name available in a
> character array, and can go look things up in the handcrafted
> table without String-i-fying it - thus skipping several steps
> of work that a built-in intern is going to have to do. E.g. Lark's
> symbol table is a double array, storing both the character-array
> and String version of each name - you lookup based on the
> character array and return the string if it's already there. The
> point is that you call new String() only once per unique name.

I do pretty much the exact same thing, except that on each call to new String() I do something of the form new String().intern(). This way, at the application level, you can test element names and attribute names for identity instead of equality. Since you can't exactly do something like this in any programming language I know of:

    String s = new String("foo");
    switch (s) {
        case "foo":
        case "bar":
    }

you need to write code like this:

    if (s.equals("foo")) {
    } else if (s.equals("bar")) {
    }

etc.
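[Illustration] The two ideas in this message (look names up by character array, and intern the resulting String so that applications can compare names by identity) can be sketched roughly as follows. This is a minimal sketch under assumptions, not Lark's or any other parser's actual code: the names NameTable, lookup and stringsCreated are hypothetical, and the fixed bucket count is arbitrary.

```java
// Hypothetical symbol table keyed by character ranges rather than Strings.
final class NameTable {

    // One stored name: its characters plus the single String created for it.
    private static final class Entry {
        final char[] chars;
        final String name;
        final Entry next;

        Entry(char[] chars, String name, Entry next) {
            this.chars = chars;
            this.name = name;
            this.next = next;
        }
    }

    private final Entry[] buckets = new Entry[64]; // arbitrary size for the sketch
    int stringsCreated = 0;                         // how many times new String() ran

    // Returns the canonical String for the characters buf[offset .. offset+length).
    String lookup(char[] buf, int offset, int length) {
        int hash = 0;
        for (int i = 0; i < length; i++) {
            hash = 31 * hash + buf[offset + i];
        }
        int slot = (hash & 0x7fffffff) % buckets.length;
        for (Entry e = buckets[slot]; e != null; e = e.next) {
            if (sameChars(e.chars, buf, offset, length)) {
                return e.name; // already known: no new String is built
            }
        }
        // First occurrence: build the String once and intern it.
        String name = new String(buf, offset, length).intern();
        stringsCreated++;
        buckets[slot] = new Entry(name.toCharArray(), name, buckets[slot]);
        return name;
    }

    private static boolean sameChars(char[] stored, char[] buf, int offset, int length) {
        if (stored.length != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (stored[i] != buf[offset + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        NameTable table = new NameTable();
        char[] input = "foo bar foo".toCharArray();
        String first = table.lookup(input, 0, 3);   // "foo"
        String second = table.lookup(input, 8, 3);  // "foo" again
        System.out.println(first == second);        // identity test between lookups
        System.out.println(first == "foo");         // identity test against a literal
        System.out.println(table.stringsCreated);
    }
}
```

Because the stored String is interned, application code can compare a returned name against a string literal with ==, which is the identity test the message describes; the table itself avoids building a String for names it has already seen.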
In cases where the most likely scenario is testing for equality against a lot of strings and then executing a default action, as in the case of an else statement, this can get expensive. Even though calling String.intern() has a one-time cost for the first occurrence of an element or attribute name, repeatedly calling String.equals() can be quite expensive too. Code of the form:

    if (s == "foo") {
    } else if (s == "bar") {
    }

is about as fast as an integer compare. Even though you may take a small performance hit at the parser level (or DOM level), in the general case you will be improving things at the application level even if you use String.equals(), since the String.equals() method is of the form:

    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        // Do other string comparing code
    }

Nevertheless, the String.intern() method has a poor implementation under the hood. I don't know what kind of table the JDK is using under the hood for each JVM, but whatever implementation Sun is using is pretty lame. But despite the poor implementation of String.intern(), it is still a win at the application level to be dealing with Names that are represented as interned strings.

Tyler

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Feb 5 21:12:40 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:43 2004 Subject: Colonialism, SAX, Java, and Namespaces References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <36BB0D56.EDF427B0@prescod.net> <14011.9910.776880.149972@localhost.localdomain> <36BB3BC8.F1476D09@prescod.net> Message-ID: <36BB5E85.E17C557F@infinet.com> Paul Prescod wrote: > XSL uses namespaces and nobody finds them confusing. Because XSL provides > an intuitive context and interface to the namespace concept. But take that > concept out of context and people will get confused. I have seen one XSL Stylesheet example posted by a newbie that dealt with namespaces. The fact that no one is using this supplement to XML called "Namespaces in XML" can be construed to mean that namespaces are either too complicated or too useless to be of any utility to these "average developers". Yah, I may be just one of the many people here complaining about namespaces, but would you rather deal with people who take a look at "Namespaces in XML" and say "what the heck is this mumbo jumbo" and ignore namespaces altogether? Any specification or so-called standard that does not achieve some level of consensus among its intended audience is nothing more than a glorified document with a bunch of "expert" names on it. Unless the W3C is in the business of wasting the time and money of its membership, I suspect that someone in the organization will be sensitive to some of the "Namespaces in XML" concerns of this budding XML developer community.
If the W3C simply ignores everyone, then everyone will eventually ignore the W3C. It is that simple. Tyler BTW, I would probably not be wasting so much of my time on this whole "Namespaces in XML" issue if the entire "Namespaces in XML" spec were not incorporated into the XSL draft (something I care much more about). xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Fri Feb 5 21:16:58 1999 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:08:43 2004 Subject: When to use attributes vs. elements Message-ID: <5BF896CAFE8DD111812400805F1991F708AAEF3C@RED-MSG-08> Thank you. Dan asks a reasonable question, which is whether a document that uses the conventions described in http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html needs to signal somehow that these conventions are in play. In case of the "canonical format" I proposed, however, I don't think special signalling is necessary: The proposal does not add any new interpretations to the use of elements or attributes beyond what can be described in a DTD or a schema such as XML-Data or DCD. Elements, attributes, ids and idrefs are carefully used so that their normal XML interpretation matches the scoping and linking rules of object graphs or relational databases. In a general case, if conventions add rules for interpretation above what is in the structure of a document or above what can be expressed in a DTD, then this would need to be somehow signalled in order for a reader to process the document. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Sat Feb 6 10:52:51 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:43 2004 Subject: Colonialism, SAX, Java, and Namespaces References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <36BB0D56.EDF427B0@prescod.net> <14011.9910.776880.149972@localhost.localdomain> <36BB3BC8.F1476D09@prescod.net> <36BB5E85.E17C557F@infinet.com> Message-ID: <36BC1A9C.76AF9AAF@prescod.net> Tyler Baker wrote: > > I have seen one XSL Stylesheet example posted by a newbie that dealt > with namespaces. The fact that no one is using this supplement to XML > called "Namespaces in XML" can be construed that namespaces are either > too complicated or too useless to be of any utility to these > "average developers". Some specifications are meant to be directly used by end-users: CSS. Some specifications are meant to be used by other standardizers: DTDs. Namespaces are being used in RDF, XSL, and Voyager. "Average developers" will use namespaces through those technologies and others. I do not buy your thesis that the fact that people are not yet using namespaces demonstrates that namespaces are a failure. XSL and Voyager are not standardized yet and RDF is itself quite immature in its implementations. I also do not buy the thesis that it is elitist to recommend that "average developers" not waste their time learning an abstraction until infrastructure and context become available in order to use it. There are many technologies that I expect will be relevant to average developers at some point in the future but are not now. 
-- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey From simonstl at simonstl.com Sat Feb 6 14:58:05 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:43 2004 Subject: Colonialism, SAX, Java, and Namespaces In-Reply-To: <36BC1A9C.76AF9AAF@prescod.net> References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <36BB0D56.EDF427B0@prescod.net> <14011.9910.776880.149972@localhost.localdomain> <36BB3BC8.F1476D09@prescod.net> <36BB5E85.E17C557F@infinet.com> Message-ID: <199902061457.JAA21847@hesketh.net> At 04:34 AM 2/6/99 -0600, Paul Prescod wrote: >I also do not buy the thesis that it is elitist to recommend that "average >developers" not waste their time learning an abstraction until >infrastructure and context become available in order to use it. There are >many technologies that I expect will be relevant to average developers at >some point in the future but are not now. Well, I guess we'll see what the average developers do, and how they respond to such attitudes and their results in the specs. Speaking for myself as an 'average developer', I find this view infuriating, and a poor excuse to avoid the extra effort needed to make specs more immediately usable and comprehensible. On the other hand, maybe some developers don't care and it may in the long run have no significant impact. 
We've been over this too many times, so I'll end here with a plea to spec developers and their explainers. Make your specifications as comprehensible as you can to as wide an audience as you can, so that all of us can spend more time writing implementations and less time debating what the specs mean. Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com From b.laforge at jxml.com Sat Feb 6 16:03:43 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:08:43 2004 Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces Message-ID: <000901be51e9$a5dff3e0$c9a8a8c0@thing2> One of the big advantages of Java is that a small shop can tackle significant projects. With clean specs, the same will be true for XML. It is worth the extra effort to keep the specs as clean as possible. It goes beyond wide acceptance. It means smaller project teams and shorter delivery times. And that makes XML a competitive advantage. Bill From: Simon St.Laurent <simonstl@simonstl.com> >... Make your specifications as >comprehensible as you can to as wide an audience as you can, so that all of >us can spend more time writing implementations and less time debating what >the specs mean. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Sat Feb 6 16:11:09 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:08:43 2004 Subject: Colonialism, SAX, Java, and Namespaces Message-ID: <015601be51eb$2e3d8260$82aedccf@ix.netcom.com> Simon wrote, >>We've been over this too many times, so I'll end here with a plea to spec developers and their explainers. Make your specifications as comprehensible as you can to as wide an audience as you can, so that all of us can spend more time writing implementations and less time debating what the specs mean.<< I would second that. It should be just as easy to write clear English as it is to write gobbledygook. There is absolutely no reason why a spec cannot be both precise and understandable. As no less a person than Albert Einstein said, "If you can't explain a proposition to an intelligent layman the proposition is probably flawed". Specs have both 'normative' and 'informative' parts to them. A little more work on the informative parts would be very useful. Frank Frank Boumphrey XML and style sheet info at http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (March) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Sat Feb 6 17:40:50 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:08:43 2004 Subject: Colonialism, SAX, Java, and Namespaces Message-ID: <001001be51f7$374b3440$c9a8a8c0@thing2> From: Frank Boumphrey <bckman@ix.netcom.com> >I would second that. It should be just as easy to write clear English as it >is to write gobbledygook. My own experience is that clarity requires considerable effort, but it's worth it. Bill From Mark.Birbeck at iedigital.net Sat Feb 6 17:45:22 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:43 2004 Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054976@SOHOS002> Bill la Forge wrote: > One of the big advantages of Java is that a small shop can > tackle significant projects. With clean specs, the same will > be true for XML. > Hands up, who has read the Java spec (and that's not the same as reading the nice clear instructions given to you by the people who wrote the compiler)? Mark Birbeck Managing Director Intra Extra Digital Ltd. 
39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net From simonstl at simonstl.com Sat Feb 6 17:55:49 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:43 2004 Subject: What Clean Specs Achieve In-Reply-To: <A26F84C9D8EDD111A102006097C4CD0D054976@SOHOS002> Message-ID: <199902061755.MAA23664@hesketh.net> At 05:53 PM 2/6/99 +0000, Mark Birbeck wrote: >Hands up, who has read the Java spec (and that's not the same as reading >the nice clear instructions given to you by the people who wrote the >compiler)? But is anyone here trying to _implement_ Java? Lots of folks here are indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink and XPointer, Namespaces, XSL, etc. It's not like we're only trying to _use_ them, as is the case with Java (or SQL, another example that's been bounced around.) Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com From James.Anderson at mecomnet.de Sat Feb 6 17:58:17 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:08:43 2004 Subject: COBOL XML parser? 
References: <199902041412.JAA10888@hesketh.net> <wkpv7ou55z.fsf@ifi.uio.no> Message-ID: <36BC83CE.39823C2@mecomnet.de> the version in the CL-HTTP release includes a dom, a dtd parser, and validation. Lars Marius Garshol wrote: > > ... See > > <URL:http://www.ai.mit.edu/projects/iiip/doc/cl-http/home-page.html> > > in the contrib-directory of the distribution. > > I've tinkered a bit with a DTD parser in Common Lisp in my spare time, > but since that is close to non-existent, so are the results. > From tbray at textuality.com Sat Feb 6 18:08:02 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:43 2004 Subject: What Clean Specs Achieve Message-ID: <3.0.32.19990206100726.00c02210@pop.intergate.bc.ca> At 12:58 PM 2/6/99 -0500, Simon St.Laurent wrote: > >But is anyone here trying to _implement_ Java? Lots of folks here are >indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink and XPointer, >Namespaces, XSL, etc. It's not like we're only trying to _use_ them, as is >the case with Java (or SQL, another example that's been bounced around.) Most of them seem to be succeeding. What should we conclude? -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Sat Feb 6 18:10:02 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:43 2004 Subject: Namespace Applications References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <36BAF974.6F87E679@prescod.net> <14010.65394.42870.480866@localhost.localdomain> Message-ID: <36BC80C6.B2950AF6@prescod.net> David Megginson wrote: > <email>paul@prescod.net</email> > <company>ISOGEN</company> > <a:origin>Canada</a:origin> > <b:origin>University of Waterloo</b:origin> > </member> > > ... > > The advantages of being able to come up with globally-unique names > should be obvious: Actually it isn't to me. The problem is now you have <a:origin> and <b:origin> element types but you don't know what to do with them. This is the point I keep harping about: processing expectations. Clearly <a:origin> is supposed to be mapped either to nothing or to <david:CountryOfOrigin> and <b:origin> is to be mapped either to nothing or to <david:GraduatedFrom>. It seems to me that information should not be let into my information system until it is expressed in terms that my information system is familiar with. What that means is that these things should be shipped with either architectural declarations or an XSL stylesheet that lets me locally reinterpret them. If all you want to do is make unknown elements "disappear" you can do that without namespaces also. > A second major advantage of namespaces is the ability to reuse > processing code. 
If I have written an event-handler/subroutine/method > to do something useful with an HTML <table> element, then I'd like to > reuse that for *every* document type that happens to use the HTML > table model, even if I don't know about the document type in advance. I can think of a variety of non-namespace ways to handle this (including the one you pointed out). Maybe I'm over-conservative but I will not advise my customers to depend on the namespace mechanism until there are facilities for validating and processing them intelligently. I mean even the most XSL-sophisticated XML editor/formatter would not recognize your namespace-prefixed HTML element if you changed the prefix because XSL itself does not handle it. I mean there is leading edge and there is bleeding edge.... -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey From simonstl at simonstl.com Sat Feb 6 18:18:18 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:43 2004 Subject: What Clean Specs Achieve In-Reply-To: <3.0.32.19990206100726.00c02210@pop.intergate.bc.ca> Message-ID: <199902061818.NAA23905@hesketh.net> At 10:07 AM 2/6/99 -0800, Tim Bray wrote: >At 12:58 PM 2/6/99 -0500, Simon St.Laurent wrote: >> >>But is anyone here trying to _implement_ Java? Lots of folks here are >>indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink and XPointer, >>Namespaces, XSL, etc. 
It's not like we're only trying to _use_ them, as is >>the case with Java (or SQL, another example that's been bounced around.) > >Most of them seem to be succeeding. What should we conclude? -Tim Most people who don't succeed, don't announce. We can't conclude anything. Judging from the volume of questions (and controversy) on this and its sibling lists (XSL-list, xlxp-dev), there's a lot of improvement that could be made. Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com From Mark.Birbeck at iedigital.net Sat Feb 6 18:36:09 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:43 2004 Subject: What Clean Specs Achieve Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054977@SOHOS002> Simon St.Laurent wrote: > At 05:53 PM 2/6/99 +0000, Mark Birbeck wrote: > >Hands up, who has read the Java spec (and that's not the > same as reading > >the nice clear instructions given to you by the people who wrote the > >compiler)? > > But is anyone here trying to _implement_ Java? Lots of folks here are > indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink > and XPointer, > Namespaces, XSL, etc. It's not like we're only trying to > _use_ them, as is > the case with Java (or SQL, another example that's been > bounced around.) And that's the point! If you want to write a Java compiler then get down with all the specs, as well as current theory on compiler writing, grammars, languages, OO, and so on - because you're going to need it! 
And if you want to write an XML parser, or XSL transformer, or your own DOM then sure, get with the nitty-gritty of the specifications, but you better also get clued up on language theory - I've seen Umberto Eco quoted in some places! - meta-information, mark-up languages, and all the rest of it. But don't tell me that someone using Office 2000 to write a letter to their bank manager needs to understand namespaces. And that is not elitist, colonialist or patronising - I credit people with more intelligence than wanting to understand quantum physics before they switch the TV on. The truth is that if people want to be at the leading edge of thought in *any* discipline, then they better get used to the idea that nothing worth understanding is ever easy. If it was, it would be 'common sense' and therefore nothing new. If someone really wants to write their own parser and they are having trouble understanding namespaces, they should seriously ask if they are ready for such an undertaking. As I keep saying, I'm not arguing for specs that are *more* difficult to understand - it's not exactly the most profound utterance to say 'clearer is better'. But at the same time I personally don't immediately try to blame someone else if I don't understand something, and I particularly don't think anyone *owes* me anything. If the spec writers are good enough to spare some time and answer some of my questions I am very grateful, but it is *not* their obligation. The reality is that I have already saved hundreds of hours of work for our company by using XML. The hours and hours I spent last year, reading and re-reading, trying to understand the implications of it all, have been recovered many, many times over by the speed with which we are now able to develop web sites with our new tools. I think I have more than had my money's worth from the 'gobbledy-gook' the spec writers have produced, and my suspicion is that many people out there have too. 
Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net From begeddov at jfinity.com Sat Feb 6 19:03:14 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:08:43 2004 Subject: Namespace Applications References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <36BAF974.6F87E679@prescod.net> <14010.65394.42870.480866@localhost.localdomain> <36BC80C6.B2950AF6@prescod.net> Message-ID: <36BC91A8.73022A96@jfinity.com> Paul Prescod wrote: > David Megginson wrote: > > ... > > > > The advantages of being able to come up with globally-unique names > > should be obvious: > > Actually it isn't to me. The problem is now you have <a:origin> and > <b:origin> element types but you don't know what to do with them. Naming something doesn't equate to being able to process it. As long as I can identify something I can always process it later, once I (or someone else) know more. Early vs. Late binding. Many "processing" scenarios are only concerned with forwarding data. An analogy is mail transfer agents and envelope vs. contents. The namespace qualified element name is the address. In David's example, there are two "origin" names. If they aren't qualified by the namespace, they won't be able to be delivered correctly. It's still up to the recipient to figure out what to do with the contents of the element once delivered. The recipient might be quite a few "hops" away from the sender. Giving something a unique name is an end in and of itself. 
You may only find something useful to do with it further down the timeline or processing pipeline. Late binding is a GOOD thing as long as the late bound agent gets all of the data needed to "process". Gabe Beged-Dov www.jfinity.com From James.Anderson at mecomnet.de Sat Feb 6 19:05:40 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:08:44 2004 Subject: What Clean Specs Achieve References: <199902061755.MAA23664@hesketh.net> Message-ID: <36BC93A3.2544EAC@mecomnet.de> Mark Birbeck wrote: >Hands up, who has read the Java spec (and that's not the same as reading >the nice clear instructions given to you by the people who wrote the >compiler)? i spent more time with the vm spec than with the language spec. i was more interested as to whether i could ever hope to be afforded things like closures and generic functions, the answer for which was to be found more likely in whether the vm precluded them, than in what the language designers thought of them. ... xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Sat Feb 6 19:25:30 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:44 2004 Subject: What Clean Specs Achieve In-Reply-To: <A26F84C9D8EDD111A102006097C4CD0D054977@SOHOS002> Message-ID: <199902061925.OAA24843@hesketh.net> At 06:44 PM 2/6/99 +0000, Mark Birbeck wrote: >But don't tell me that someone using Office 2000 to write a letter to >their bank manager needs to understand namespaces. You're right. They don't need to know, and most likely, they don't want to. Unfortunately, lots of people between that user and the namespace spec do need to know how to provide reliable and efficient namespace-aware processing. I'd like to think that you don't need to have Microsoft's resources to process a specification and produce reliable and efficient software. XML - even namespaces - isn't exactly rocket science. If you're willing to hit your head against the specs for a few months, which I do for a living, it becomes clear that none of this stuff is really as psychotically complicated as it seems at first. The cost of hitting your head against the specs can be prohibitive, however, if you're a small organization without extensive resources. More complex specs mean that you have to spend more resources comprehending those specs, wasting time that could have been better spent coding. Writing clean specs, clear specs, intelligible specs, helps everyone who needs to implement them, as well as the few users who straggle all the way to the specs to find out exactly why things work the way they do. 
It may not have a direct impact on the Office 2000 user, but it certainly could affect the choices they have among tools for processing and managing those documents beyond the Office 2000 software itself. Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com From tyler at infinet.com Sat Feb 6 19:54:57 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:44 2004 Subject: What Clean Specs Achieve References: <A26F84C9D8EDD111A102006097C4CD0D054977@SOHOS002> Message-ID: <36BC9DCA.C82CCC03@infinet.com> Mark Birbeck wrote: > But don't tell me that someone using Office 2000 to write a letter to > their bank manager needs to understand namespaces. And that is not > elitist, colonialist or patronising - I credit people with more > intelligence than wanting to understand quantum physics before they > switch the TV on. The truth is that if people want to be at the leading > edge of thought in *any* discipline, then they better get used to the > idea that nothing worth understanding is ever easy. If it was, it would > be 'common sense' and therefore nothing new. If someone really wants to > write their own parser and they are having trouble understanding > namespaces, they should seriously ask if they are ready for such an > undertaking. I don't consider XML even with namespaces to be anything revolutionary or bleeding edge. XML is supposedly a "standards" effort at creating a simple markup language for the web, not some technology exploration. 
This whole "Namespaces in XML" stuff seems unfortunately to be a technology exploration. Instead of trying something simple, the W3C decided to create something totally new. Now I am all for creativity and technical exploration, but certainly not in a standards effort. If XML is not going to be simple, why use XML at all when there supposedly are much more powerful and well-established standardized alternatives like SGML in existence that get the job done? Why should XML be just another reinvention of the wheel? I mean come on, markup should not be rocket science folks. I could create my own markup language in a small amount of time with all kinds of features, but it would not be standardized as many people would likely not agree with some of my ideas. So that is what standards are about: simplicity and consensus. Anything less and it is not a standard but a glorified document with lots of "expert" names on it. > As I keep saying, I'm not arguing for specs that are *more* difficult to > understand - it's not exactly the most profound utterance to say > 'clearer is better'. But at the same time I personally don't immediately > try to blame someone else if I don't understand something, and I > particularly don't think anyone *owes* me anything. If the spec writers > are good enough to spare some time and answer some of my questions I am > very grateful, but it is *not* their obligation. I guess this goes right down to the heart of the question of XML's intended audience. My impression was that XML was intended primarily as a simple markup language for the web. If XML is just a hyped up subset of SGML, then what good does it buy me or the majority of the web as a tool for the general user-audience? After all HTML is crap, but tons and tons of people with absolutely no programming experience can pick it up rather fast. I feel the same can be said of XML if you ignore namespaces. 
If "Namespaces in XML" are dropped from XSL and remain an optional layer on top of XML, then I would stop complaining right now as "Namespaces in XML" will die off on its own because only a very few people will want to use it. Tyler From Mark.Birbeck at iedigital.net Sat Feb 6 20:08:37 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:44 2004 Subject: What Clean Specs Achieve Message-ID: <A26F84C9D8EDD111A102006097C4CD0D05497C@SOHOS002> Tyler Baker wrote: > I guess this goes right down to the heart of the question of > XML's intended audience. My > impression was that XML was intended primarily as a simple > markup language for the web. If > XML is just a hyped up subset of SGML, then what good does it > buy me or the majority of the > web as a tool for the general user-audience. It buys you loads. Imagine you want to write a tool to do presentations. Imagine you want it viewed by lots of people all over the world. If you use a Shockwave DTD for your file format then it will be. Imagine you want to embed some data into your reports that can be used by Excel - graphed, pivoted, sorted, etc. - you could put that section of the data into Office 2000 format. Imagine you want to spell check all your invoices. You could pass your data to a general purpose spell checker that doesn't just understand Word, or understand OLE documents, but understands ANY document in the entire world! The productivity increases are just too big to take in. Hyped up? 
> After all HTML > is crap, but tons and tons of > people with absolutely no programming experience can pick it > up rather fast. Or slow if they go and read the HTML 4.0 spec. Most intelligent people would start with 'HTML in a day'. Many on this list patronisingly think that the average user is stupid enough to want to waste their time taking the long way round. They've got better things to do! Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net From Mark.Birbeck at iedigital.net Sat Feb 6 20:15:37 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:44 2004 Subject: What Clean Specs Achieve Message-ID: <A26F84C9D8EDD111A102006097C4CD0D05497D@SOHOS002> Simon St.Laurent wrote: > The cost of hitting your head against the specs can be prohibitive, > however, if you're a small organization without extensive > resources. More > complex specs mean that you have to spend more resources comprehending > those specs, wasting time that could have been better spent coding. Equally we could spend less time discussing the clarity of specs and more time trying to understand them! Anyway, the point I made in the last message was that my company has SAVED money by adopting XML - even after taking into account the considerable time we invested in understanding it and its implications - and we are a VERY small company. I don't know why I'm pursuing this, but here goes: there is a flaw in the logic you are following here. 
When technology is new and leading edge, it is generally going to be difficult to follow, because we do not have the intellectual reference points with which to understand it. There is no *absolute* measure of whether a spec is easy or difficult to understand, since it depends on the general culture of understanding. As each spec comes in, we find it easier to understand, not because it is better written, but because we are building on the previous layers of our knowledge. And then, when someone makes another big paradigm shift, we'll all be at sea again for a while. So, for you to contrast 'understanding' with 'productivity' is mistaken, because, firstly, if you do not understand the implications of a new technology, what are you going to code up anyway? Are we really worried about the ability of someone at home using a simple text editor to code up their video collection? If they wanted to do that they would be better off with a spreadsheet or writing a database app - and these tools should use XML as their native file formats. And secondly, as I said about our company, programmers do gain in the long run, because the new technology is more efficient than the old. Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Sat Feb 6 20:23:44 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:08:44 2004 Subject: ANN: xmlproc 0.60 Message-ID: <wk3e4jcmdz.fsf@ifi.uio.no> xmlproc is a validating XML parser written in Python, which supports SAX 1.0, XML namespaces, SGML Open catalog files and XCatalog 0.1. The parser can be used for both well-formedness parsing as well as validating parsing, and the DTD parser in the package can be used separately. xmlproc can report errors in Norwegian and English, and more languages can easily be added. xmlproc can be found at: <URL:http://www.stud.ifi.uio.no/~larsga/download/python/xml/xmlproc.html> --Lars M. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Sat Feb 6 20:46:39 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:44 2004 Subject: What Clean Specs Achieve References: <A26F84C9D8EDD111A102006097C4CD0D054977@SOHOS002> <36BC9DCA.C82CCC03@infinet.com> Message-ID: <36BCA969.C6D66CA9@manhattanproject.com> Tyler Baker wrote: > If XML is not going to be simple, why use XML at all when there > supposedly are much more powerful and well-established standardized > alternatives like SGML in existence that get the job done. This is my understanding: * XML is for information interchange on a large international scale. * SGML was primarily created for internal manuals and specifications. Computationally, SGML has irregular structures that *require* the DTD to be known before the file can be parsed. XML does not have this restriction; its syntax is independent of the "architecture" or DTD. More than that, this change has caused only a minimal "reduction" of its usefulness, i.e., it is harder for humans to directly author in the language. This simplification has drastically reduced its computational complexity, thus enabling it to be applied in many more contexts. Namespaces is the mechanism to keep all of those contexts from colliding with each other. Architectures is the mechanism that provides the mapping between those contexts. It is this greater applicability that is driving the need for namespaces. This *is* new. No existing document interchange "syntax" has gotten this far; I would say that the INI file format and the CSV file format would have been the runners-up to XML.
The complexity with which a computer program can express itself using the XML syntax is far greater than with CSV or INI syntax. Think of XML as a better "CSV" or "INI" format, not as a weakened SGML. This is the better metaphor. See, SGML and most other "exchange" mechanisms in the past have tied the "syntax" and the "semantics" together. XML is different. It clearly defines the syntax and leaves the "semantics" to the application of the technology. The "DTD" is optional. And Architectures allows you to have more than one DTD. This way each party to the communication can have their own interpretation of the exchange. Separating these two is a _huge_ leap forward in software systems. Anyway, this is my view of things. I hope it helps. :) Clark Evans From clark.evans at manhattanproject.com Sat Feb 6 20:48:50 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:44 2004 Subject: What is XML? (Was: Re: What Clean Specs Achieve) References: <A26F84C9D8EDD111A102006097C4CD0D054977@SOHOS002> <36BC9DCA.C82CCC03@infinet.com> Message-ID: <36BCA9A0.86A77C68@manhattanproject.com> Tyler Baker wrote: > If XML is not going to be simple, why use XML at all when there > supposedly are much more powerful and well-established standardized > alternatives like SGML in existence that get the job done. This is my understanding: * XML is for information interchange on a large international scale. * SGML was primarily created for internal manuals and specifications.
Computationally, SGML has irregular structures that *require* the DTD to be known before the file can be parsed. XML does not have this restriction; its syntax is independent of the "architecture" or DTD. More than that, this change has caused only a minimal "reduction" of its usefulness, i.e., it is harder for humans to directly author in the language. This simplification has drastically reduced its computational complexity, thus enabling it to be applied in many more contexts. Namespaces is the mechanism to keep all of those contexts from colliding with each other. Architectures is the mechanism that provides the mapping between those contexts. It is this greater applicability that is driving the need for namespaces. This *is* new. No existing document interchange "syntax" has gotten this far; I would say that the INI file format and the CSV file format would have been the runners-up to XML. The complexity with which a computer program can express itself using the XML syntax is far greater than with CSV or INI syntax. Think of XML as a better "CSV" or "INI" format, not as a weakened SGML. This is the better metaphor. See, SGML and most other "exchange" mechanisms in the past have tied the "syntax" and the "semantics" together. XML is different. It clearly defines the syntax and leaves the "semantics" to the application of the technology. The "DTD" is optional. And Architectures allows you to have more than one DTD. This way each party to the communication can have their own interpretation of the exchange. Separating these two is a _huge_ leap forward in software systems. Anyway, this is my view of things. I hope it helps. :) Clark Evans xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dan at holle.demon.co.uk Sat Feb 6 22:04:48 1999 From: dan at holle.demon.co.uk (Dan Holle) Date: Mon Jun 7 17:08:44 2004 Subject: DTD: Extra Complexity? In-Reply-To: <wk3e4jcmdz.fsf@ifi.uio.no> Message-ID: <000001be521b$991c0ea0$0400a8c0@dan.perrysfield> Many applications I've seen, and a few that I have created, don't validate the XML against a DTD. Is the DTD an extra step, inherited from SGML, that doesn't really fit XML? --dan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Sat Feb 6 22:21:11 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:44 2004 Subject: DTD: Extra Complexity? References: <000001be521b$991c0ea0$0400a8c0@dan.perrysfield> Message-ID: <36BCBF9A.C168DDC3@manhattanproject.com> Dan Holle wrote: > Many applications I've seen, and a few that I have > created, don't validate the XML against a DTD. > Is the DTD an extra step, inherited from SGML, > that doesn't really fit XML? XML defines the basic syntax (elements, attributes, entities) A DTD defines how the syntax is structured, i.e., the relationships among the elements and attributes. First, a DTD is optional. This will depend upon your context. 
If an XML stream has one and only one set of structural rules which define the document, then a single DTD is appropriate. Second, when you have many users of the XML stream, each with different needs, a single DTD doesn't work. You need many. This is what architectural forms allow. They superimpose the structure of one or more DTDs upon an XML stream. In this case, the DTD declaration is omitted, and another syntax is used to bind the DTD to the document. Third, if it is hard to define "when" the stream begins or ends (i.e. it's not a file), or if the DTD is implicitly understood at both the source and the destination of the message, then it is perfectly acceptable to omit the DTD. Does that help? :) Clark From sblackbu at erols.com Sat Feb 6 22:44:43 1999 From: sblackbu at erols.com (Samuel R. Blackburn) Date: Mon Jun 7 17:08:44 2004 Subject: Extra Complexity? Message-ID: <003101be5222$38f5ef80$01010101@sammy> It depends on how you use XML. If you use it to transfer data between applications then DTD's are completely useless. Their assumption that the world is flat is inappropriate for data applications. Also, the validations performed using DTD's don't buy you anything. The application must perform its own validation based upon some business rules. DTD's allow you to "validate" that a field contains a number but you can't use DTD's to "validate" that a field contains a prime number (that is an application layer validation). If you want to replace HTML (i.e. pretty text) then DTD's become useful.
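The application-layer check described above might look roughly like the following. This is only a sketch under assumptions: the <order> and <quantity> names are hypothetical and do not come from the thread, and Python's standard xml.etree.ElementTree is used as a non-validating parser, so both the "is it a number" test and the "is it prime" test live in application code.

```python
import xml.etree.ElementTree as ET


def is_prime(n: int) -> bool:
    """Trial division; adequate for small illustrative values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def validate_quantity(xml_text: str) -> bool:
    """Return True if the hypothetical <quantity> field holds a prime integer."""
    root = ET.fromstring(xml_text)      # raises ParseError if not well-formed
    field = root.findtext("quantity")   # None when the element is missing
    if field is None:
        return False
    try:
        value = int(field.strip())
    except ValueError:
        return False                    # content is not an integer at all
    return is_prime(value)
```

A document that a parser accepts as well-formed can still fail this check, which is the distinction between structural validation and business-rule validation being drawn here.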
HTH, Sam http://ourworld.compuserve.com/homepages/sam_blackburn -----Original Message----- From: Dan Holle <dan@holle.demon.co.uk> To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk> Date: Saturday, February 06, 1999 5:07 PM Subject: DTD: Extra Complexity? >Many applications I've seen, and a few that I have created, don't validate >the XML against a DTD. > >Is the DTD an extra step, inherited from SGML, that doesn't really fit XML? > >--dan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sat Feb 6 22:51:48 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:08:44 2004 Subject: Extra Complexity? In-Reply-To: <003101be5222$38f5ef80$01010101@sammy> from "Samuel R. Blackburn" at Feb 6, 99 05:44:13 pm Message-ID: <199902062342.SAA12353@locke.ccil.org> Samuel R. Blackburn scripsit: > It depends on how you use XML. If you use it to transfer > data between applications then DTD's are completely useless. Not so. DTDs make sure that container elements have the appropriate content, provide default information, and allow access to non-XML components in a standardized way. They also permit the representation of data that is not a tree, and even allow datatype declarations. Furthermore, they allow limited amounts of data reuse. > Their assumption that the world is flat is inappropriate for > data applications. What do you mean by "flat"? > Also, the validations performed using DTD's > don't buy you anything. The application must perform its own > validation based upon some business rules. DTD validation is often not sufficient, but that does not mean that it is not useful. 
> DTD's allow you > to "validate" that a field contains a number but you can't use > DTD's to "validate" that a field contains a prime number (that > is an application layer validation). In fact, XML DTDs do *not* allow you to validate that a "field" (whether than means an attribute value or #PCDATA content) is numeric. > If you want to replace HTML (i.e. pretty text) then DTD's become > useful. They are useful for far more than that. Documents are complex data, and simple data can also benefit from what is downright essential for complex data. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Sat Feb 6 23:06:19 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:08:44 2004 Subject: MDSAX beta2 includes filters for namespace and architectural forms Message-ID: <001201be5224$b199cea0$c9a8a8c0@thing2> MDSAX 1.0 beta 2 is now available at: http://www.jxml.com/mdsax/index.html This release includes a context markup language for defining filter structures. A number of filters are included with this release, among them o John Cowan's namespace filter and o David Megginson's XAF filter for Architectural Forms. This is Open Source Software: http://www.jxml.com/License.txt Bill xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jamesr at steptwo.com.au Sat Feb 6 23:47:34 1999 From: jamesr at steptwo.com.au (James Robertson) Date: Mon Jun 7 17:08:45 2004 Subject: Namespaces and interoperability (was Re: SAX, Java, and Namespaces) In-Reply-To: <8025670F.005165A0.00@mailhost.agora.co.uk> Message-ID: <4.1.19990207104017.00c3ebe0@steptwo.com.au> At 00:48 6/02/1999 , hpyle@agora.co.uk wrote: | Paul Prescod wrote, | > I don't think that "average developers" need to worry about namespaces. | ... | > If you are building a | > typical one-organization application then what are you doing with "other | > people's tags" in your documents? | | Maybe my perspective is a little warped. I'm working on healthcare | applications in the UK - interoperability will (sometime) become a big | deal. :-) Well, let me put it like this then: You are working out a structured interchange format. If you want structure, you'll need a DTD, otherwise there is absolutely no way of validating that you have got correct data. If you have a DTD, then namespaces are irrelevant: they simply don't work together in any meaningful way. Or to put it another way: allowing arbitrary nesting of someone else's tags doesn't achieve anything. You still have to know what to do with them when you receive them. In summary: get stuck into writing a DTD, and ignore this whole "namespaces" mess ... J ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jamesr at steptwo.com.au Sat Feb 6 23:51:50 1999 From: jamesr at steptwo.com.au (James Robertson) Date: Mon Jun 7 17:08:45 2004 Subject: Average developers (was Re: Colonialism, SAX, Java, and Namespaces) In-Reply-To: <199902051513.KAA01453@hesketh.net> References: <36BAF974.6F87E679@prescod.net> <8025670F.002EF9AD.00@mailhost.agora.co.uk> Message-ID: <4.1.19990207104529.00b6e260@steptwo.com.au> At 01:12 6/02/1999 , Simon St.Laurent wrote: | I would really appreciate if someday the people building W3C specs would | acknowledge that 'average developers' actually do have to worry about | namespaces, notations, parameter entities, include/ignore sections, and | trying to read the specs themselves. If they would then take that knowledge | and apply it to the specification-writing process, from start to finish, we | might be able to move forward with a lot less back-and-forth about what | these things are really supposed to mean. Simon, I would ask: _why_ does the average developer need all this complexity? I've been tackling jobs big and small, in the real world, for some time now. And I haven't bothered with any of it. Simple DTDs can take you a hell of a long way ... (Namespaces are simply meaningless in 99% of real-world apps, for without DTDs, you have nothing.) Just my $0.02 of course ... ;-) J ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623 xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jamesr at steptwo.com.au Sun Feb 7 00:00:09 1999 From: jamesr at steptwo.com.au (James Robertson) Date: Mon Jun 7 17:08:45 2004 Subject: RDF (was Re: Colonialism, SAX, Java, and Namespaces) In-Reply-To: <3.0.5.32.19990205085933.009b1a30@library.berkeley.edu> References: <36BB0D56.EDF427B0@prescod.net> <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> Message-ID: <4.1.19990207105247.00c75ca0@steptwo.com.au> At 02:59 6/02/1999 , Jerome McDonough wrote: | I would argue, actually, that most average developers are going to have | to deal with RDF. With the proliferation of information that all | organizations are having to cope with, metadata to keep track of the | information will be more and more essential. And RDF seems to be gaining | significant mindshare as the way to store metadata among the SGML/XML | crowd. If that trend continues, any developer working in medium-to-large | organizations is going to be dealing with RDF, and hence, namespaces. | This doesn't necessarily argue for any changes in the namespace spec, | but assumptions that the average developer isn't going to have to deal | with RDF (or any other metadata encoding standard) may be a bit rash. Talking about RDF ... At the last XML conference in Sydney, there was a speaker presenting this wonderful new standard called RDF. I took the opportunity to ask a few questions ... Since RDF uses namespaces, it obviously doesn't have a DTD. Now since the intention is to store a lot of data in RDF, how do we check that a RDF file is correct and meaningful? How do we validate it? (Yes, this is a naive question.) 
J ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Sun Feb 7 00:13:46 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:45 2004 Subject: Namespaces and DTDs was RE: Namespaces and interoperability (was Re: SAX, Java, and Namespaces) In-Reply-To: <4.1.19990207104017.00c3ebe0@steptwo.com.au> Message-ID: <000101be522e$12c53580$d3228018@jabr.ne.mediaone.net> James Robertson wrote: > > Well, let me put it like this then: > > You are working out a structured interchange format. If you > want structure, you'll need a DTD, otherwise there is absolutely > no way of validating that you have got correct data. > > If you have a DTD, then namespaces are irrelevant: they simply > don't work together in any meaningful way. > > Or to put it another way: allowing arbitrary nesting of someone > else's tags doesn't achieve anything. You still have to know > what to do with them when you receive them. > > In summary: get stuck into writing a DTD, and ignore this > whole "namespaces" mess ... > It has been previously discussed that we need to use the namespace prefix as declared in the DTD in order to validate XML documents with namespaces. Would explicit declaration of namespace URIs as an attribute allow *namespace aware validators* to correctly validate documents in namespace URI dependent way as opposed to a prefix dependent way. 
Use attribute declarations to declare namespaces:

<!ELEMENT example (a|b|aaa:p|xxx:y)>
<!ATTLIST example
    xmlns:aaa CDATA "urn:aaa"
    xmlns:xxx CDATA "urn:xxx">
<!ELEMENT xxx:y (#PCDATA)>
<!ELEMENT aaa:p (#PCDATA)>

so

<example>
  <aaa:p> whatever </aaa:p>
  <xxx:y> something else </xxx:y>
</example>

is valid as well as

<example xmlns:bbb="urn:aaa" xmlns:yyy="urn:xxx">
  <bbb:p> another </bbb:p>
  <yyy:y> example </yyy:y>
</example>

Is this what has been suggested before (and I am just getting it :-)? Jonathan Borden http://jabr.ne.mediaone.net From bckman at ix.netcom.com Sun Feb 7 02:03:07 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:08:45 2004 Subject: Extra Complexity? Message-ID: <00ed01be523d$d825ef40$87addccf@ix.netcom.com> >Many applications I've seen, and a few that I have created, don't validate >the XML against a DTD. > >Is the DTD an extra step, inherited from SGML, that doesn't really fit XML? > The real value of a DTD is as a check on the author, to make sure that the document has a consistent structure. If I am searching through a document using the DOM it is always nice to know that myDoc.firstChild.lastChild.firstChild.nodeValue will access the content of the same kind of element. If I am building a document by machine I will not always use a DTD, but if a fallible human has access to the document it should always be validated after any 'hand-rolled' change is made. That way I know that my document has a consistent structure.
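The point about fixed DOM paths can be illustrated with a small sketch. It assumes Python's standard xml.dom.minidom, which checks well-formedness only and does not validate against a DTD; the sample documents and function names are invented for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as XML; no DTD validation is performed."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


def last_child_text(xml_text: str):
    """Follow the fixed path firstChild.lastChild.firstChild.nodeValue."""
    doc = minidom.parseString(xml_text)
    return doc.firstChild.lastChild.firstChild.nodeValue


# Both documents are well-formed, but only the first has the structure
# that the fixed path silently relies on.
consistent = "<catalog><item>first</item><item>last</item></catalog>"
drifted = "<catalog><item>first</item><note><b>x</b></note></catalog>"
```

With these inputs, last_child_text(consistent) reaches the text of the last <item>, while last_child_text(drifted) lands on an element node, whose nodeValue is None. A well-formedness check accepts both documents; catching the second one up front is the job of validation.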
In summary when I am authoring a document I will always check its validity, but when displaying a document I will not check it for validity, only for well formedness. Frank Frank Boumphrey XML and style sheet info at Http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (March) ----- Original Message ----- From: Dan Holle <dan@holle.demon.co.uk> To: <xml-dev@ic.ac.uk> Sent: Saturday, February 06, 1999 4:56 PM Subject: DTD: Extra Complexity? xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Sun Feb 7 03:42:46 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:08:45 2004 Subject: A critique of XML-RPC Message-ID: <199902070433.XAA23875@locke.ccil.org> I have read the XML-RPC specification at http://www.scripting.com/frontier5/xml/code/rpc.html with great interest. I have the following issues with it: 1) There is no support for internationalization, despite the support present in XML. Since the MIME type is text/xml (as opposed to application/xml), the character encoding is US-ASCII unless overridden. No mention is made of support for character references like † (DOUBLE DAGGER). I would suggest supporting either "text/xml; charset='utf-8'". In addition, the references to "ASCII" in the spec should be changed. 2) There is no support for integers longer than 32 bits. I suggest allowing <int> values to be arbitrarily large, reserving the <i4> tag for 32-bit signed values. 
This would be an upward compatible extension for senders; receivers would have to check whether <int> data was in fact within the 32-bit signed range if backward compatibility is desired. 3) Floats are fairly useless because no rules exist for setting limits. I suggest that no receiver be allowed to reject a value which can be represented in 32-bit IEEE floats: magnitudes from roughly 1.4e-45 (2^-149) up to roughly 3.4e38, positive or negative, or zero. 4) The statement that "A string can be used to encode binary data" cannot be true, because arbitrary binary data cannot appear in XML documents: there is no way to represent bytes of value 0-8, 11-12, or 14-31. This is only a documentation consideration, as the base64 element does allow the representation of arbitrary binary data. 5) The very limited fault struct means that more complex exceptions such as Java, Python, or C++ support must be flattened into strings for return to the client, even though XML-RPC has ways of encoding more complex objects. I suggest allowing a struct within a fault object. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. From jborden at mediaone.net Sun Feb 7 04:12:32 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:45 2004 Subject: What Clean Specs Achieve In-Reply-To: <3.0.32.19990206100726.00c02210@pop.intergate.bc.ca> Message-ID: <000201be524f$72cec4c0$d3228018@jabr.ne.mediaone.net> We should conclude that simple is good. Efforts like the 'annotated XML spec' are a big help. In most specs a few well chosen examples greatly add to formalisms.
Questions and controversies are bound to happen, especially with new technology. The proof will not be the existence of XML parsers, rather applications which are adopted by the general public (e.g. html). specs alone can't take us there. > > > >But is anyone here trying to _implement_ Java? Lots of folks here are > >indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink > and XPointer, > >Namespaces, XSL, etc. It's not like we're only trying to _use_ > them, as is > >the case with Java (or SQL, another example that's been bounced > around.) > > Most of them seem to be succeeding. What should we conclude? -Tim > Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Sun Feb 7 07:07:22 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:08:45 2004 Subject: CORBA's not boring yet. / XML in an OS? In-Reply-To: Your message of "Thu, 04 Feb 1999 16:36:49 PST." <3.0.6.32.19990204163649.00f40e20@scripting.com> Message-ID: <199902070709.AAA07204@malatesta.local> > Gotta check it out! XML-RPC.. What CORBA wants to be. ;-> > > http://www.xmlrpc.com/ <DeLurkAndFlameOn> For one who recently made a clumsy swipe at another to the effect that they like to hear themselves talk, you are no slouch in the department of vacuous, self-promotional, and frequently off-topic cant. First I read here that CORBA is "heavyweight", and now that it is a mere shadow of XML-RPC. I guess one can say anything about an unrelated technology from the safe confines of an XML list. 
</DeLurkAndFlameOn> -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Sun Feb 7 07:40:59 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:08:45 2004 Subject: Component Markup Language Message-ID: <0bec01be526d$42222740$0300000a@othniel.cygnus.uwa.edu.au> >> >Has anyone thought about or worked on an markup language to describe a >> >User Interface in a platform independent way? About a year ago I toyed with the idea of doing an XML representation of Visual Basic forms. I didn't put too many cycles into it because I figured Microsoft would do it at some stage. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Sun Feb 7 08:04:15 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:08:45 2004 Subject: CORBA's not boring yet. / XML in an OS? Message-ID: <0d2701be5270$824dbca0$0300000a@othniel.cygnus.uwa.edu.au> >Anyhow, this naturally makes me wonder - could XML and related ideas >like XSL have a place in an operating system? Where would they fit in? 
>KDE and Gnome could be great playgrounds for trying something like this
>out.

For a while now, I've been thinking what an OS (or more likely shell) would look like if it took Unix's "everything as a file" to "everything as an XML element".

A system would be a single XML "uberdocument" (physically, separate entities, including unparsed entities for any non-XML files on the system, but logically one XML document). Applications (which would themselves be nodes in the element tree) would operate on other nodes in the element tree.

There would be an application, for example, that got mail via POP or IMAP, represented it in XML and then attached it at a particular point in the uberdocument. XSL could be used to sort the mail. XSL would also be used to view the mail.

It's XML for the sake of it, but I think it would be fun to try out.

James

From uche.ogbuji at fourthought.com Sun Feb 7 08:09:15 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:08:45 2004
Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces
In-Reply-To: Your message of "Sat, 06 Feb 1999 17:53:24 GMT." <A26F84C9D8EDD111A102006097C4CD0D054976@SOHOS002>
Message-ID: <199902070810.BAA07298@malatesta.local>

Bill la Forge:

> > One of the big advantages of Java is that a small shop can
> > tackle significant projects. With clean specs, the same will
> > be true for XML.

Mark Birbeck:

> Hands up, who has read the Java spec (and that's not the same as reading
> the nice clear instructions given to you by the people who wrote the
> compiler)?

I have.
Both the core language and the library specs, about three years ago (just about when 1.0 came out). They were very simple, clear, and even more useful than many of the textbooks I have since seen. They were certainly superior to any of the W3C specs I've read (and I've read HTML 4.0, XML 1.0, Namespaces, DOM 1.0, XLink, XPointer, and XSL).

In fact, despite what Paul says, I often try to learn new technologies by reading the specs directly. I have had varying success, but I don't intend to change my habits any time soon, despite my experiences with the W3C.

A lot of people appear to be ducking the fact that one _can_ write clear, concise and readable specs without sacrificing the necessary precision and formality: it just requires effort. I don't know whether the problems with W3C specs come from lack of inclination towards such effort, or lack of resources for such effort.

--
Uche Ogbuji FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From uche.ogbuji at fourthought.com Sun Feb 7 08:25:31 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:08:45 2004
Subject: What Clean Specs Achieve
In-Reply-To: Your message of "Sat, 06 Feb 1999 10:07:48 PST." <3.0.32.19990206100726.00c02210@pop.intergate.bc.ca>
Message-ID: <199902070827.BAA07349@malatesta.local>

> >But is anyone here trying to _implement_ Java? Lots of folks here are
> >indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink and XPointer,
> >Namespaces, XSL, etc. It's not like we're only trying to _use_ them, as is
> >the case with Java (or SQL, another example that's been bounced around.)
>
> Most of them seem to be succeeding. What should we conclude? -Tim

I have worked on teams implementing DOM, XSL and parts of XLL. Some users might grant that we have been "succeeding", but let me assure you that if so, it is despite the W3C specs, not because of them. DOM, especially is unforgivably inconsistent, incomplete, and unclear for a production-ready (1.0) specification. Then there is the matter that it blithely violates other specs, such as CORBA's IDL, making weak excuses all the while ("but we had to support ECMAScript, don't you know?")

Others here have spoken for me as to the confusing nature of the Namespaces spec, and what I find interesting is that you claim you were trying to make that spec simple. As another has said, clearly if such intelligent people are so baffled by a document whose scope and effect is meant to be simple, there are likely problems with the document.

--
Uche Ogbuji FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From oren at capella.co.il Sun Feb 7 08:55:34 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:08:45 2004
Subject: CORBA's not boring yet. / XML in an OS?
Message-ID: <01fe01be5276$e1d65050$5402a8c0@oren.capella.co.il>

>James Tauber <jtauber@jtauber.com> wrote:
>>For a while now, I've been thinking what an OS (or more likely shell) would
>>look like if it took Unix's "everything as a file" to "everything as an XML
>>element".
>
>
>Nice thought. Well, since we're talking about exotic systems, why don't you
>start with Plan9, which takes "everything is a file" much beyond a normal
>UNIX. Maybe you could "simply" wrap it with an XML driver layer... Probably
>not but it would be an interesting study. Anyone looking for an operating
>system related thesis subject? :-)
>
>Have fun,
>
> Oren Ben-Kiki

From jtauber at jtauber.com Sun Feb 7 10:39:00 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:08:45 2004
Subject: document vs non-document entity (was Re: CORBA's not boring yet. / XML in an OS?)
Message-ID: <0e2401be5286$1f471500$0300000a@othniel.cygnus.uwa.edu.au>

One (serious) issue that immediately arises out of my XML überdocument system is that there is a difference in XML between the document entity and other entities: the existence of the prolog.

If an XML document has an empty prolog, there is no problem because an XML document entity with an empty prolog is a legal external parsed entity. However, the moment you have an XML declaration or document type declaration, the entity can no longer act as an external parsed entity.
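
The constraint can be seen with any non-validating parser. The sketch below is an illustration only, assuming Python's standard xml.etree.ElementTree module and naive textual splicing as a stand-in for entity inclusion; the element names mail, subject and system are hypothetical and not taken from any real single-document design.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if text is a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A document entity with an empty prolog.
bare = "<mail><subject>hello</subject></mail>"

# The same content as stand-alone documents that do have a prolog.
with_decl = '<?xml version="1.0"?>' + bare
with_doctype = "<!DOCTYPE mail [<!ELEMENT mail ANY>]>" + bare


def splice(fragment):
    """Naive inclusion: drop the fragment's text under a parent element."""
    return "<system>" + fragment + "</system>"
```

Each of the three strings parses on its own, but only the one with the empty prolog still parses after splice(): an XML declaration or a document type declaration is not allowed in the middle of element content.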

James

--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net

From simonstl at simonstl.com Sun Feb 7 13:39:05 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:45 2004
Subject: document vs non-document entity (was Re: CORBA's not boring yet. / XML in an OS?)
In-Reply-To: <0e2401be5286$1f471500$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <199902071338.IAA13969@hesketh.net>

At 06:38 PM 2/7/99 +0800, James Tauber wrote:
>One (serious) issue that immediately arises out of my XML überdocument system
>is that there is a difference in XML between the document entity and other
>entities: the existence of the prolog.
>
>If an XML document has an empty prolog, there is no problem because an XML
>document entity with an empty prolog is a legal external parsed entity.
>However, the moment you have an XML declaration or document type
>declaration, the entity can no longer act as an external parsed entity.

I like the uberdocument OS concept very much (as those who suspect my XML-everywhere sympathies probably guessed.) The doctype declaration issue is a significant problem for a single-document model. I think there may be two ways out, however:

1) Hope that the schema spec uses some other mechanism to connect to documents (and document fragments) that isn't as disruptive.
2) Use a master document that has connections to other documents, using XLink to manage relationships within the set of documents. This way you can have all the prologs and DTDs you like, though you'd need another level of organization.

Hmm... fun idea! I've been thinking a lot about XML as resource files, which is where the thoughts above came from, but you've gone a few orders of magnitude past that.

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From dave at userland.com Sun Feb 7 13:41:17 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun 7 17:08:45 2004
Subject: A critique of XML-RPC
In-Reply-To: <199902070433.XAA23875@locke.ccil.org>
Message-ID: <3.0.6.32.19990207054425.00d411c0@scripting.com>

John, thanks for the interest and comments. I posted them on our discussion group so other people involved with XML-RPC can see them.

http://discuss.userland.com/msgReader$2736

Dave

From James.Anderson at mecomnet.de Sun Feb 7 13:59:35 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:45 2004
Subject: Namespaces and DTDs was RE: Namespaces and interoperability (was Re: SAX, Java, and Namespaces)
References: <000101be522e$12c53580$d3228018@jabr.ne.mediaone.net>
Message-ID: <36BD9D72.D5761541@mecomnet.de>

yes, it has been suggested before, and yes it does work. note that, as one is using the uri-bindings, one does not "use the namespace prefix". if the prefixes in the dtd are not identical with those in the document, then one need only establish scoping rules for the bindings effected by attlist declarations.

note that the approach is unorthodox. it violates one of the unspoken namespace tenets: that they do not affect the validity of a document. time will tell if this holds up.

please note, that your example happens to have an unbound "html" prefix. it is missing an attlist declaration to the effect of

<!ATTLIST html:p xmlns:html CDATA #DEFAULT "urn:aaa" >

i would also suggest that the attlist-based binding is to be preferred to deferring the disambiguation until the document is read: it does not preclude alternative prefixes while at the same time making it possible to determine whether the dtd is complete before beginning to parse/process the document.

Borden, Jonathan wrote:
>
> James Robertson wrote:
> >
> > If you have a DTD, then namespaces are irrelevant: they simply
> > don't work together in any meaningful way.

The "any" in this statement is in error.

> ...
>
> It has been previously discussed that we need to use the namespace prefix
> as declared in the DTD in order to validate XML documents with namespaces.
>
> Would explicit declaration of namespace URIs as an attribute allow
> *namespace aware validators* to correctly validate documents in a namespace
> URI dependent way as opposed to a prefix dependent way?
>
> Use attribute declarations to declare namespaces:
>
> <!ELEMENT example (a|b|aaa:p|xxx:y)>
> <!ATTLIST example
> xmlns:aaa CDATA #DEFAULT "urn:aaa"
> xmlns:xxx CDATA #DEFAULT "urn:xxx">
> <!ELEMENT xxx:y (#PCDATA)>
> <!ELEMENT html:p (#PCDATA)>
>

From James.Anderson at mecomnet.de Sun Feb 7 14:24:21 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:46 2004
Subject: document vs non-document entity (was Re: CORBA's not boring yet. / XML in an OS?)
References: <0e2401be5286$1f471500$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <36BDA33D.31D92896@mecomnet.de>

Which implies that references would be through links rather than entity references and that the focus is not the document, but the element, element-range, or whatever link semantic the OS implements. If the "standard" link is to the root element, then it would have the same effect as the file systems which distinguish resource from data forks or support file headers, but provide the "data" as the standard file content. It would be XML, just not literal XML.

James Tauber wrote:
>
> One (serious) issue that immediately arises out of my XML überdocument system
> is that there is a difference in XML between the document entity and other
> entities: the existence of the prolog.
>
> If an XML document has an empty prolog, there is no problem because an XML
> document entity with an empty prolog is a legal external parsed entity.
> However, the moment you have an XML declaration or document type
> declaration, the entity can no longer act as an external parsed entity.
>

From James.Anderson at mecomnet.de Sun Feb 7 15:39:18 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:46 2004
Subject: Namespace Applications
References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <36BAF974.6F87E679@prescod.net> <14010.65394.42870.480866@localhost.localdomain> <36BC80C6.B2950AF6@prescod.net>
Message-ID: <36BDB4C7.453E619A@mecomnet.de>

Paul Prescod wrote:
>
> David Megginson wrote:
> > <email>paul@prescod.net</email>
> > <company>ISOGEN</company>
> > <a:origin>Canada</a:origin>
> > <b:origin>University of Waterloo</b:origin>
> > </member>
> >
> > ...
> >
> > The advantages of being able to come up with globally-unique names
> > should be obvious:
>
> Actually it isn't to me. The problem is now you have <a:origin> and
> <b:origin> element types but you don't know what to do with them. This is
> the point I keep harping about: processing expectations.

So long as one remains in the "encoded XML" domain, the names are (modulo validation) academic.
XML is one thing, and one thing only: a *code* - that is a specified collection of symbols with a collection of permitted relations. Which means that there is no semantics to be concerned with. There should be no expectation that namespaces change that.

A semantics appears exactly at the point where the relation to the source and/or target domain is specified. Which cannot happen in the code itself. It happens at the point where a document becomes an *encoding* of something else. At this point one *does* know "what to do with them" because the domain to which one is decoding includes a process specification, or constraints on values, or whatever. At which point the question becomes "is the encoding complete and consistent?" At which point unambiguous names are essential.

The essential requirement is not that of correct names. (cf. the <david:CountryOfOrigin> point below.) Neither is it that of correct relations. (cf. the XSL/EA point below). It is that of unambiguous names.

> Clearly
> <a:origin> is supposed to be mapped either to nothing or to
> <david:CountryOfOrigin> and <b:origin> is to be mapped either to nothing
> or to <david:GraduatedFrom>. It seems to me that information should not be
> let into my information system until it is expressed in terms that my
> information system is familiar with.

Iff an application has a "{a=}origin", it can determine that the intended treatment is that of "{david}CountryOfOrigin". "XML Namespaces" permit the application to delay until the point of decoding the binding of an encoded name to a symbol in the application domain. Iff this is possible, then a *direct mapping* to the application domain is possible. Which is why similar facilities must be supported within enabling architectures, which interpose additional encodings as proxies for the eventual application domain.
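
A rough sketch of that late binding, assuming Python's standard xml.etree.ElementTree module, which reports each element under its expanded {uri}local name. The urn:example:... namespace URIs, the prefixes and the LOCAL_NAMES table are hypothetical and only illustrate the idea; they come from no real vocabulary.

```python
import xml.etree.ElementTree as ET

DOC_ONE = """<member xmlns:a="urn:example:geo" xmlns:b="urn:example:edu">
  <a:origin>Canada</a:origin>
  <b:origin>University of Waterloo</b:origin>
</member>"""

# The same information from a sender who chose different prefixes.
DOC_TWO = """<member xmlns:geo="urn:example:geo" xmlns:school="urn:example:edu">
  <geo:origin>Canada</geo:origin>
  <school:origin>University of Waterloo</school:origin>
</member>"""

# The receiving application's own binding of expanded names to its symbols.
LOCAL_NAMES = {
    "{urn:example:geo}origin": "CountryOfOrigin",
    "{urn:example:edu}origin": "GraduatedFrom",
}


def decode(text):
    """Map each recognised child element to the application's name for it."""
    root = ET.fromstring(text)
    return {
        LOCAL_NAMES[child.tag]: child.text
        for child in root
        if child.tag in LOCAL_NAMES
    }
```

Both documents decode to the same dictionary: the prefixes differ, but the binding is made on the unambiguous expanded names, and it is made by the receiving process.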

>
> What that means is that these things should be shipped with either
> architectural declarations or an XSL stylesheet that lets me locally
> reinterpret them.

Iff there is a mismatch between the immediate decoded domain and the eventual application domain, then there is more to do. I suggest, however, that this is not intrinsically a decoding problem and, as such, has only indirectly to do with namespaces. As noted, it can be handled with mechanisms such as those provided by XSL or enabling architectures, but need not be.

Note also, that to literally "ship" documents with a stylesheet begs the question. The eventual binding is to be specified by the receiving process, not the sending process. To augment them at the receiving end is a possibility, but overlooks that there are alternative application mechanisms to do much of that for which one would use architectural forms or XSL transformations.

In cases where the application environment does not provide the necessary mechanisms, I'd put my money on XSL, as it supports more extensive transformations than enabling architectures. The combination (namespaces + XSL) also provides a better separation of the two mechanisms.

The combination is unavoidable. Unless one is willing to express all XSL patterns in a context-dependent form, then unambiguous names remain a requirement and some sort of namespace mechanism is a prerequisite to guaranteed matches. If one is willing to limit patterns to context-dependent expressions and to presume complete documents as reference targets, then one would succeed at pushing the problem back to that of ensuring "unique document type names".

From paul at prescod.net Sun Feb 7 15:39:21 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:46 2004
Subject: "Clean Specs"
References: <199902070827.BAA07349@malatesta.local>
Message-ID: <36BDAF1D.A908F8EF@prescod.net>

What is a "clean spec?"

People in this discussion are mixing together a variety of things that I do not consider the same.

Uche Ogbuji wrote:
> I have worked on teams implementing DOM, XSL and parts of XLL. Some users
> might grant that we have been "succeeding", but let me assure you that if so,
> it is despite the W3C specs, not because of them. DOM, especially is
> unforgivably inconsistent, incomplete, and unclear for a production-ready
> (1.0) specification.

There is not a SINGLE person on this mailing list that would say that it is right to create specifications that are inconsistent, incomplete or unclear. It is beyond doubt that specifications in the XML family, including XML itself, have these problems. The question is how to avoid that?

* some people say that what the spec needs is "more English". But much of the problem with the namespaces specification comes from ambiguity in the English.

* some people say that we need "more non-normative text" but once again, it is a non-normative appendix that is confusing people.

What almost nobody has suggested is that we need more formal notation. Now if we look at James Clark's "clarifying" document what we see is *less* English text and *more* notation.

As David Megginson has pointed out, when a spec. invents a notation to explain its abstract concepts the spec.
becomes temporarily harder to read because you have to learn the notation before the specification. This will turn people off. I remember that algebraic notation turned me off in my first year math classes. But in the *long run* it makes life easier for everyone. Implementors have precise definitions of what the hell they are supposed to implement. End-users get software they can use. People reading the specification for their own education and edification will understand it better -- if they persevere through the task of learning the notation.

I know that W3C spec. writers are under pressure to use more normative English and less normative notation. This is not a vote for "clean" specifications. It is a vote for messy, hard-to-implement ones. Where is Dan Connolly when you need him?

Let me point out: the CSS and HTML specifications are easy to read because everyone who wants to read them already understands the basic concepts of web pages and layout. They are concrete implementations of ideas we already understand. The same goes for Java. (and yes, I read the Java specification, but AFTER I already knew Java) I wonder if anyone on this list has ever learned a language that was radically different from what they already knew through the language specification: Scheme, Prolog, APL?

Technical writing is damn hard. When you do it right, you must make certain decisions about ordering of concepts and revelations that are the exact opposite of what you do in writing a specification. When you write a spec., you need to present things in the order of fundamental building blocks to high level concepts. In technical writing you will put your students to sleep if you zoom in on details before explaining the general framework. I've told people who want to read the DSSSL specification to start at the back and work forwards. Of course once they become implementors then they read it the opposite way around.

Life is hard for implementors.
I'd rather be forced to implement the entire suite of XML specifications over HTML/CSS 2. That isn't a slight against HTML/CSS, it's just an attempt to put things in perspective.

Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey

From paul at prescod.net Sun Feb 7 15:53:16 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:46 2004
Subject: document vs non-document entity (was Re: CORBA's not boring yet. / XML in an OS?)
References: <0e2401be5286$1f471500$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <36BDB21C.36D0EE83@prescod.net>

James Tauber wrote:
>
> One (serious) issue that immediately arises out of my XML überdocument system
> is that there is a difference in XML between the document entity and other
> entities: the existence of the prolog.

This is a serious problem inherited from SGML. It is my opinion that instead of namespace declarations being done through attributes they should have been just another form of PI-based declaration. Then in version 2.0 of XML, all declarations should have been made "localizable." In particular, there should be a way to declare internal text entities, unparsed entities and attribute defaults locally.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey

From paul at prescod.net Sun Feb 7 15:57:05 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:46 2004
Subject: Extra Complexity?
References: <003101be5222$38f5ef80$01010101@sammy>
Message-ID: <36BDB39B.97433230@prescod.net>

"Samuel R. Blackburn" wrote:
>
> It depends on how you use XML. If you use it to transfer
> data between applications then DTD's are completely useless.
> Their assumption that the world is flat is inappropriate for
> data applications.

DTDs model tree structures.

> Also, the validations performed using DTD's
> don't buy you anything. The application must perform its own
> validation based upon some business rules. DTD's allow you
> to "validate" that a field contains a number but you can't use
> DTD's to "validate" that a field contains a prime number (that
> is an application layer validation).

So what you are saying is that because DTDs do not do everything you could dream of, they should not be used for anything? A more effective point of view is: "use DTDs for the 90% of the problem that they CAN solve, but expect to require application-specific logic for the 10% that they CANNOT."

When "XML Schemas" come about, that number may shift to 95% and 5% but the basic principles will not. I'm fairly confident that there will be no provision for validating that a number is prime, unless it is through an "escape mechanism." DTDs also have an "escape mechanism".
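
The 90%/10% split can be sketched roughly as follows. This is an illustration only, assuming Python; since the standard library parser does not validate against a DTD, structurally_valid is a hand-written stand-in for what a validating parser would report for the hypothetical declarations in its docstring, and the order and quantity element names are invented for the example.

```python
import xml.etree.ElementTree as ET


def structurally_valid(root):
    """Stand-in for DTD validation of the hypothetical declarations
    <!ELEMENT order (quantity)> and <!ELEMENT quantity (#PCDATA)>."""
    children = list(root)
    return (
        root.tag == "order"
        and len(children) == 1
        and children[0].tag == "quantity"
        and len(children[0]) == 0
    )


def is_prime(n):
    """Trial division; adequate for small illustrative values."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def accept(text):
    """Run the structural layer first, then the application-specific rule."""
    root = ET.fromstring(text)
    if not structurally_valid(root):
        return "rejected: structure"
    # The structural check guarantees that root[0] is the <quantity> element.
    try:
        value = int((root[0].text or "").strip())
    except ValueError:
        return "rejected: business rule"
    if not is_prime(value):
        return "rejected: business rule"
    return "accepted"
```

Because the structural layer runs first, the business rule never has to ask whether the element it needs is present; it only has to decide whether the value is acceptable.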

Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey

From paul at prescod.net Sun Feb 7 16:00:31 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:46 2004
Subject: RDF (was Re: Colonialism, SAX, Java, and Namespaces)
References: <36BB0D56.EDF427B0@prescod.net> <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <4.1.19990207105247.00c75ca0@steptwo.com.au>
Message-ID: <36BDB47A.35407949@prescod.net>

James Robertson wrote:
>
>
> Now since the intention is to store a lot of data in
> RDF, how do we check that an RDF file is correct
> and meaningful?
>
> How do we validate it?

There is such a thing as an "RDF schema". It checks an almost disjoint set of things from what a DTD would check. If you don't care about things being in a particular order and want to treat linking and containment as the same thing, then you should use RDF schemas. If you do care about the order of things, you should use DTDs. You could also use them together to check different things.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey

From jborden at mediaone.net Sun Feb 7 16:32:39 1999
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:08:46 2004
Subject: "Clean Specs"
In-Reply-To: <36BDAF1D.A908F8EF@prescod.net>
Message-ID: <000001be52b6$d54f1d90$d3228018@jabr.ne.mediaone.net>

Paul raises some important points here. The question is what the best way to write specs is, best being efficient and clear. The namespace spec is often used as an example of an unclear spec. I do not agree with this. When I read the namespace spec it is very clear. My misunderstandings of the spec have come almost entirely from *everything everyone else has been saying* about namespaces rather than the spec itself. Being somewhat of a neophyte, I assumed that there existed concepts about namespaces that I had missed from the very simple spec.

The simplicity of the namespace spec was driven home by the examples James Clark has given. To me this is an excellent example of why a spec should be written in 3 parts: a formal specification (BNF or whatever), a plain English description, lots of good examples.

Here is my plain English distillation of namespaces:

Namespaces are a way to distinguish element and attribute names using URIs so that element and attribute names can be reused without causing problems for software.

<< formal description here >>

For example ...

Paul is correct that formalisms obfuscate specs for neophytes. The problem is that English is ambiguous. For those of us who can read neither formalisms nor English, a good set of examples is needed. The best RFCs contain these three parts.
Jonathan Borden http://jabr.ne.mediaone.net > > > What is a "clean spec?" > > People in this discussion are mixing together a variety of things that I > do not consider the same. > > Uche Ogbuji wrote: > > I have worked on teams implementing DOM, XSL and parts of XLL. > Some users > > might grant that we have been "succeeding", but let me assure > you that if so, > > it is despite the W3C specs, not because of them. DOM, especially is > > unforgivably inconsistent, incomplete, and unclear for a > production-ready > > (1.0) specification. > > There is not a SINGLE person on this mailing list that would say that it > is right to create specifcations that are inconsistent, incomplete or > unclear. It is beyond doubt that specifications in the XML family, > including XML itself, have these problems. The question is how to avoid > that? > > * some people say that what the spec needs is "more English". But much of > the problem with the namespaces specification comes from ambiguity in the > English. > > * some people say that we need "more non-normative text" but once again, > it is a non-normative appendix that is confusing people. > > What almost nobody has suggested is that we need more formal notation. Now > if we look at James Clark's "clarifying" document what we see is *less* > English text and *more* notation. > > As David Megginson has pointed out, when a spec. invents a notation to > explain its abstract concepts the spec. becomes temporarily harder to read > because you have to learn the notation before the specification. This will > turn people off. I remember that Algebraic notation turned me off in my > first year math classes. But in the *long run* it makes life easier for > everyone. Implementors have precise definitions of what the hell they are > supposed to implement. End-users get software they can use. 
> People reading the specification for their own education and edification will understand it better -- if they persevere through the task of learning the notation.
>
> I know that W3C spec. writers are under pressure to use more normative English and less normative notation. This is not a vote for "clean" specifications. It is a vote for messy, hard-to-implement ones. Where is Dan Connolly when you need him?

From jmg at trivida.com Sun Feb 7 16:35:58 1999
From: jmg at trivida.com (Jeff Greif)
Date: Mon Jun 7 17:08:46 2004
Subject: Extra Complexity?
References: <003101be5222$38f5ef80$01010101@sammy>
Message-ID: <36BDC065.2667EC43@trivida.com>

This seems a bit too thorough a rejection of DTDs for content validation. When writing the application's validation code for the prime number, it helps to be able to expect a certain kind of element or attribute in a particular place as the thing to be checked for primality, and to know that a large subset of the other prerequisites for this kind of check has already been satisfied.

I find the parser's validation against a DTD valuable in my application's code, which is doing just this sort of thing. It saves lots of checking for various errors that cannot happen if the document is known to be valid according to the DTD.

Jeff

"Samuel R. Blackburn" wrote:

> It depends on how you use XML. If you use it to transfer data between applications then DTDs are completely useless. Their assumption that the world is flat is inappropriate for data applications.
> Also, the validations performed using DTDs don't buy you anything. The application must perform its own validation based upon some business rules. DTDs allow you to "validate" that a field contains a number but you can't use DTDs to "validate" that a field contains a prime number (that is an application layer validation).

From simonstl at simonstl.com Sun Feb 7 16:58:19 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:46 2004
Subject: "Clean Specs"
In-Reply-To: <36BDAF1D.A908F8EF@prescod.net>
References: <199902070827.BAA07349@malatesta.local>
Message-ID: <199902071658.LAA16110@hesketh.net>

At 09:19 AM 2/7/99 -0600, Paul Prescod wrote:
>What is a "clean spec?"

Good question, one with many answers.

>People in this discussion are mixing together a variety of things that I do not consider the same.
>
>Uche Ogbuji wrote:
>> I have worked on teams implementing DOM, XSL and parts of XLL. Some users might grant that we have been "succeeding", but let me assure you that if so, it is despite the W3C specs, not because of them. DOM, especially is unforgivably inconsistent, incomplete, and unclear for a production-ready (1.0) specification.
>
>There is not a SINGLE person on this mailing list that would say that it is right to create specifications that are inconsistent, incomplete or unclear. It is beyond doubt that specifications in the XML family, including XML itself, have these problems. The question is how to avoid that?

Glad to hear that we're actually arguing toward a common purpose.
> * some people say that what the spec needs is "more English". But much of the problem with the namespaces specification comes from ambiguity in the English.
>
> * some people say that we need "more non-normative text" but once again, it is a non-normative appendix that is confusing people.
>
>What almost nobody has suggested is that we need more formal notation. Now if we look at James Clark's "clarifying" document what we see is *less* English text and *more* notation.

What we need is text that conforms to the formal notation, and lots of examples that are checked hard against both the text and the formalisms for accuracy. This is extraordinarily difficult to do right, but it results in standards that are both solid and understandable.

To a considerable extent this demands that spec writers see themselves as implementors - and probably that they include implementors in the process, especially implementors who don't have prior experience in whatever standards provided the foundation of the current project. The story for XML 1.0 of using Peter Murray-Rust as a canary is a good one, though I'd like to see more of that in the actual group of people writing the specs, not just the surrounding groups.

One other point: putting explanatory text and examples into 'non-normative' sections is pretty much equivalent to abandoning them, announcing that the writers didn't put the effort to ensure that they were completely conformant with the formal specification and therefore equally normative. Making the _entire_ spec normative, both the formalisms and the explanatory text, seems like a more likely approach to succeed, though again it requires (considerable) additional effort on the part of the specification writers.

>As David Megginson has pointed out, when a spec. invents a notation to explain its abstract concepts the spec. becomes temporarily harder to read because you have to learn the notation before the specification. This will turn people off.
>I remember that Algebraic notation turned me off in my first year math classes. But in the *long run* it makes life easier for everyone. Implementors have precise definitions of what the hell they are supposed to implement. End-users get software they can use. People reading the specification for their own education and edification will understand it better -- if they persevere through the task of learning the notation.

Are there good resources describing formal notations? I tend to find that the people creating formal notations do so because they're fed up with text, and then do a lousy job describing the notations. All they've done is make the learning curve a hell of a lot steeper, while creating a subculture of experts fluent in that notation who then feel empowered to use that notation to create all kinds of wonderful things that the rest of us can't figure out.

I would _never_ write an XML book that started by explaining EBNF and then used that to walk through the spec; I wouldn't expect readers to stick beyond the first few paragraphs. (It has been done before, and there are readers who enjoy that, but in my limited experience they're a very small minority.)

>I know that W3C spec. writers are under pressure to use more normative English and less normative notation. This is not a vote for "clean" specifications. It is a vote for messy, hard-to-implement ones. Where is Dan Connolly when you need him?

It requires making the effort to align the text and the formalisms. Now that we have XML as a foundation, that task may become simpler; I don't find it very difficult to describe the structures of XML documents and XML DTDs in plain English, even to those without extensive XML experience. Making XML a common formal foundation is a good way out of this mess for lots of standards - though that, of course, requires undertaking the project Paul has described above of making the foundation formalism intelligible.
XML won't work as a common foundation for everything, so we'll still be stuck with the other vocabularies for a lot of projects. Spec writers will still have an obligation to write clear text.

>Let me point out: the CSS and HTML specifications are easy to read because everyone who wants to read them already understands the basic concepts of web pages and layout. They are concrete implementations of ideas we already understand. The same goes for Java. (and yes, I read the Java specification, but AFTER I already knew Java)

CSS was actually a huge jump for a lot of people in the HTML world; I'd say it was harder than you make it out to be. CSS2 is easy to read because it recognizes that difficulty (more than CSS1 did) and addresses it directly. I've never found the HTML specs to be of much use, though the latest drafts look much more promising.

>I wonder if anyone on this list has ever learned a language that was radically different from what they already knew through the language specification: Scheme, Prolog, APL?

Never tried radically different. I found the ECMA-262 spec for ECMAScript/JavaScript much more useful for syntax than the books I was attempting to use at the time, though that too required some effort.

>Technical writing is damn hard. When you do it right, you must make certain decisions about ordering of concepts and revelations that are the exact opposite of what you do in writing a specification. When you write a spec., you need to present things in the order of fundamental building blocks to high level concepts. In technical writing you will put your students to sleep if you zoom in on details before explaining the general framework.

It's definitely difficult. I've spent about 100 hours this week documenting assorted XML material to finish a book, and I've seen specs with varying degrees of clarity, from super-sharp to non-existent. (I've also been dealing with numerous cases where text and formalism don't match at all.
That's especially fun.)

I'm not convinced that technical writing needs to run backward from the way that a specification must be written, however. Starting out with building blocks and constructing grander concepts is perhaps a nice theoretical model, but I don't think I've seen too many specs that follow that construct precisely. Typically, specs at least present the big picture to give readers some idea of why they're bothering, then focus in on details, and then zoom back out. Abstracts and introductions are at least as important to writing a good spec as they are to writing a book, as are glossaries.

>Life is hard for implementors. I'd rather be forced to implement the entire suite of XML specifications over HTML/CSS 2. That isn't a slight against HTML/CSS, it's just an attempt to put things in perspective.

Actually, once they get through the HTML-in-XML stuff, I think implementing HTML/CSS2 would be a heck of a lot easier than implementing XML/XLink/XPointer/XSL/fragments/XQL/etc. That's your choice, of course.

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From srn at techno.com Sun Feb 7 17:09:46 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun 7 17:08:46 2004
Subject: DTD: Extra Complexity?
In-Reply-To: <000001be521b$991c0ea0$0400a8c0@dan.perrysfield>
References: <000001be521b$991c0ea0$0400a8c0@dan.perrysfield>
Message-ID: <199902071708.LAA01289@bruno.techno.com>

[Dan Holle:]

> Many applications I've seen, and a few that I have created, don't validate the XML against a DTD.

> Is the DTD an extra step, inherited from SGML, that doesn't really fit XML?

True, there is often no necessity for an application to validate incoming data against a DTD. BUT:

DTDs are essential in marketplaces in which open information interchange occurs in a multivendor environment. In this kind of situation, DTDs serve as contracts between information-creating application developers and information-consuming application developers. When information fails to be interchanged successfully (i.e., when things don't work), and if there's no DTD contract, then there's no way to tell who's responsible to make what changes in order to restore successful open information interchange. Software maintenance costs spiral upward, customers get confused and unhappy, and the atmosphere in the marketplace is poisoned. With a DTD contract in place, the reliability of open information interchange is much higher, and the entry cost to software vendors of serving any given marketplace is much more predictable.

Now about the necessity of applications performing validation. You're right, it's not strictly necessary. BUT:

* Vendors of information-consuming software often wish to incorporate validation of incoming information into their applications in order to deflect blame away from themselves when things don't work right and it's not their fault. It is impossible to create software that understands just any old gobbledygook that happens to come along.
* Similarly, vendors of information-creating software often wish to incorporate validation of outgoing information into their applications in order to demonstrate that, if some information-consuming application chokes on it, it is the fault of the information-consuming application and not the fault of the information-creating application. It is impossible to create information that can be understood by just any old information-processing application.

So, back to your question: "Is the DTD an extra step, inherited from SGML, that doesn't really fit XML?" The answer is that DTDs are an essential, non-optional feature of XML whenever XML is used in a marketplace of open information interchange that is served by multiple software vendors.

Given that XML is supposed to be used on the Web, DTDs are certainly essential to XML's widespread success in enhancing opportunities for open information interchange in a multivendor context.

On the other hand, open DTDs are not good news for the ultra-dominant software vendors. Efforts to create industry-standard DTDs are the strategic Manhattan Projects of the ongoing struggle between information owners and software vendors for control of huge libraries of valuable commercial information. Eventually, the information owners are going to win, leaving them in a position to buy their software from the lowest bidder. The Silicon Integration Initiative's ECIX project [www.si2.org] springs to mind as an example.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com http://www.techno.com ftp.techno.com
voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152)
3615 Tanner Lane
Richardson, Texas 75082-2618 USA
From jtauber at jtauber.com Sun Feb 7 17:26:15 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:08:46 2004
Subject: document vs non-document entity (was Re: CORBA's not boring yet. / XML in an OS?)
Message-ID: <002101be52be$e92480a0$0300000a@othniel.cygnus.uwa.edu.au>

Paul Prescod:
>This is a serious problem inherited from SGML. It is my opinion that instead of namespace declarations being done through attributes they should have been just another form of PI-based declaration. Then in version 2.0 of XML, all declarations should have been made "localizable." In particular, there should be a way to declare internal text entities, unparsed entities and attribute defaults locally.

Well this is actually where my thinking has been going. I've been thinking about whether localizable declarations might be achievable using an *attribute* mechanism following on from how namespaces ended up.

Eg

<Root>
  ...
  <SomeElement xmldecl="local-decl.pen">
    <!-- markup declarations in local-decl.pen apply -->
  </SomeElement>
  ...
</Root>

The xmldecl attribute takes the value of the URI of an external parameter entity.

To avoid name clashes, it might be an idea to have xmldecl:foo="local-decl.pen" and then qualify entities with the prefix foo (ie &foo:SomeEntity;)

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia
From paul at prescod.net Sun Feb 7 17:37:01 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:46 2004
Subject: XSL and namespaces
Message-ID: <36BDCC48.741BC90C@prescod.net>

Someone pointed out that XSL does in fact work on the URI-replaced namespace, not the prefixed name. I missed this because I was looking for more sophisticated features. I'm not sure yet if the support provided is "sufficient." It seems like we should have a mechanism for dynamically assembling stylesheets BASED ON the namespaces used in a document. But maybe that's not necessary.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey

From paul at prescod.net Sun Feb 7 17:38:28 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:46 2004
Subject: DTD: Extra Complexity?
References: <000001be521b$991c0ea0$0400a8c0@dan.perrysfield> <199902071708.LAA01289@bruno.techno.com>
Message-ID: <36BDCD80.180338D8@prescod.net>

"Steven R. Newcomb" wrote:
>
> Given that XML is supposed to be used on the Web, DTDs are certainly essential to XML's widespread success in enhancing opportunities for open information interchange in a multivendor context.
>
> On the other hand, open DTDs are not good news for the ultra-dominant software vendors.

I noticed something ominous in one of the Office 2000 products. It was generating a schema on the fly for the data it was sending. "We know this data conforms to this standard because we invented the standard to fit the data." If the person on the other end can't see the schema *in advance* then they can't code software that works with the data. That's pretty much as bad as not having a schema at all.

I don't want to say that generating a schema on the fly is ALWAYS bad. The point should be that you should specify as much as you can in a static schema, (or other formal specification). "Inline extensions" should be a last resort and should be explicitly catered for in the static schema.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey
From tbray at textuality.com Sun Feb 7 19:06:13 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:47 2004
Subject: "Clean Specs"
Message-ID: <3.0.32.19990207110448.00c1d440@pop.intergate.bc.ca>

At 12:01 PM 2/7/99 -0500, Simon St.Laurent wrote:
>To a considerable extent this demands that spec writers see themselves as implementors - and probably that they include implementors in the process, especially implementors who don't have prior experience in whatever standards provided the foundation of the current project. The story for XML 1.0 of using Peter Murray-Rust as a canary is a good one, though I'd like to see more of that in the actual group of people writing the specs, not just the surrounding groups.

There's an interesting lesson lurking in there. The original XML WG included implementors of Author/Editor, HoTMetaL, groff, SP, Jade, Pat/Lector, IBMIDDOC, Dynatext, Mosaic, and Grif. So Simon's (implied) theory that the specs would have been better, had the authoring group included implementors, stands on shaky ground. A couple of hypotheses that might explain this:

- being an implementor is not a particularly strong qualification for writing specs
- being a core-technology implementor, rather than a solution builder or system integrator, is not a particularly strong qualification for writing specs

This group is notably and vocally dissatisfied with the specs; I am watching with attention for concrete suggestions as to how to make future specs better - the one premise that seems to get consensus, in this group at least, is "more examples". (Hmm, the namespace spec has tons).
As regards the namespace spec, another hypothesis:

- it might be easier to understand for people coming in from outside who aren't carrying around a bunch of SGML-derived expectations.

And given that XML actually seems to be succeeding quite vigorously in the marketplace, a final hypothesis:

- there is little relation between the presentation quality of a spec, in and of itself, and whether the world will welcome it (presumably we *do* believe that the quality of the design being spec'd does have some such relationship)

My own personal take - the XML spec has holes that I'm more deeply aware of than anyone in the world, but it's a bearable compromise given the combined resource/time/political constraints - and the real-world problems with XML are not the spec itself, but SGML-derived bogosities like parameter entities.

And as regards the namespace spec, I think that some people on this list are substantially full of shit, and are wilfully refusing to see how simple it is because it does not meet their own design prejudices. I think that spec is *way* better than the XML spec.

Having said all that, people who write specs always have to try to do a better job next time, so this recent discourse is very very useful. -Tim
From simonstl at simonstl.com Sun Feb 7 19:23:16 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:47 2004
Subject: "Clean Specs"
In-Reply-To: <3.0.32.19990207110448.00c1d440@pop.intergate.bc.ca>
Message-ID: <199902071922.OAA17850@hesketh.net>

At 11:05 AM 2/7/99 -0800, you wrote:
>At 12:01 PM 2/7/99 -0500, Simon St.Laurent wrote:
>>To a considerable extent this demands that spec writers see themselves as implementors - and probably that they include implementors in the process, especially implementors who don't have prior experience in whatever standards provided the foundation of the current project. The story for XML 1.0 of using Peter Murray-Rust as a canary is a good one, though I'd like to see more of that in the actual group of people writing the specs, not just the surrounding groups.
>
>There's an interesting lesson lurking in there. The original XML WG included implementors of Author/Editor, HoTMetaL, groff, SP, Jade, Pat/Lector, IBMIDDOC, Dynatext, Mosaic, and Grif. So Simon's (implied) theory that the specs would have been better, had the authoring group included implementors, stands on shaky ground. A couple of hypotheses that might explain this:

I think you missed a key point I made in the above paragraph - that the inclusion of implementors _without_ prior experience in the material being worked on is important.
I'd argue quite heartily that the many years of implementation experience the WG brought to the table put them at a _disadvantage_ in writing specs that might be read by an audience without that level of prior experience: the non-SGML audience XML was supposedly to reach.

>the one premise that seems to get consensus, in this group at least, is "more examples". (Hmm, the namespace spec has tons).

Make them normative, document them heavily, and build on them in layers (step 1, then step 2 - same example, growing more complex), and you'll hear loud cheers from this corner.

>- [namespaces] might be easier to understand for people coming in from outside who aren't carrying around a bunch of SGML-derived expectations.

I wish, but I think a lot of people are getting stuck figuring out what XML's SGML-derived expectations are and then piling namespaces on top of it.

>And as regards the namespace spec, I think that some people on this list are substantially full of shit, and are wilfully refusing to see how simple it is because it does not meet their own design prejudices. I think that spec is *way* better than the XML spec.

I don't think that's going to get you a lot of positive feedback from anyone, on this spec or future specs. Namespaces better than XML 1.0? Maybe if it didn't have to layer on top of XML 1.0. I'm looking forward to a future revision of XML where this can get cleanly integrated, and maybe that'll be the one worth judging.

Simon St.Laurent
XML: A Primer / Building XML Applications (March)
Sharing Bandwidth / Cookies
http://www.simonstl.com
From eliot at dns.isogen.com Sun Feb 7 19:32:54 1999
From: eliot at dns.isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 17:08:47 2004
Subject: "Clean Specs"
In-Reply-To: <3.0.32.19990207110448.00c1d440@pop.intergate.bc.ca>
Message-ID: <3.0.5.32.19990207133327.00a67790@amati.techno.com>

At 11:05 AM 2/7/99 -0800, Tim Bray wrote:
>My own personal take - the XML spec has holes that I'm more deeply aware of than anyone in the world, but it's a bearable compromise given the combined resource/time/political constraints - and the real-world problems with XML are not the spec itself, but SGML-derived bogosities like parameter entities.

I'm with Tim here: writing specs, and in particular standards, is difficult at best. And, like anything, it is a human activity, which means it will be flawed, by definition. The editors had a very hard job and did, IMNSHO, an excellent job within the constraints they had. And XML does suffer from things in SGML that are objectively not well designed (entities in SGML are just a mess generally).

The XML spec had a particularly tough row to hoe: it had to be both as rigorous as possible and as easy-to-use as possible. These two things are generally not compatible, especially for a general audience. As Paul P. points out, if you don't already understand the formal notation of the spec, understanding the spec completely and unambiguously is hard. That the editors got close is a testament to their skill and tenacity.

Compare, for example, the DSSSL spec, which is, in my opinion, one of the better technical standards I've worked with.
James Clark relied heavily on formal notation and limited the prose, especially non-normative prose. This makes the DSSSL spec hard to learn DSSSL from initially (you've got to learn two new syntaxes: the formal production syntax and the expression language syntax), but once you learn them, there is little ambiguity.

Cheers,

E.
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004
www.isogen.com
</Address>

From tbray at textuality.com Sun Feb 7 20:44:20 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:47 2004
Subject: "Clean Specs"
Message-ID: <3.0.32.19990207124243.00ba7eb0@pop.intergate.bc.ca>

At 02:25 PM 2/7/99 -0500, Simon St.Laurent wrote:
>I think you missed a key point I made in the above paragraph - that the inclusion of implementors _without_ prior experience in the material being worked on is important.

Actually, here's another piece of evidence that supports your position. Recently, in an effort to refresh my memory as to why something in XML 1.0 was the way it was, I went back and reviewed a whole bunch of the XML SIG mailing list correspondence from back in '97 while the important issues were being thrashed out.
Things such as white-space handling and public identifiers and so on were being debated passionately by people who were obviously deeply erudite as to the pros and cons of the issues; in the thousands and thousands of emails, though, there is almost no input as to the structure and presentation of the XML spec - everyone was too focused on the content. I think there's *definitely* a lesson there. -Tim

From jamesr at steptwo.com.au Sun Feb 7 22:18:37 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun 7 17:08:47 2004
Subject: "Clean Specs"
In-Reply-To: <36BDAF1D.A908F8EF@prescod.net>
References: <199902070827.BAA07349@malatesta.local>
Message-ID: <4.1.19990208091225.00bbbca0@steptwo.com.au>

At 01:19 8/02/1999 , Paul Prescod wrote:
| What is a "clean spec?"
|
| People in this discussion are mixing together a variety of things that I do not consider the same.
|
| Uche Ogbuji wrote:
| > I have worked on teams implementing DOM, XSL and parts of XLL. Some users might grant that we have been "succeeding", but let me assure you that if so, it is despite the W3C specs, not because of them. DOM, especially is unforgivably inconsistent, incomplete, and unclear for a production-ready (1.0) specification.
|
| There is not a SINGLE person on this mailing list that would say that it is right to create specifications that are inconsistent, incomplete or unclear. It is beyond doubt that specifications in the XML family, including XML itself, have these problems. The question is how to avoid that?
Well, has anyone considered employing real, professional technical authors to write the specifications? Instead of (I presume) the progenitors of the ideas. I mean, the inventors of a standard are gurus in their technical area, but rarely would they have professional skills in communication and education ... So I say we ask some real experts ... in writing. Cheers, J ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623 From jamesr at steptwo.com.au Sun Feb 7 22:23:20 1999 From: jamesr at steptwo.com.au (James Robertson) Date: Mon Jun 7 17:08:47 2004 Subject: RDF (was Re: Colonialism, SAX, Java, and Namespaces) In-Reply-To: <36BDB47A.35407949@prescod.net> References: <36BB0D56.EDF427B0@prescod.net> <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <4.1.19990207105247.00c75ca0@steptwo.com.au> Message-ID: <4.1.19990208091613.00b80100@steptwo.com.au> At 01:42 8/02/1999 , Paul Prescod wrote: | James Robertson wrote: | > | > | > Now since the intention is to store a lot of data in | > RDF, how do we check that a RDF file is correct | > and meaningful? | > | > How do we validate it? | | There is such a thing as an "RDF schema". It checks an almost disjoint set | of things from what a DTD would check. If you don't care about things | being in a particular order and want to treat linking and containment as | the same thing, then you should use RDF schemas.
If you do care about the | order of things, you should use DTDs. You could also use them together to | check different things. Maybe I just haven't woken up enough this morning yet, but I'm not quite sure what all of this means. So I'll restate the question: * I write an RDF document by hand. * I receive an RDF document from someone else. How do I know it makes sense? How do I know the right RDF tags have been used, and in a meaningful way? In other words, since an RDF document doesn't have a DTD (since it uses namespaces), how do I check it's right? J ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623 From jamesr at steptwo.com.au Sun Feb 7 22:34:49 1999 From: jamesr at steptwo.com.au (James Robertson) Date: Mon Jun 7 17:08:47 2004 Subject: "Clean Specs" In-Reply-To: <3.0.32.19990207124243.00ba7eb0@pop.intergate.bc.ca> Message-ID: <4.1.19990208092325.00c4bc30@steptwo.com.au> At 06:43 8/02/1999 , Tim Bray wrote: | Actually, here's another piece of evidence that supports your position. | Recently, in an effort to refresh my memory as to why something in XML 1.0 | was the way it was, I went back and reviewed a whole bunch of the | XML SIG mailing list correspondence from back in '97 while the important | issues were being thrashed out.
Things such as white-space handling | and public identifiers and so on were being debated passionately by | people who were obviously deeply erudite as to the pros and cons of the | issues; in the thousands and thousands of emails, though, there is | almost no input as to the structure and presentation of the XML spec - | everyone was too focused on the content. I think there's *definitely* | a lesson there. -Tim I think I'll make my point more strongly here, as I think Tim highlights this issue in his post. As a whole, we can be considered the "professionals" or "specialists" in the areas of XML, etc. However, there is a whole, much larger, group of professional people who are the experts in: writing stuff that makes sense. They are called technical authors. The XML WG wouldn't think of seriously discussing say, OS filesystem design. I would ask why they think they have the skills and qualifications to write a large and complex technical document, so that it makes sense? The specification document is read by the whole world, and is essentially the only medium that the WG uses to communicate its ideas widely. So it's a document worth doing properly, not something that should be written by technical specialists and amateur authors. Now, this point of view is very new to me as well, and it's come about due to having worked for the last year with a group of technical authors. And they really do know their stuff ... So from now on, I'm going to stick to designing and coding, and consult the experts when I need to write serious documentation. Just some thoughts for the day, J ------------------------- James Robertson Step Two Designs Pty Ltd SGML, XML & HTML Consultancy http://www.steptwo.com.au/ jamesr@steptwo.com.au "Beyond the Idea" ACN 081 019 623
From liamquin at interlog.com Sun Feb 7 23:19:06 1999 From: liamquin at interlog.com (Liam R. E. Quin) Date: Mon Jun 7 17:08:47 2004 Subject: "Clean Specs" In-Reply-To: <3.0.32.19990207110448.00c1d440@pop.intergate.bc.ca> Message-ID: <Pine.BSI.3.96r.990207180946.29147B-100000@shell1.interlog.com> Quoth Tim Bray <tbray@textuality.com> [...] > This group is notably and vocally dissatisfied with the specs, I > am watching with attention for concrete suggestions as to how > to make future specs better - the one premise that seems to get > consensus, in this group at least, is "more examples". (Hmm, the > namespace spec has tons). Having just had a book on the XML spec published... I will say that the spec is one of the better that I have seen, but that there is still a lot of scope for improvement. It was a really difficult process, and it's easy to forget that at the start, few who were involved in it expected it to generate the interest and fervour that it has. There was no expectation that any aspect of SGML would be changed, either, although in the end SGML *was* changed. But there is a problem with expectations, too. The spec is not an introduction. How many people here learnt C++ by reading the ANSI spec? How many people here learned to tune a radio by reading the international specifications for radio frequency allocation? Next time, develop tutorials alongside the specs perhaps.
> My own personal take - the XML spec has holes that I'm more deeply > aware of than anyone in the world, but it's a bearable compromise > given the combined resource/time/political constraints - and the > real-world problems with XML are not the spec itself, but SGML-derived > bogosities like parameter entities. I agree 100%. Ian and I found lots of minor holes last year (asked about some here, sent some to the email address for corrections to the spec), but they are almost all minor. There are a couple of places where the wording says the opposite of what's intended, but where no sane person is likely to read it literally, and that's OK too. The namespace spec was pretty good last time I looked. I think it raises whole big looming purple-and-green questions which I shall try to write about separatesomely. Lee -- Liam Quin, GroveWare Inc., Toronto; The barefoot programmer l i a m q u i n at i n t e r l o g dot c o m http://www.interlog.com/~liamquin/ SGML/XML/Unix/C consulting and programming From murray at muzmo.com Sun Feb 7 23:33:40 1999 From: murray at muzmo.com (Murray Maloney) Date: Mon Jun 7 17:08:47 2004 Subject: "Clean Specs" In-Reply-To: <4.1.19990208092325.00c4bc30@steptwo.com.au> References: <3.0.32.19990207124243.00ba7eb0@pop.intergate.bc.ca> Message-ID: <3.0.1.32.19990207183427.00bdd698@pop.uunet.ca> James has reiterated a suggestion that has been put forward by others in the past -- notably Dan Connolly and Tim Berners-Lee.
Sadly, professional writers are not always easy to come by, and when they are they usually expect compensation -- gotta make a living, dontcha know. A professional technical writer was hired by W3C for the HTML 4.0 specification. The XML WG did have at least one professional writer and one professional editor on board, and many of those who were not professional writers/editors could not fairly be categorized as amateurs given their academic and professional histories. In my experience as a technical writer, I have discovered that the act of explaining the application of a technical design can, in many cases, lay bare its flaws. Cooperation between writers and designers often results in designs that take the end-user into account. Regards, Murray From eliot at dns.isogen.com Sun Feb 7 23:40:13 1999 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:08:47 2004 Subject: "Clean Specs" In-Reply-To: <4.1.19990208092325.00c4bc30@steptwo.com.au> References: <3.0.32.19990207124243.00ba7eb0@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19990207174048.007c69c0@amati.techno.com> At 09:31 AM 2/8/99 +1000, James Robertson wrote: >The XML WG wouldn't think of seriously discussing say, OS filesystem >design. I would ask why they think they have the >skills and qualifications to write a large and complex technical >document, so that it makes sense? > >The specification document is read by the whole world, and is essentially >the only medium that the WG uses to communicate its ideas >widely.
So it's a document worth doing properly, not something that >should be written by technical specialists and amateur authors. The XML WG was an all-volunteer project, as are most standards efforts. Those of us who participated did so primarily as a personal commitment, not as something our employers (those of us who have them) pay us to do. Standards development is not a commercial process--there is no budget from which technical writers might be hired. The W3C only administers, it does not fund. Same for ISO. Some national bodies do fund some standards development (BSI, the British Standards Institute), but that funding will tend to be used to support the technologists developing the standard and not writers crafting the words. So while it's true that most, if not all, specifications could benefit from professional writers, it usually isn't an option for standards developers. About the most you can hope for is editors who are literate and capable. In the case of Tim and Michael, I think we got about as much literateness and capability as we could want. Note too that writing good standards is a very specialized art--it's very close to writing legal documents. Not all technical writers are skilled at it, even if they are good at other forms of technical writing. Cheers, E. -- <Address HyTime=bibloc> W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com </Address> From eliot at dns.isogen.com Sun Feb 7 23:45:25 1999 From: eliot at dns.isogen.com (W.
Eliot Kimber) Date: Mon Jun 7 17:08:47 2004 Subject: "Clean Specs" In-Reply-To: <3.0.1.32.19990207183427.00bdd698@pop.uunet.ca> References: <4.1.19990208092325.00c4bc30@steptwo.com.au> <3.0.32.19990207124243.00ba7eb0@pop.intergate.bc.ca> Message-ID: <3.0.5.32.19990207174651.00a8b680@amati.techno.com> At 06:34 PM 2/7/99 -0500, Murray Maloney wrote: > A professional technical writer was hired >by W3C for the HTML 4.0 specification. Didn't know that--puts the lie to what I just posted.... >The XML WG did have at least one professional writer and one >professional editor on board, and many of those who were not >professional writers/editors could not fairly be categorized >as amateurs given their academic and professional histories. > >In my experience as a technical writer, I have discovered that >the act of explaining the application of a technical design >can, in many cases, lay bare its flaws. Cooperation between >writers and designers often results in designs that take the >end-user into account. I agree--in writing my (as yet unfinished) book on HyTime (www.drmacro.com/bookrev) I uncovered numerous problems with the original standard, which helped tremendously during the writing of the 2nd edition of the standard. While reference implementations are always good, "reference tutorials" are at least as valuable. It would be nice if we always had enough resources and/or time to write the spec and the tutorial in parallel. Of course, often the person writing the spec is also the one to write the tutorial and there's only so many hours in a day.... Cheers, E. -- <Address HyTime=bibloc> W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com </Address>
From marcelo at mds.rmit.edu.au Sun Feb 7 23:47:49 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:08:47 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <no.id>; from Paul Prescod on Thu, Feb 04, 1999 at 11:14:32PM -0600 Message-ID: <19990208104624.A4862@io.mds.rmit.edu.au> On Thu, Feb 04, 1999 at 11:14:32PM -0600, Paul Prescod wrote: > "Borden, Jonathan" wrote: > > ...[lots and lots] ... and then ... So let's compare apples to > > apples. Which data access API do you wish to use? > > I don't want an API. I want layers of objects. At the bottom level I > have either storage objects or records in a table. At the higher > layers I have abstractions over those objects. Using objects I can > build a 1-tier, 2-tier, 3-tier or n-tier system. I can have as many > levels of business rules and clients and servers as I need. I can > also query objects and build object schemas using standardized, > multiply-implemented languages. I'm not clear on what you mean here, Paul. When one builds a multi-tier solution in the relational world, the lowest layer is always collections of tuples, usually SQL. Raw, and "dumb". To leverage this "stupid" layer, one will usually build a business layer above this, which may implement objects. This way you have the best of both worlds. You get a nice object oriented layer on top to talk to, and an industrial strength, robust repository underneath. Your comments give me the impression that this is unacceptable to you in the XML/hierarchical universe. You don't want DOM at any level. You insist on going straight to objects.
It is not even good enough to build an object layer on top of the DOM layer. I find this a little implausible and hence am certain that you had something else in mind. Is it rather that you simply don't care what the underlying API is, that you are only interested in what happens at the object level? Cheers, Marcelo -- http://www.simdb.com/~marcelo/ From donpark at quake.net Mon Feb 8 00:02:30 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:08:47 2004 Subject: "Clean Specs" Message-ID: <005a01be52f6$218ace90$2ee044c6@arcot-main> Tim Bray wrote: >And as regards the namespace spec, I think that some people on this >list are substantially full of shit, and are wilfully refusing to see >how simple it is because it does not meet their own design prejudices. >I think that spec is *way* better than the XML spec. Tim, I am disappointed that you chose to follow this line of thinking. I happened to like the XML spec and found the Namespaces spec confusing. Is this because I am substantially full of shit? Maybe so. Even ignoring your seemingly habitual verbal abuse, I find it disconcerting that you seem to treat genuine feedback from the readers of your specs as some sort of legal arguments or logic problems which you are obligated to punch holes in. Please listen to what we are trying to say rather than how we say it. If some of us seem to criticize your works harshly, it is because we would like to believe that we are not entirely stupid.
If we do not understand you fully, should we think that we are too stupid to understand or that you are speaking too high above our heads? Most of the people who took part in this thread of discussion took the middle view: we took as much blame as possible until we were full of it and then shoveled the rest to you. Are we full of it? You are damn right we are. I am pissed because I am full of it. If you are having trouble holding your share, that is too bad. Not really at my best, Don Park Docuverse From bckman at ix.netcom.com Mon Feb 8 00:32:59 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:08:47 2004 Subject: "Clean Specs" Message-ID: <003701be52fa$6efe1700$82acdccf@ix.netcom.com> Murray wrote <<In my experience as a technical writer, I have discovered that the act of explaining the application of a technical design can, in many cases, lay bare its flaws. >> I agree with this. The writer of the spec. presumably knows what they want to say, but often doesn't say it, and it's only in the process of explaining the spec that the flaw is laid bare. For example when I tried to explain this I realised that it was in fact meaningless. <<Non-standard extensions, when used, may change the behavior of functions or facilities defined by this recommendation. In such cases, the implementation documentation must define an environment in which a document can be parsed and rendered with the behavior specified by this recommendation.>> Cooperation between >writers and designers often results in designs that take the >end-user into account.
And that surely is the whole idea! Frank Frank Boumphrey XML and style sheet info at Http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (March) ----- Original Message ----- From: Murray Maloney <murray@muzmo.com> To: XML-Dev Mailing list <xml-dev@ic.ac.uk> Sent: Sunday, February 07, 1999 6:34 PM Subject: Re: "Clean Specs" >James' has reiterated a suggestion that has been put forward >by others in the past -- notably Dan Connolly and Tim Berners-Lee. >Sadly, professional writers are not always easy to come by, and >when they are they usually expect compensation -- gotta make a >living, dontcha know. A professional technical writer was hired >by W3C for the HTML 4.0 specification. > >The XML WG did have at least one professional writer and one >professional editor on board, and many of those who were not >professional writers/editors could not fairly be categorized >as amateurs given their academic and professional histories. > >In my experience as a technical writer, I have discovered that >the act of explaining the application of a technical design >can, in many cases, lay bare its flaws. Cooperation between >writers and designers often results in designs that take the >end-user into account. > >Regards, > >Murray
From paul at prescod.net Mon Feb 8 00:53:45 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:47 2004 Subject: RDF (was Re: Colonialism, SAX, Java, and Namespaces) References: <36BB0D56.EDF427B0@prescod.net> <8025670F.002EF9AD.00@mailhost.agora.co.uk> <199902051513.KAA01453@hesketh.net> <4.1.19990207105247.00c75ca0@steptwo.com.au> <4.1.19990208091613.00b80100@steptwo.com.au> Message-ID: <370BEC9C.2E4DF38B@prescod.net> James Robertson wrote: > > So I'll restate the question: > > * I write an RDF document by hand. > * I receive an RDF document from someone else. > > How do I know it makes sense? How do I know the right > RDF tags have been used, and in a meaningful way? > > In otherwords, since an RDF document doesn't have a DTD > (since it uses namespaces), how do I check it's right? Well you are wrong that an RDF document can't have a DTD. It can. But then you are checking the XML-validity of the document, not the RDF-validity. If the question is how to check an arbitrary RDF document that uses a particular RDF-described "vocabulary" in the same way that you can check HTML using the HTML DTD, you check the RDF document against its RDF schema. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey
From tbray at textuality.com Mon Feb 8 01:13:57 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:47 2004 Subject: "Clean Specs" Message-ID: <3.0.32.19990207171050.00c2c870@pop.intergate.bc.ca> At 04:01 PM 2/7/99 -0800, Don Park wrote: >I am disappointed that you chose to follow this line of thinking. ... >Not really at my best, Yeah, I wasn't either yesterday. Apologies to everyone here for being nasty. Everyone in the XML activity takes the opinions expressed here very seriously. I gotta say, though, the XML spec now feels to me like a ramshackle compromise that only just barely works, while namespaces do one simple thing and nail it down tight as a drum. Here's how bad it is; I'm working on an Annotated namespaces, just like annotated XML - and I'm having serious difficulty figuring out what to write. -Tim From jborden at mediaone.net Mon Feb 8 01:13:58 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:47 2004 Subject: XML e-mail viewer was RE: CORBA's not boring yet. / XML in an OS?
In-Reply-To: <0d2701be5270$824dbca0$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <000401be52ff$a4cc8ec0$d3228018@jabr.ne.mediaone.net> Thanks to everyone who has sent e-mail to test-xmtp@jabr.ne.mediaone.net ... in response to many requests to provide the "viewer" for this e-mail including in-line images I have put up another demo application which I call the XMTP-Board. To use it, send an e-mail including attached jpeg/gif images to: mailto:xmtp-board@jabr.ne.mediaone.net To view the e-mail using IE5b2 and XSL, browse to: http://jabr.ne.mediaone.net/xmtp/listxmtp.asp?User=xmtp-board I haven't yet had time to test my XSL with XT or LotusXSL and given the current state of XSL incarnations, I expect that this will only work with IE5 ... the intention of course is to conform to legal XSL as this becomes defined. This is the system as described in http://www.xml.com/xml/pub/98/12/consult98a.html Jonathan Borden http://jabr.ne.mediaone.net > > There would be an application, for example, that got mail via POP or IMAP, > represented it in XML and then attached it at a particular point in the > uberdocument. XSL could be used to sort the mail. XSL would also > be used to > view the mail. > > It's XML for the sake of it, but I think it would be fun to try out. > > James
From murray at muzmo.com Mon Feb 8 01:44:15 1999 From: murray at muzmo.com (Murray Maloney) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" In-Reply-To: <3.0.32.19990207171050.00c2c870@pop.intergate.bc.ca> Message-ID: <3.0.1.32.19990207204449.00a4cb9c@pop.uunet.ca> At 08:13 PM 2/7/99 -0500, Tim Bray wrote: >I gotta say, though, the XML spec now feels to me like a ramshackle >compromise that only just barely works, while namespaces do one simple >thing and nail it down tight as a drum. Here's how bad it is; I'm >working on an Annotated namespaces, just like annotated XML - and I'm >having serious difficulty figuring out what to write. -Tim Tim, I trust that the namespace spec makes perfect sense to you. But it does not make sense to me and many others. Take a step back and look/listen again. I sense that you are so close to it that you just don't see the monstrous chasms that others do. Either that "tight drum" is discordant, or the rest of us are marching to a different one. Either way, it has not been successful in getting us in lock step. From my point of view, namespaces is a "ramshackle compromise". I give you credit for appreciating that many/most of us aren't stupid. So take a clue from the reported 3:1 anti- vs pro-namespace email ratio. I am telling you, as a friend, colleague and a technical writer, that I cannot satisfactorily explain "Namespaces in XML". If you're having trouble deciding what to write in your annotated namespace spec, you might start with answers to the questions that have been posed on this list. Regards, Murray Murray Maloney, Esq.
Phone: (905) 509-9120 Muzmo Communication Inc. Fax: (905) 509-8637 671 Cowan Circle Email: murray@muzmo.com Pickering, Ontario Email: murray@yuri.org Canada, L1W 3K6 From jborden at mediaone.net Mon Feb 8 02:03:27 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" In-Reply-To: <3.0.1.32.19990207204449.00a4cb9c@pop.uunet.ca> Message-ID: <000901be5306$838bd0c0$d3228018@jabr.ne.mediaone.net> Murray Maloney wrote: > > Tim, I trust that the namespace spec makes perfect sense to you. > But it does not make sense to me and many others. Take a step > back and look/listen again. I sense that you are so close to it > that you just don't see the monstrous chasms that others do. ...> > From my point of view, namespaces is a "ramshackle compromise". > I give you credit for appreciating that many/most of us aren't stupid. > Is the problem here the content of the namespace spec or the way it is written? It seems to me that people who feel that the spec is a compromise, or wish the spec specified something different, are unhappy. If you really don't understand the spec, how can you claim it is a ramshackle compromise? I suspect that you understand it and don't like it. The problem doesn't appear to be the way the spec is written and a professional writer won't solve that problem. Jonathan Borden http://jabr.ne.mediaone.net
From uche.ogbuji at fourthought.com Mon Feb 8 02:19:52 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" In-Reply-To: Your message of "Sun, 07 Feb 1999 11:05:50 PST." <3.0.32.19990207110448.00c1d440@pop.intergate.bc.ca> Message-ID: <199902080222.TAA01694@malatesta.local> > This group is notably and vocally dissatisfied with the specs, I > am watching with attention for concrete suggestions as to how > to make future specs better - the one premise that seems to get > consensus, in this group at least, is "more examples". (Hmm, the > namespace spec has tons). Your polite chastisement is quite warranted: many of us have been complaining about the specs without offering "concrete" suggestions for improvement. I had this in mind when I posted my part, but, you see, the problems do not lie in any particular pattern. Difficulties I have come across in trying to understand the specs have been quite varied. Sometimes it appears the specs rely on several assumptions that might have been well discussed within the WG, but never made it to "paper". The DOM spec often reads that way. At other times, given examples and "clarifying" appendices have instead caused confusion, as in the Namespaces spec. In yet other cases, there appear to be several distinct functions conflated into one spec, for instance, the XSL spec, which is subject of a long, current thread in the XSL list where a large majority favor splitting it into two specs: one concerning transformation and one formatting specs.
In the general case, there are many frictions between the various specs: they tend to overlap in some areas, sometimes in conflicting ways, and they tend to leave gaps in other areas. Now some of this might be inveterate whining on the part of non WG members, but I am comforted in seeing that many other intelligent readers have run into the same walls as I have. Certainly, the WG had excellent reasons for making certain choices that were bound to be unpopular. The main problem appears to be lack of communication between the WGs and outsiders. The W3C is certainly not the most inscrutable standards organization I've seen, but considering its influence over the Internet, supposedly a medium characterized by open and loud communication, it can often appear to be some shadowy group dominated by a clutch of large vendors handing down inevitably imperfect specs to outsiders, but not giving the outsiders much say in the improvement of the documents. Yes, I know that the whole "release early/release often" model of the W3C's putting up a series of drafts before the final recommendation is designed to incorporate outside input, and I'm sure feed-back from places such as this group is considered, but there is precious little communication from the W3C, IMHO, as to why some feed-back, regardless of consensus, does not appear to reflect on subsequent documents. To give an example of an even more complex, long-running, and politically charged standards effort, I'll recall the development of the ANSI C++ standard. I followed the standard very closely until I got disenchanted with the language, and with all the problems with that effort, one thing was clear: the public was _very_ involved, and there were very visible effects of this involvement. Tim Bray and James Clark have been admirable ambassadors from the W3C for the WGs with which they are involved, but I haven't felt the same give-and-take from many other members. 
Most ANSI committee members for the C++ standard, people such as P.J. Plauger, Tom Plum, Dan Saks, and Stroustrup himself, were very highly and visibly involved with C++ developers. Many ideas from outsiders were incorporated throughout the process, and not just from big-nickel companies: if I remember rightly, auto_ptr came from a bright individual. Before the draft was solidified, there was a long period for public comment, and there was much discussion of the comments that were received. I may be biased, but I haven't felt the same level of openness from the W3C, and it seems to me that many of the problems that people complain of in the specs have been pointed out many times, and still re-appear in subsequent editions of specs without any visible consideration of the complaints. The W3C often gets press to the effect that it's a small clique massed around the personality of Tim Berners-Lee. I highly doubt that extremity, from what I've observed of people like Tim Bray and James Clark, but I'm not sure the W3C is doing as much as it reasonably can to dispel the (literal) FUD that surrounds many of its efforts. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From donpark at quake.net Mon Feb 8 02:30:33 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" Message-ID: <001301be530a$d52c4820$2ee044c6@arcot-main> >Yeah, I wasn't either yesterday.
Apologies to everyone here for being >nasty. Everyone in the XML activity takes the opinions expressed here >very seriously. Thank you very much for taking my comments positively. I felt bad after writing it. So much for the power of e-mail. I would like to apologize for the flippant tone of my comments. >I gotta say, though, the XML spec now feels to me like a ramshackle >compromise that only just barely works, while namespaces do one simple >thing and nail it down tight as a drum. Here's how bad it is; I'm >working on an Annotated namespaces, just like annotated XML - and I'm >having serious difficulty figuring out what to write. -Tim I do realize that the XML spec is leaky in certain respects but I felt it is very clear as a whole although I am unable to point out exactly what makes it so. Perhaps it was the difference in the nature of the problems. The namespace problem is indeed very complex as you pointed out before. There are many solutions I can think of but I can't think of a single solution that addresses all the issues without looking awkward in some respect. I look forward to your Annotated Namespaces. Sitting awkwardly, Don Park Docuverse From uche.ogbuji at fourthought.com Mon Feb 8 02:36:12 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" In-Reply-To: Your message of "Sun, 07 Feb 1999 18:23:03 EST." <Pine.BSI.3.96r.990207180946.29147B-100000@shell1.interlog.com> Message-ID: <199902080238.TAA01731@malatesta.local> > The spec is not an introduction.
How many people here learnt C++ > by reading the ANSI spec? How many people here learned to tune a > radio by reading the international specifications for radio > frequency allocation? I keep reading these challenges, first about Java, and now about C++. Well, just as I first learned Java from Sun's specs, I also happened to have mostly learned C++ from Stroustrup's Annotated Reference Manual, which is as close as C++ had to a spec for quite a while. Now true enough, as Paul Prescod points out, it is quite another matter to learn a completely foreign language from a spec: I was already very familiar with C, Smalltalk and somewhat familiar with Objective-C before tackling C++, and I was very conversant with C++ before tackling Java, but I don't think it's at all freakish to learn a language or system from a well-written spec. I happen to like formalisms, and although it probably takes me much longer to learn a new system than the seven days of the "dummies" books, I am happier in my masochism. Here's an example: I've just spent a good part of today and yesterday wading through the spec for the CORBA object transaction service for implementation in a project (we're using an ORB that doesn't support OTS). My brain might be about to explode, but I think I can get some useful work done now. I'm sure I could have found an intro from Orfali and Harkey somewhere with cute cartoons of aliens explaining the protocol for a two-phase commit, but I'm usually doubtful about what I really know after such tutorials. So in short: there is nothing wrong about trying to learn from a well-written spec. My problem with some W3C specs is not complexity (in fact, they are probably the most straightforward specs I've read). It's more typically, as I've said before, inconsistency, incompleteness, and unclearness (in the sense of "ambiguity" rather than "abstruseness").
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji at fourthought.com Mon Feb 8 02:45:22 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:08:48 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: Your message of "Mon, 08 Feb 1999 10:46:25 +1100." <19990208104624.A4862@io.mds.rmit.edu.au> Message-ID: <199902080247.TAA01750@malatesta.local> > Your comments give me the impression that this is unacceptable to you > in the XML/hierarchical universe. You don't want DOM at any level. > You insist on going straight to objects. It is not even good enough > to build an object layer on top of the DOM layer. I find this a > little implausible and hence am certain that you had something else in > mind. Is it rather that you simply don't care what the underlying API > is, that you are only interested in what happens at the object level? I hope I'm not mis-representing Paul here, but as I've always read him (and agreed), his point is that XML, and the various ancillary technologies such as DOM and XML Schema, are more appropriate for content-exchange than for core business-object modeling. I don't think it makes sense to build a business-object model on top of DOM, but I do think it makes sense to define an exchange protocol that serializes objects to XML representations using DOM as a programmatic interface.
Hopefully in this effort, one can leverage the support of such technologies as WDDX and XML-RPC, especially as they develop closer ties with other object technologies, including the locally much-maligned CORBA. I think it also makes sense to use the DOM to develop a user-interface layer for such objects, possibly using the same WDDX or XML-RPC mappings in association with a set of style-sheets (although this is just one of many possible mechanisms). -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From jborden at mediaone.net Mon Feb 8 03:18:43 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:48 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <199902080247.TAA01750@malatesta.local> Message-ID: <000e01be5311$11ed8200$d3228018@jabr.ne.mediaone.net> > Uche Ogbuji wrote: > > > Your comments give me the impression that this is unacceptable to you > > in the XML/hierarchical universe. You don't want DOM at any level. > > You insist on going straight to objects. It is not even good enough > > to build an object layer on top of the DOM layer. I find this a > > little implausible and hence am certain that you had something else in > > mind. Is it rather that you simply don't care what the underlying API > > is, that you are only interested in what happens at the object level?
> > I hope I'm not mis-representing Paul here, but as I've always > read him (and > agreed), his point is that XML, and the various ancillary > technologies such as > DOM and XML Schema, are more appropriate for content-exchange > than for core > business-object modeling. Ah, yes but realize that business object modelling is done at a higher level than the relational table (which is a data structure). XML is the serialization of the DOM tree based data structure. When business objects need to employ tree based data structures, they may choose to store these in XML serializations. Another option would be for these business objects to interact with a database through the same DOM interfaces. This way the business object layer is isolated from the details of the storage layer. > > I don't think it makes sense to build a business-object model on > top of DOM, No doubt, if you are dealing with tabular data you ought to stick with a relational model and use recordsets. If my business objects find it convenient or otherwise useful to employ DOM interfaces, who are you to suggest otherwise? The issue is one of performance. If a low memory, high performance DOM implementation were to appear I guarantee this would be found quite useful. For example, if you decide to build a business object model on top of ODI's eXcelon, which is described as providing a DOM interface, then you would indeed be building a business-object model on top of DOM. If Microsoft, Oracle, POET, Sybase, IBM, Informix etc etc. come out with high performance native DOM interfaces on their databases then you would have effectively isolated your business-object model from the database vendor. The point is that the existence of the DOM doesn't eliminate the need to build business objects, yet it still has an important place as a standards based interface onto trees (i.e. it's the closest thing to groves we are likely to see in widescale use).
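[Editorial illustration, not part of the original post.] The layering argued for above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the `Airplane` class, the `load_airplane` helper, and the element and attribute names are all hypothetical, and Python's standard `xml.dom.minidom` merely stands in for whatever DOM implementation a file parser or a database vendor might expose. The business object uses nothing but DOM-level calls, so it never learns where the tree came from.

```python
from xml.dom import minidom

# Hypothetical sample data. In the scenario under discussion the same tree
# could equally be supplied by a database exposing a DOM interface.
SAMPLE = """<airplane id="N123">
  <fuselage partno="F-1"/>
  <wing side="left" partno="W-1"/>
  <wing side="right" partno="W-2"/>
  <tail partno="T-1"/>
</airplane>"""


class Airplane:
    """Business object: knows about parts, not about storage."""

    def __init__(self, root):
        # `root` is any DOM Element; only DOM-level calls are used here.
        self.ident = root.getAttribute("id")
        self.wings = {
            wing.getAttribute("side"): wing.getAttribute("partno")
            for wing in root.getElementsByTagName("wing")
        }
        self.fuselage = self._part(root, "fuselage")
        self.tail = self._part(root, "tail")

    @staticmethod
    def _part(root, tag):
        # Return the part number of the first matching element, if any.
        nodes = root.getElementsByTagName(tag)
        return nodes[0].getAttribute("partno") if nodes else None


def load_airplane(document):
    """Build the business object from any DOM Document."""
    return Airplane(document.documentElement)


plane = load_airplane(minidom.parseString(SAMPLE))
print(plane.ident, plane.wings, plane.fuselage, plane.tail)
```

Swapping the string parse for another DOM provider would leave `Airplane` untouched, which is the isolation being claimed; whether that layering is worth its cost is exactly what the rest of the thread disputes.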
Jonathan Borden http://jabr.ne.mediaone.net From b.laforge at jxml.com Mon Feb 8 03:24:16 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:08:48 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) Message-ID: <000601be5311$d23e4bc0$c9a8a8c0@thing2> This seems similar to the conclusions I've reached. On the one hand, most processes can be driven by a set of objects created directly from the SAX events, just using a filter or two to do the transformation. On the other hand, when there is information which must alternately exist as a set of application objects or an XML document, then the DOM seems to be appropriate for modeling the structure of the document. This is in addition to applications where navigating the document is important. The proposal here is to have a common API, MDSAX or something like it, which can serve as a framework for transforming XML into the appropriate application objects; and to have a set of filters that plug into that framework for creating a DOM, when appropriate. (I have a new hammer. Forgive me if I experiment a bit with nails, screws, pieces of wire, and thumb tacks. I'm not sure what the limits of this tool are just yet.) Bill From: uche.ogbuji@fourthought.com <uche.ogbuji@fourthought.com> >I don't think it makes sense to build a business-object model on top of DOM, >but I do think it makes sense to define an exchange protocol that serializes >objects to XML representations using DOM as a programmatic interface.
>Hopefully in this effort, one can leverage the support of such technologies as >WDDX and XML-RPC, especially as they develop closer ties with other object >technologies, including the locally much-maligned CORBA. > >I think it also makes sense to use the DOM to develop a user-interface layer >for such objects, possibly using the same WDDX or XML-RPC mappings in >association with a set of style-sheets (although this is just one of many >possible mechanisms). From murray at muzmo.com Mon Feb 8 04:24:58 1999 From: murray at muzmo.com (Murray Maloney) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" In-Reply-To: <000901be5306$838bd0c0$d3228018@jabr.ne.mediaone.net> References: <3.0.1.32.19990207204449.00a4cb9c@pop.uunet.ca> Message-ID: <3.0.1.32.19990207232539.009b558c@pop.uunet.ca> At 08:58 PM 2/7/99 -0500, Borden, Jonathan wrote: > Is the problem here the content of the namespace spec or the way it is >written? It seems to me that people who feel that the spec is a compromise, >or wish the spec specified something different, are unhappy. If you really >don't understand the spec, how can you claim it is a ramshackle compromise? >I suspect that you understand it and don't like it. > > The problem doesn't appear to be the way the spec is written and a >professional writer won't solve that problem. I can claim that it is a ramshackle compromise because I was witness to its creation. The process stunk to high heaven. The result is an awful compromise, and not because I don't like it.
As I have said before, the "Namespaces in XML" spec would have more properly been named "Namespaces in RDF". Then I might be able to understand it. It is not the words in the spec that I do not understand. It is that I cannot fathom the logic that holds it together. For example, an appendix purports to describe a fictional XML namespace in which there exist "global" attributes which *cannot* be found in XML -- I defy you to locate a *global* attribute in XML. Or how about names that do not exist in any namespace? Is there an XML namespace or not? Sure, there are parts of the spec that I do not like. And I have done what I could to effect change. But the spec as a whole is not something that I can explain, justify and motivate. Regards, Murray Murray Maloney, Esq. Phone: (905) 509-9120 Muzmo Communication Inc. Fax: (905) 509-8637 671 Cowan Circle Email: murray@muzmo.com Pickering, Ontario Email: murray@yuri.org Canada, L1W 3K6 From cowan at locke.ccil.org Mon Feb 8 04:31:09 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" In-Reply-To: <3.0.32.19990207171050.00c2c870@pop.intergate.bc.ca> from "Tim Bray" at Feb 7, 99 05:13:16 pm Message-ID: <199902080521.AAA21275@locke.ccil.org> Tim Bray scripsit: > I'm > working on an Annotated namespaces, just like annotated XML - and I'm > having serious difficulty figuring out what to write. -Tim Simple. Explain what namespaces 1) aren't; 2) aren't good for. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban.
From eliot at dns.isogen.com Mon Feb 8 04:49:34 1999 From: eliot at dns.isogen.com (W. Eliot Kimber) Date: Mon Jun 7 17:08:48 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <000e01be5311$11ed8200$d3228018@jabr.ne.mediaone.net> References: <199902080247.TAA01750@malatesta.local> Message-ID: <3.0.5.32.19990207224950.00809b00@amati.techno.com> At 10:13 PM 2/7/99 -0500, Borden, Jonathan wrote: > > Ah, yes but realize that business object modelling is done at a higher >level than the relational table (which is a data structure). XML is the >serialization of the DOM tree based data structure. When business objects >need to employ tree based data structures, they may choose to store these in >XML serializations. Another option would be for these business objects to >interact with a database through the same DOM interfaces. This way the >business object layer is isolated from the details of the storage layer. If I understand you correctly, you're saying: 1. XML documents are serializations of things (such as business objects) 2. The DOM is the abstraction of that serialization 3. Processing systems may need to turn the serialization back into their original objects 3. Therefore we should pretend that relational databases are really DOM trees. This doesn't make any sense to me. Why use the DOM (or any other abstraction of XML documents--I'm not picking on the DOM in particular here) for direct access to business object data? Why not access those objects directly? Or maybe we're talking about what's happening at different layers in the system.
Here's the way I view the scenario: I start with a business object: "airplane". I model it abstractly: airplane => [fuselage, wing, tail, cockpit] I then want to create instances of airplanes: I write IDL (or EXPRESS or ...) definitions of my business objects that directly reflect their properties: // NOTE: Phoney IDL interface Airplane { Part Fuselage; Part Wing; Part Tail; Part Cockpit; }; I then have somebody implement some objects to this interface. How do these objects store their data? Don't care. How do they serialize their data? Don't care. Can I use the DOM to access these objects? Of course not--these are airplanes, not documents--the DOM isn't relevant. Now, I put my object implementor hat on: I have to implement this Airplane interface. I think: what technology do I have to store lots of fiddly bits? Do I think "XML"? Maybe. Do I think "relational databases"? Almost certainly. Do I think "object databases"? Quite probably. If I think "XML", why would I think it and what would I get? One reason might be: "hey, I can serialize this stuff to disk using a standard syntax and abstraction--that could make it really easy to use free tools and protect my data through a standard I don't have to pay for the right to use." But then I think "oh, but XML's model might not be a good match for my data structures--might incur a lot of serialization/deserialization overhead." I ponder for a bit. "Let's look at relational databases again--they're fast. I still have to serialize and deserialize, but that technology is mature and I can hire SQL geeks in a heartbeat." I have a Coke. "But wait--object technology looks pretty good too--I could just implement directly to my interfaces and cut out the middle layer. I could still serialize for interchange--I might even get that for free from the OODB vendor." Ok, object database it is. I program away, happy as a clam. The system works and it is a joy [this is a story, remember].
Now I say, "hey, let's try this XML serialization jabby the vendor provides, wonder what I'll get?" I push the "dump to XML" button. What does it look like? It's ugly--I've got no idea what they were thinking. Angle brackets are swimming before my eyes. But, I know I can suck it back in, supposedly without loss. I try it--hey presto, my data's back. Cool. Why is this last bit the case? Because there are infinitely many ways to serialize a given set of abstract objects, so only the serializer knows how to do the deserialization. In any case, it's a strong chance that the gap between the XML structure and the business object structure will be at least two levels of abstraction (depending on whether the serialization is late or early bound), if not more. Thus, (and here's the point I've been trying to make from the beginning, so listen closely)... ...wait for it... ...The XML you get out of such a system isn't your business objects--it's an arbitrary serialization of the internal representation of your business objects. Using the DOM (that is, an in-memory abstraction of *documents*) as the basis for direct business object access is simply nuts. This is not to say that fundamentally-hierarchical graph-based data models aren't useful for representing business objects--certainly they are (or we wouldn't be bothering to build generalized grove-management systems nor would we have used groves to represent HyTime's own business objects). But the DOM, in particular, is not a generalized data structure--it's a way of representing *XML documents* in memory AND NOTHING ELSE. So unless by "DOM" you mean "any fundamentally hierarchical graph representation of data", it's nonsense to talk about using the DOM as the API for data objects--the most that can mean is that *for your flavor of serialization you've defined functions that do the deserialization as an application of the DOM*.
Which is fine, but it's not the same as *USING THE DOM FOR DATA ACCESS*, because it's not different from implementing your business objects on top of some other data storage technology. If you do mean graphs, then the DOM, in particular, isn't what you want because it's not generalized--it's a highly optimized, use-specific object model for XML documents. Good for its purpose but not generalized. It also lacks a more general model of which it is an application. Or said another way: there's no magic in the DOM (or groves or XML) that will make storing and managing business objects easier. What will help are standardized serialization definitions, such as XMI or the new STEP XML Representation work item. But these only limit the number of instances of translation layers that have to be written--they don't eliminate the need for translation between the business object models and their serializations. Cheers, E. -- <Address HyTime=bibloc> W. Eliot Kimber, Senior Consulting SGML Engineer ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 75202. 214.953.0004 www.isogen.com </Address> From tbray at textuality.com Mon Feb 8 04:59:53 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" Message-ID: <3.0.32.19990207205824.00bed660@pop.intergate.bc.ca> At 11:25 PM 2/7/99 -0500, Murray Maloney wrote: >I can claim that it is a ramshackle compromise because I was >witness to its creation. The process stunk to high heaven. >The result is an awful compromise, and not because I don't >like it.
In fact, Murray disagrees so strongly with what the spec *says* (often, and on the record) that he is probably not the best judge of how well it says it. -Tim From uche.ogbuji at fourthought.com Mon Feb 8 05:30:43 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:08:48 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: Your message of "Sun, 07 Feb 1999 22:49:50 CST." <3.0.5.32.19990207224950.00809b00@amati.techno.com> Message-ID: <199902080533.WAA02032@malatesta.local> > Or said another way: there's no magic in the DOM (or groves or XML) that > will make storing and managing business objects easier. I think this is the crux of the matter, and exactly what the "XML is not a universal hammer" folks (myself included) have been trying to get across. I use XML heavily along-side CORBA, and (by way of weak excuse/apology for my rather ad hominem attack on Dave Winer yesterday) I do get very frustrated when people champion using XML to replace very effective and general object-management systems. As much as I hate analogies, I'll hazard one: if you sell people Aspirin as a panacea, and they later on find out that it doesn't cure warts, cancer, dropsy or a crack habit, they might become sour enough to forget that it was actually a very effective pain-reliever. Then again, as Jonathan Borden put it: who am I?
-- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From tyler at infinet.com Mon Feb 8 06:25:19 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:48 2004 Subject: "Clean Specs" References: <3.0.32.19990207205824.00bed660@pop.intergate.bc.ca> Message-ID: <36BE830B.CBBE6552@infinet.com> Tim Bray wrote: > At 11:25 PM 2/7/99 -0500, Murray Maloney wrote: > >I can claim that it is a ramshackle compromise because I was > >witness to its creation. The process stunk to high heaven. > >The result is an awful compromise, and not because I don't > >like it. > > In fact, Murray disagrees so strongly with what the spec *says* > (often, and on the record) that he is probably not the best judge > of how well it says it. -Tim Well, who is the best judge then? I thought that standards bodies were largely in existence to promote consensus on matters which companies and organizations disagree upon. Rather than bring everyone together, this entire "Namespaces in XML" recommendation has splintered the entire XML community. By that fact alone, the W3C is not doing a good job as a standards body for the internet. I am a forgiving person when it comes to making one, maybe two complete blunders (such as the case with "Namespaces in XML"), but many people are not as forgiving as I. Most of these people don't post to this list or even subscribe to it.
They would just look at "Namespaces in XML" and then quietly go back to their current vendor specific solution for their web-publishing and e-commerce needs and forget the draft ever existed. The same goes for recommendations like XSL which are polluted with "Namespaces in XML" as well. They are the real "silent majority" that the W3C seems to have complete disdain for. The simple truth is that if the W3C does not behave more sensitively toward criticism in the future and conduct itself in a more utilitarian manner, or at least make a change in the leadership of the organization, people like me and many others will clamor for creating another internet standards body that is not as slow in adopting standards as ISO or ANSI, but is not as obstinate as the W3C. Of course "Namespaces in XML" has to me been the only super-major screwup in the W3C's short life as a budding internet standards organization. I guess the question now is whether or not the W3C and its members have the courage to make the necessary changes. Tyler From jborden at mediaone.net Mon Feb 8 06:38:33 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:49 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <3.0.5.32.19990207224950.00809b00@amati.techno.com> Message-ID: <001501be532c$ffeed060$d3228018@jabr.ne.mediaone.net> W. Eliot Kimber wrote: > > > At 10:13 PM 2/7/99 -0500, Borden, Jonathan wrote: > > > > Ah, yes but realize that business object modelling is done > at a higher > >level than the relational table (which is a data structure).
XML is the > >serialization of the DOM tree based data structure. When business objects > >need to employ tree based data structures, they may choose to > store these in > >XML serializations. Another option would be for these business objects to > >interact with a database through the same DOM interfaces. This way the > >business object layer is isolated from the details of the storage layer. > > If I understand you correctly, you're saying: > > 1. XML documents are serializations of things (such as business objects) OK. > 2. The DOM is the abstraction of that serialization Not my statement. The DOM is an interface onto a tree structure. You are hung up on the term serialization, which is specific to files and other flat persistence formats. What I am talking about is persistence in general. An object implementing a DOM interface may be persisted from a file or HTTP stream, in which case it is built from a serialization; it may also be persisted from an ODB, in which case it is *not* persisted from a serialization. Both the file and the ODB can be used by the DOM. > 3. Processing systems may need to turn the serialization back into their > original objects Processing systems may need to obtain hierarchical data from storage systems. These systems need not store information in serial format. Again: persistence. > 3. Therefore we should pretend that relational databases are really DOM > trees. No. If the data is tabular then use a recordset. In the specific cases when 1) we are storing data which is naturally hierarchical, 2) the data needs to interface with systems which for other reasons employ DOM interfaces, e.g. my XSL processor is built on a DOM interface and I wish to query the database using XQL (which happens to be built into my XSL processor in this example), it is more convenient to interface to the data using DOM interfaces than it is using recordsets (i.e. tabular data).
I am saying over and over: if the data is relational, use recordsets, but when the data is hierarchical, DOM interfaces provide less of an impedance mismatch onto the data. > > This doesn't make any sense to me. Why use the DOM (or any other > abstraction of XML documents--I'm not picking on the DOM in particular > here) for direct access to business object data? We are NOT talking about direct access to business objects, but rather the mechanism by which business objects talk to the database. The business object tier is above the data object tier. > Why not access those > objects directly? Or maybe we're talking about what's happening at > different layers in the system. yes! yes! yes! > > Here's the way I view the scenario: > > I start with a business object: "airplane". I model it abstractly: > > airplane => [fuselage, wing, tail, cockpit] > > I then want to create instances of airplanes: I write IDL (or EXPRESS or > ...) definitions of my business objects that directly reflect their > properties: > > // NOTE: Phoney IDL > interface Airplane { > Part Fuselage; > Part Wing; > Part Tail; > Part Cockpit; > }; > :-))) Part fuselage is really a structure: interface Airplane { Fuselage f; Wing wleft; Wing wright; Tail t; Cockpit c; }; interface Fuselage { Strut strut; X-assembly x; Y-assembly y; }; interface Wing { ... }; and so on. Now suppose each airplane has different Fuselage, Wings, Tails, Cockpits; and suppose each of these is built from 10 sub-parts and so on 50 levels deep until we get to sheet metal, screws and wires. An airplane is a complex piece of equipment. > I then have somebody implement some objects to this interface. > How do these > objects store their data? Don't care. How do they serialize their data? > Don't care. Since you appear to be the CEO of the aircraft company, who cares? Why not just have someone design the plane, implement it, test it and build it? Who cares about databases or even computers?
If you don't care you don't have an airplane (or the plans for one). Someone has to care about the details. Objects typically don't just 'store data' into databases. Even with ODBMS there is an interface/API onto the DB (this can be base classes in C++ etc. different for each DB). > Can I use the DOM to access these objects? Of course > not--these > are airplanes, not documents--the DOM isn't relevant. OK, suppose I have a set of airplanes; let's try this two ways: First with the DOM (stylized): NodeList airplanes_data = container.getElementsByTagName("airplane"); OK, now build your business object (this is where you can spend your time). Now with SQL: Recordset rs = conn.Execute("select * from airplanes,fuselages,wings,tails,cockpits,x-assembly,y-assemblies, .... about 3^10 total tables here (assuming 10 levels deep) .... screws,sheets,wires where ....."); Alternatively you can write out 3^10 individual select statements. After a few weeks/months of work you can start working on your business object. Arguably, when using an ODBMS this example would be more straightforward (but you picked RDBMS). The problem is that there is no standard, language-independent interface onto ODBMS's. The DOM, while not the perfect interface, *is* standard, and this is the big utility. ... > Why is this last bit the case? Because there are infinitely many ways to > serialize a given set of abstract objects, so only the serializer > knows how > to do the deserialization. In any case, there's a strong chance that the gap > between the XML structure and the business object structure will be at > least two levels of abstraction (depending on whether the serialization is > late or early bound), if not more. > > Thus, (and here's the point I've been trying to make from the > beginning, so > listen closely)... > > ...wait for it... > > ...The XML you get out of such a system isn't your business objects--it's > an arbitrary serialization of the internal representation of your business > objects.
Using the DOM (that is, an in-memory abstraction of *documents*) > as the basis for direct business object access is simply nuts. > Actually not even this. First, I'm never actually dealing with XML; I've only shown DOM interfaces. My business objects internally use DOM interfaces to interact with a bit-bucket. Where does an XML document come into play here? > This is not to say that fundamentally-hierarchical graph-based data models > aren't useful for representing business objects--certainly they are (or we > wouldn't be bothering to build generalized grove-management systems nor > would we have used groves to represent HyTime's own business > objects). But > the DOM, in particular, is not a generalized data structure--it's a way of > representing *XML documents* in memory AND NOTHING ELSE. Err, no. I am saying that I can use the DOM to represent hierarchical data. This data *can* be expressed, serialized, as an XML document, but between my database and my business object, there need never exist an XML document. Say whatever you please but if I have a piece of code from James Clark (e.g. Jade/SP/groveoa) or Microsoft or IBM, I'm quite free to use it as I see fit. For example, I get to say (using 'extended DOM'): NodeList anotherSet = airplanes.selectNodes("airplane[@color='red' and .//screw/thread/@pitch = 64]"); to select all red airplanes with screws having a pitch=64... Have you written a lot of programs which directly access databases? Do you ever have to code the objects which access the databases? If you stay up in la la object modelling land, you may not appreciate what I am saying. In working with this stuff, I am finding that I am more efficient, and I can get work done more quickly using these interfaces. ... > > Or said another way: there's no magic in the DOM (or groves or XML) that > will make storing and managing business objects easier.
What will help > are standardized serialization definitions, such as XMI or the > new STEP XML > Representation work item. But these only limit the number of instances of > translation layers that have to be written--they don't eliminate the need > for translation between the business object models and their > serializations. > XMOP, for example (http://jabr.ne.mediaone.net/documents/xmop.htm), is a way to serialize arbitrary COM objects using their typeinfo metadata. XMOP is a layer that can persist objects into either a) a stream (serialization) b) direct-to-DOM. When I attempted to design a direct-to-Recordset persistence interface on XMOP I found that I had to essentially develop a DOM<->Relational mapping. This is because arbitrary objects can be modelled in a hierarchical fashion (e.g. serialized to XML). In another example, using the medical imaging DICOM protocol (a complex property-based protocol) I have developed a mapping to the Microsoft PropertySet format (used with Index Server). This mapping is not clean (at all, given the inability to represent certain DICOM structures as PROPVARIANTs). This causes similar problems in mapping the protocol to a relational database (the workaround is to use binary data). Solving this difficult problem using XML and the DOM was a piece of cake. So, I'm not saying that this is the cure for all the world's problems or that this is a hammer and all the world is a nail, but on the other hand, when you have a hammer in your hand, and you see a nail, take the shot. The simple fact is that the uses of the DOM interfaces are determined not by their designers but rather by the creativity of those individuals who use them. The original CPU was designed to be a calculator. Use your imagination. Jonathan Borden http://jabr.ne.mediaone.net
From tbray at textuality.com Mon Feb 8 06:45:15 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:49 2004 Subject: "Clean Specs" Message-ID: <3.0.32.19990207224352.00bec800@pop.intergate.bc.ca> At 01:24 AM 2/8/99 -0500, Tyler Baker wrote: >Well, who is the best judge then? I thought that standards bodies were largely in existence to >promote consensus on matters which companies and organizations disagree upon. Rather than >bring everyone together, this entire "Namespaces in XML" recommendation has splintered the >entire XML community. Uh, just for the record, Namespaces, like any other W3C recommendation, has been through a *long* formal process with many public drafts, and a final poll of the membership. Yes, there are those who disagree, but this is true of virtually every recommendation; there were those who wanted to send XML 1.0 back for more work - same with every other significant W3C product. Consensus in the pure form is never achieved in any standards organization. Operationally, the closest you can get is a determination that all substantive objections have been thoroughly listened-to, and a finding that an overwhelming majority of the community wants to move forward. >I am a forgiving person It doesn't particularly show. >They are the >real "silent majority" that the W3C seems to have complete disdain for. Well, the Mozilla, perl, Internet Explorer, Oracle, and IBM XML offerings already include namespace support. The majority is awfully silent I guess. -Tim
From murray at muzmo.com Mon Feb 8 07:15:40 1999 From: murray at muzmo.com (Murray Maloney) Date: Mon Jun 7 17:08:49 2004 Subject: "Clean Specs" In-Reply-To: <3.0.32.19990207224352.00bec800@pop.intergate.bc.ca> Message-ID: <3.0.1.32.19990208021503.00ed352c@pop.uunet.ca> At 01:44 AM 2/8/99 -0500, Tim Bray wrote: >Uh, just for the record, Namespaces, like any other W3C recommendation, >has been through a *long* formal process with many public drafts, and a >final poll of the membership. For the record, "Namespaces in XML" did not follow a process like any other W3C recommendation. It is disingenuous to suggest that it did. In fact, this spec was not even subject to the scrutiny of a W3C Working Group from August, 1998. [...] > >Well, the Mozilla, perl, Internet Explorer, Oracle, and IBM XML offerings >already include namespace support. The majority is awfully silent I >guess. And where is all of the content? Hmmm! They do seem to be silent. Regards, Murray Murray Maloney, Esq. Phone: (905) 509-9120 Muzmo Communication Inc. Fax: (905) 509-8637 671 Cowan Circle Email: murray@muzmo.com Pickering, Ontario Email: murray@yuri.org Canada, L1W 3K6
From murray at muzmo.com Mon Feb 8 07:15:40 1999 From: murray at muzmo.com (Murray Maloney) Date: Mon Jun 7 17:08:49 2004 Subject: "Clean Specs" In-Reply-To: <3.0.32.19990207205824.00bed660@pop.intergate.bc.ca> Message-ID: <3.0.1.32.19990208020443.00ed65ec@pop.uunet.ca> At 11:59 PM 2/7/99 -0500, Tim Bray wrote: >At 11:25 PM 2/7/99 -0500, Murray Maloney wrote: >>I can claim that it is a ramshackle compromise because I was >>witness to its creation. The process stunk to high heaven. >>The result is an awful compromise, and not because I don't >>like it. > >In fact, Murray disagrees so strongly with what the spec *says* >(often, and on the record) that he is probably not the best judge >of how well it says it. -Tim The obverse of that logic would lead one to conclude that Tim is not the best judge either. Let's try a bit harder than that, eh? Regards, Murray Murray Maloney, Esq. Phone: (905) 509-9120 Muzmo Communication Inc. Fax: (905) 509-8637 671 Cowan Circle Email: murray@muzmo.com Pickering, Ontario Email: murray@yuri.org Canada, L1W 3K6
From tyler at infinet.com Mon Feb 8 07:57:56 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:49 2004 Subject: "Clean Specs" References: <3.0.32.19990207224352.00bec800@pop.intergate.bc.ca> Message-ID: <36BE98B1.73223E62@infinet.com> Tim Bray wrote: > At 01:24 AM 2/8/99 -0500, Tyler Baker wrote: > >Well, who is the best judge then? I thought that standards bodies were largely in existence to > >promote consensus on matters which companies and organizations disagree upon. Rather than > >bring everyone together, this entire "Namespaces in XML" recommendation has splintered the > >entire XML community. > > Uh, just for the record, Namespaces, like any other W3C recommendation, > has been through a *long* formal process with many public drafts, and a > final poll of the membership. Yes, there are those who disagree, but This is interesting. Namespaces in XML went from being a proposal with the PI-based approach, to a draft, and then to a recommendation. There was only one draft in the period between proposal and recommendation. This I find puzzling compared to the revisions I saw taking place with XML back in November of 1997. > this is true of virtually every recommendation; there were those who > wanted to send XML 1.0 back for more work - same with every other > significant W3C product. Consensus in the pure form is never achieved > in any standards organization. Operationally, the closest you can > get is a determination that all substantive objections have been > thoroughly listened-to, and a finding that an overwhelming majority > of the community wants to move forward.
This is very true. However, it is hard to believe that the great majority of people on the "Namespaces in XML" WG could have views which are in fundamental disagreement. Since votes on these matters are apparently secret, I guess people like me will never know. Moreover, we will never know who is ultimately accountable for these decisions. It is as if the W3C is this omnipotent force that feels they do not need to answer to the developer community at large because they are not accountable. > >I am a forgiving person > > It doesn't particularly show. I am still here actively using XML and discussing XML issues. If I was truly not forgiving I would not waste any more time discussing these sorts of issues that the W3C has brought to our attention. I and other developers, XML users, and those considering using XML in their data-processing infrastructure who contribute to this list do so freely. We don't charge the W3C for our consultation (well I guess some of the people here do) so our comments should be taken objectively. This entire business about us developers making some comments and then hoping that the W3C takes them into consideration is a rather feudal concept if you ask me. > >They are the > >real "silent majority" that the W3C seems to have complete disdain for. > > Well, the Mozilla, perl, Internet Explorer, Oracle, and IBM XML offerings > already include namespace support. The majority is awfully silent I > guess. > -Tim That is not the point. These products may support them (some of them I can say support them in rather useless ways as far as the application developer should be concerned), but the real issue here is whether or not "Namespaces in XML" nicely complements XML or else has an overall negative effect on XML's goals.
My opinion is the latter, hence my opposition to the "Namespaces in XML" recommendation and its inclusion in any other W3C specs or related internet standards such as CORBA (yah I have already stated that I personally don't care much about CORBA these days, so I am only using CORBA as an example). Tyler
From shecter at darmstadt.gmd.de Mon Feb 8 10:29:30 1999 From: shecter at darmstadt.gmd.de (Robb Shecter) Date: Mon Jun 7 17:08:49 2004 Subject: Component Markup Language References: <A26F84C9D8EDD111A102006097C4CD0D054960@SOHOS002> Message-ID: <36BEBA83.259DF589@darmstadt.gmd.de> Mark Birbeck wrote: > I wrote: > > How about this: Have one XSL document per client side > > scripting language. > > We do something similar for browser types, by generating all XSL > documents through an ASP page. In our case, we detect the browser type, > and change the rules in the stylesheet dynamically. Your scenario would > benefit too, because you wouldn't need to send loads of stylesheets to > the client... Hi, Interesting idea. It reminds me of the IBM Alphaworks "XML Enabler" project, which is a framework for doing the same thing: selecting an appropriate XSL stylesheet based on browser type. ( http://www.alphaworks.ibm.com ) This sounds better to me than what you're doing, actually, because it's pure Java/servlets; not Microsoft/ASP dependent. Anyhow, in my scenario, the server would be run by one organization, and the clients by others. The clients would be responsible for making their own XSL sheets depending on whatever type of platform they have.
So, only the XML User Interface / Component description would come over the wire, not the XSL. - Robb
From shecter at darmstadt.gmd.de Mon Feb 8 10:34:13 1999 From: shecter at darmstadt.gmd.de (Robb Shecter) Date: Mon Jun 7 17:08:49 2004 Subject: Component Markup Language References: <0bec01be526d$42222740$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <36BEBD56.1D80AA73@darmstadt.gmd.de> Just an update: Thanks to everyone for all the pointers to things to check in my search for a UI / Component markup language. By the way, when I'm done doing my survey, I can post a summary here if there's interest. So far, the item that's closest to what I've been looking for is: http://www.pierlou.com/prototype ...This is really an interesting system. The XML UI description seems abstracted enough to be able to generate implementations in various languages, but close enough to Java to make mapping to it easy. The only part that's not there (for my requirements) is something like XSL stylesheets for different client-side scripting languages (only a hard-coded XML->Java conversion is supported at this time). But that wouldn't be too big of a deal. - Robb
From busterdeniro at hotmail.com Mon Feb 8 10:37:39 1999 From: busterdeniro at hotmail.com (Buster Blues) Date: Mon Jun 7 17:08:49 2004 Subject: Little problem with namespace Message-ID: <19990208103609.2409.qmail@hotmail.com> Hi, I would like some information on namespaces. I want to display my XML document in my browser (IE5). I have specified a style sheet, but if I remove this line: http://www.w3.org/TR/WD-xsl, an error occurs. Is it necessary to include a reference to the W3C web page, or is there another way to do it? Thanks Pascal
From shecter at darmstadt.gmd.de Mon Feb 8 10:40:48 1999 From: shecter at darmstadt.gmd.de (Robb Shecter) Date: Mon Jun 7 17:08:49 2004 Subject: CORBA's not boring yet. / XML in an OS? References: <4EB4281B.662222A5@darmstadt.gmd.de> <36B9FA0C.11FBFBC@infinet.com> Message-ID: <36BEBEA7.DDB98BFB@darmstadt.gmd.de> Tyler Baker wrote: > I wrote: > >...this month's Linux Journal - > > it describes how -both- up and coming desktop environments are basing > > major parts of their architectures on CORBA. KDE's so cool it makes me > > want to learn C++.
:) > > > Prediction: In 3 years, half the people on this list will be using a > > corba-based desktop environment. > > Not likely. My biggest problem with CORBA was that it was too huge for the client and > consumed too many resources. Actually, I was implying that half of us will be using Linux, and therefore CORBA, because it's now in the desktop environments. I -was- exaggerating, but I think it's not extreme to predict that in 2-3 years Linux (with KDE or GNOME) will have broken out of the hacker-only world, and onto the desktop. > > > > Anyhow, this naturally makes me wonder - could XML and related ideas > > like XSL have a place in an operating system? Where would they fit in? > > KDE and Gnome could be great playgrounds for trying something like this > > out. > > They already do if you consider Internet Explorer a fundamental part of the Windows Operating > System (-: > Ouch. :) - Robb
From shecter at darmstadt.gmd.de Mon Feb 8 10:51:23 1999 From: shecter at darmstadt.gmd.de (Robb Shecter) Date: Mon Jun 7 17:08:49 2004 Subject: CORBA's not boring yet. / XML in an OS? References: <0d2701be5270$824dbca0$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <36BEC149.AB389419@darmstadt.gmd.de> James Tauber wrote: > >Anyhow, this naturally makes me wonder - could XML and related ideas > >like XSL have a place in an operating system? Where would they fit in? > >KDE and Gnome could be great playgrounds for trying something like this > >out.
> > For a while now, I've been thinking what an OS (or more likely shell) would > look like if it took Unix's "everything as a file" to "everything as an XML > element". Now this is interesting. > > > A system would be a single XML "uberdocument"... Applications ... would operate on other > nodes in the element tree. > > There would be an application, for example, that got mail via POP or IMAP, > represented it in XML and then attached it a particular point in the > uberdocument. XSL could be used to sort the mail. XSL would also be used to > view the mail. > Great ideas. I can see that this would just follow the unix philosophy, and could actually be useful: Just like how today, anyone can use the unix command line tools to pipe together small apps to form "new" app/filters, in this XML/OS, someone could use XML/XSL parsers/apps to connect and filter XML to create new apps and filters. The myriad of programs that operate and manipulate XML could manipulate any OS object, program or data. For example, the IBM Alphaworks "Tree Diff" (a Java program that generates "diff" info between two XML documents) can be applied to anything stored in the OS, in the same way that the conventional diff can operate on any text file. > > It's XML for the sake of it, but I think it would be fun to try out. > Absolutely, but I'd also bet that some convincing arguments could be made for real advantages of this. - Robb xml-dev: A list for W3C XML Developers. 
From rbourret at ito.tu-darmstadt.de Mon Feb 8 11:48:15 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:08:49 2004 Subject: "Clean Specs" Message-ID: <01BE5360.62756860@grappa.ito.tu-darmstadt.de> Tim Bray wrote: > And as regards the namespace spec, I think that some people on this > list are substantially full of shit, and are wilfully refusing to see > how simple it is because it does not meet their own design prejudices. > I think that spec is *way* better than the XML spec. I've quoted this out of order because I think it is a very important point and one that has bothered me during this whole discussion. Tim is absolutely correct here -- the namespaces spec is *way* better than the XML spec. There is absolutely no comparison, both from an organization and a writing (sentence-level) point of view. I think there are three other things to remember with respect to the namespaces spec. First, no matter how you cut them, namespaces turned out to be far more difficult than any of us could have imagined. We didn't, as is the case in most programming languages, simply get them for free based on where we declared our variables -- we had to decorate variables ourselves. Thus, I think the discussion has often confused frustration about technology with frustration about spec writing. Second, watching a spec evolve is not always the best way to judge how good it is. Throughout this discussion, I have wondered how I would have felt about the namespaces spec had I seen it for the first time in its completed form.
No doubt I would have had some confusion, but I also would not have been carrying pre-conceived notions from one version to the next, which is where a lot of my confusion about the relation between attributes and namespaces came from. Finally, we have to remember that namespaces appear to work -- I have yet to hear any examples on this list where they don't. Until such examples arise, I think we have more to gain by agreeing to use namespaces than by going off in our own directions. > This group is notably and vocally dissatisfied with the specs, I > am watching with attention for concrete suggestions as to how > to make future specs better - the one premise that seems to get > consensus, in this group at least, is "more examples". (Hmm, the > namespace spec has tons). OK. Enough being nice. Here are the brickbats :) 1) One idea, one term. The namespaces spec uses three terms (XML namespace, traditional namespace, and namespace) for two ideas -- XML and traditional namespaces. This can lead to confusion, as mail between Mark Birbeck and me shows -- he interprets standalone "namespace" (for example, see the definition of "declared") to mean traditional namespace while I interpret it to mean XML namespace. (Interestingly, the spec can be read using either definition, but it does lead to different conclusions.) A more egregious example is the use of "entity" to mean "general entity" in the XML spec -- this might work for the initiated, but is very confusing for first-time readers. 2) Include negative truths as well as positive truths. In the namespaces spec, an example would be to state that unprefixed attributes are not in any (XML) namespace. Although this can be determined from the fact that there are no statements saying that they *are* in a namespace, it is much easier for the spec to say this than for the reader to work it out themselves.
The reason this is important is that the definition of an XML namespace sets the reader's expectations by stating that it includes attribute names. As another example of negative truths, non-goals are often as useful as goals in setting the reader's expectations. 3) Be explicit about goals. Although the namespaces spec describes its goals in general form in the motivation section, I think spelling these out might have helped a lot. In particular, it would have saved us from having to write "Unique names. Really. Just unique names. Nothing more. Really." in email over and over and over. 4) Organization counts. Organization is often harder than the writing itself and many writing problems stem from organizational problems. The namespaces spec is very good in this respect. The XML spec is a nightmare. Some easy examples: a) The definition of a document is tucked away in a section on well-formedness. b) DTDs are introduced in the same section as prologs and version numbers, but otherwise spread across the spec, including such things as putting Conditional Sections in section 3 (Logical Structures). c) Section 2.8 tells us that attribute-list declarations in the internal subset take precedence over those in the external subset. This information either belongs in, should be repeated in, or should be referenced from the section on attribute list declarations, but is not. 5) Headings matter. They set the reader's expectation for what is to come, and if the section doesn't answer the questions raised in the reader's mind by the heading, the result is confusion even if the section is well-written. Except for Appendix A, the namespace spec does a good job here. My favorite misleading heading in the XML spec is "Documents" (section 2), which would have better been titled "Miscellany". 6) Include cross references.
Again, the namespaces spec is very good and the XML spec is a nightmare -- trying to find an EBNF definition and the corresponding text when it is not in front of your face is almost impossible without hypertext. 7) As appropriate, include motivation. Admittedly, this is thinner ice, but it can be useful. For example, the motivation for namespace defaults is presumably to reduce the number of prefixed elements. Including this information gives the user something they can immediately grab on to. Similarly, the motivation for allowing namespace declarations on any element is presumably the ability to assemble a document from fragments, as well as the ability to redeclare defaults. Stating this motivation deflects the reader from wondering why on earth anybody would want to keep changing their prefixes. 8) Restate when necessary. This is more thin ice. Succinct statements are useful, but often overly loaded with meaning and difficult to interpret. Maybe I'm just dense, but when I read the statement, "...default namespaces do not apply directly to attributes", my first reaction was "Huh?" This statement includes two important concepts, neither of which was immediately obvious to me. The first is that: a) The lack of a prefix on an element means it is in the default namespace, if there is one. b) Attributes can also lack a prefix. c) Because attributes can lack a prefix, we are explicitly stating that they behave differently from elements and do not automatically belong in a namespace. The second is that: a) Elements can be in an XML namespace. b) According to the XML spec, elements implicitly have a traditional namespace for their attribute names. c) Thus, defaults apply to attributes indirectly in that they give the (XML) namespace of the element containing the attribute (traditional) namespace. Thus, although the statement is concise and correct, a clarifying sentence or three that restated the same thing would have been very helpful.
9) Emphasize the important stuff; cover the obscure stuff. A spec has an obligation to be thorough, so it must cover everything, no matter how infrequently something is used. However, how it is organized and how much is devoted to each topic strongly influences the reader's perspective. As an example of misleading organization, the XML spec discusses such things as comments, PIs, white space, and CDATA sections before it ever tells us what an element looks like. This gives the impression that these are more important. To the average reader, XML is about elements and attributes and declaring them; white space can be forward-referenced and the rest can be relegated to a later part of the spec without harm. Similarly, the namespace spec could tell us how to use prefixes before telling us how to declare them. This motivates the reader to see the usefulness of namespaces and makes them think, "Cool! Now how do I declare these prefix thingies?" Quantity devoted to a subject can also be misleading. For example, the last example in section 5.2 is the largest in the namespace spec. As a reader, this leads me to believe it is very important, when in fact it covers a seldom-used case. Section 2.12 (Language Identification) is similarly misleading, although the authors in that case might have had good reason to believe it would be more widely used than it has. > Having said all that, people who write specs always have to try to > do a better job next time, so this recent discourse is very very useful. Thanks for listening. I hope this has been helpful. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Mon Feb 8 12:06:14 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:08:49 2004 Subject: "Clean Specs" Message-ID: <01BE5362.E9115440@grappa.ito.tu-darmstadt.de> Don Park wrote: > I do realize that the XML spec is leaky in certain respect but I felt it is > very clear as a whole although I am unable to point out exactly what makes > it so. If I may venture an opinion, the XML spec is sloppily written, informal, and poorly organized. The namespaces spec is tightly written, well-organized, and relatively formal. I believe it is the informality of the XML spec that makes it clear as a whole -- it certainly isn't the organization. For example, at the end of section 2.8, we are told that an internal DTD is interpreted before an external DTD and that one consequence of this is that attributes in the internal DTD override those in the external DTD. The spec doesn't need to tell us this consequence -- it can be determined from the statements that internal DTDs are interpreted first and that the first attribute declaration wins. However, by telling us this and similar things, we are saved a lot of hard thinking and confusion and therefore are happier with the spec and understand it better. The namespaces spec does this less often, but when it does, feels very approachable. A good example of this is the statement in section 4 about operational difficulties when default attributes are declared in external DTDs. The spec doesn't need to tell us this, but saves us a lot of time and pain when it does. 
-- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Mon Feb 8 12:15:45 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:08:50 2004 Subject: When to use attributes vs. elements In-Reply-To: <5BF896CAFE8DD111812400805F1991F708AAEF3C@RED-MSG-08> Message-ID: <Pine.GHP.4.02A.9902081131000.27539-100000@mail.ilrt.bris.ac.uk> On Fri, 5 Feb 1999, Andrew Layman wrote: > Thank you. Dan asks a reasonable question, which is whether a document that > uses the conventions described in > http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html needs to > signal somehow that these conventions are in play. > > In case of the "canonical format" I proposed, however, I don't think special > signalling is necessary: The proposal does not add any new interpretations > to the use of elements or attributes beyond what can be described in a DTD > or a schema such as XML-Data or DCD. Elements, attributes, ids and idrefs > are carefully used so that their normal XML interpretation matches the > scoping and linking rules of object graphs or relational databases. So, to be clear on what you're claiming... For any chunk of 'normal' XML, you have a set of interpretation rules that tell us how all the attributes and elements map into "graphs of data such as database tables and relations, nodes and edges from directed labeled graphs, and similar constructions"[1]. This would be enormously useful, if people could be persuaded it were true. 
> In a general case, if conventions add rules for interpretation above what is
> in the structure of a document or above what can be expressed in a DTD, then
> this would need to be somehow signalled in order for a reader to process the
> document.

I'm a little confused in that [1] proposes a canonical framework for interpreting all XML as graph serialisations, but then goes on to discuss "Mapping Abbreviated Syntax to Canonical Syntax":

    However, the canonical syntax is not the only syntax that could be used to serialize a graph. In many cases, alternative syntaxes may be used, either due to historical or political factors, or to take advantage of compressions that are available if one has domain knowledge. We call all of these "abbreviated syntaxes."[1]

This implies that some unknown subset of XML instance data will have been serialised according to one or more alternate serialisation algorithms. Consequently, de-serialising such data according to the 'canonical' algorithm will garble your data. In which case we're back in a situation where we need a mechanism such as <XYZ:SerializationAccordingToAndrew> to tell us which data can be interpreted according to the 'canonical' rules versus some alternate (possibly unknown) serialisation rules.

The example alternate serialisation given is:

    <Class>
      <name>Western Civilization</name>
      <taughtBy>Thorsten</taughtBy>
      <attendedBy>Raphael</attendedBy>
      <attendedBy>Smith</attendedBy>
    </Class>

Interpreting this according to the "Procedure for XML Instance to Graph Conversion" rule will give garbage data. We simply don't know from looking at the XML above what nodes and edges it creates. The fact that we need to treat such data in a special manner is worrying: how are we supposed to _know_ when there is something else to know?
(repeated from above)
> In a general case, if conventions add rules for interpretation above what is
> in the structure of a document or above what can be expressed in a DTD, then
> this would need to be somehow signalled in order for a reader to process the
> document.

This suggests that the burden is placed upon content creators to flag up when the generic 'canonical' rule wouldn't usefully apply to the interpretation of the XML content. So the default behaviour would be to assume everyone used the rules outlined in [1] unless associated schema, stylesheet or enclosing tags told us otherwise?

So... if I'm a 'canonical-format' aware processor building a graph from XML data acquired from a variety of sources, what procedure do I follow to sort XML instance data into the following categories:

a) old XML files which *happen* to have been serialised according to the canonical-format rules
b) old XML files which happen *not* to have been serialised according to the canonical-format rules. (for example, the extract above)
c) recent XML files created by following the c-f rules for serialising graphs
d) recent XML files created using an alternative or abbreviated graph serialisation algorithm as discussed in [1]

In particular, I'm concerned that (a) and (b) are mechanically indistinguishable.

Dan

[1] http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jayadeva at lgsi.co.in Mon Feb 8 12:51:08 1999 From: jayadeva at lgsi.co.in (Jayadeva Babu Gali) Date: Mon Jun 7 17:08:50 2004 Subject: problem Message-ID: <36BEDDED.7A9516BA@lgsi.co.in> hi, how can i call the image (.gif) from XML file through xsl i have written the like below but xml file whe i called from MSIEBeta5 it is not displaying the images. All files is in the same directory with gif's also. /****** xml file *****/ <?xml version="1.0"?> <?xml:stylesheet type="text/xsl" href="test.xsl"?> <items> <item> <param>picture1</param> <picture>green-ball.gif</picture> </item> <item> <param>picture2</param> <picture>yellow-ball.gif</picture> </item> </items> /************** xsl file ***********/ <?xml version="1.0"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl" xmlns="http://www.w3.org/TR/REC-html40" result-ns=""> <xsl:template match="/"> <table> <xsl:for-each select="items/item"> <tr> <td> <xsl:value-of select="param"/> </td> <td> <img> <xsl:attribute> <xsl:value-of select="picture"/> </xsl:attribute> </img> </td> </tr> </xsl:for-each> </table> </xsl:template> </xsl:stylesheet> regds.....jayadev xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Mon Feb 8 13:03:05 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:08:50 2004 Subject: document vs non-document entity (was Re: CORBA's not boring yet. / XML in an OS?) References: <002101be52be$e92480a0$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <36BEE1B6.8108B0C9@mecomnet.de> This raises the question: "What is the extent of an entity binding?" Assume that it is possible to qualify entity names. Since these qualifications are already lexically scoped, why would one need to introduce local entity definitions? The effective entity is already determined by a binding with a lexical scope. One reason could be that the form actually makes explicit a dynamic extent for "local" entity definitions and is set in contrast to an indefinite extent for "global" definitions within the dtd (or elsewhere). I have gathered from discussions that contemporary implementation techniques already enforce a dynamic extent for entity definitions - they "live" exactly as long as the document. Unless this changes, then local entities are not strictly necessary. With an XML/OS there may well be cause to change the binding semantics. James Tauber wrote: > > Paul Prescod: > >This is a serious problem inherited from SGML. It is my opinion that > >instead of namespace declarations being done through attributes they > >should have been just another form of PI-based declaration. Then in > >version 2.0 of XML, all declarations should have been made "localizable." I don't understand the connection here. Where is the dependency?
> >In particular, there should be a way to declare internal text entities, > >unparsed entities and attribute defaults locally. > > Well this is actually where my thinking has been going. I've been thinking > about whether localizable declarations might be achievable using an > *attribute* mechanism following on from how namespaces ended up. > > ... > > The xmldecl attribute takes the value of the URI of an external parameter > entity. > > To avoid name clashes, it might be an idea to have > xmldecl:foo="local-decl.pen" and then qualify entities with the prefix foo > (ie &foo:SomeEntity;) While I agree that this will ultimately prove necessary, I suggest that qualified entity names render local entity bindings redundant. It is equivalent to mapping each of the definition names into a distinct namespace at the point of definition and using lexically bound prefixes to map the reference name accordingly. Opps! Yea, I forgot, there is no way to bind a prefix in the DTD (ie "the point of definition"). Got to wait for schemas for this. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Mon Feb 8 13:28:21 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:08:50 2004 Subject: "Clean Specs" References: <3.0.32.19990207171050.00c2c870@pop.intergate.bc.ca> Message-ID: <36BEE79E.D85840DF@mecomnet.de> Tim Bray wrote: > ... > > I gotta say, though, the XML spec now feels to me like a ramshackle > compromise that only just barely works, while namespaces do one simple > thing and nail it down tight as a drum. 
Here's how bad it is; I'm > working on an Annotated namespaces, just like annotated XML - and I'm > having serious difficulty figuring out what to write. -Tim > Please correct or rewrite the equivalent of Appendix A so that neither you, nor the spec, nor other prominent contributors to this forum feel that it is necessary to disavow it. Use the definitions/formalism it then contains to describe the examples in the spec. While this order of exposition is not ideal, the prose in the recommendation has "gotten there first". Introduce additional examples to document the combinations ((default or overridden bindings) X (prefixed or unprefixed element names) X (prefixed or unprefixed attribute names)) which are missing. For this reader, Appendix A would have been the most useful thing to support an implementation. While I have come to accept that one should not expect a definition of a similar form (that is, something at least approximating a denotational definition) for XML proper, it is unfortunate that the spec for namespaces does not include a formal definition, and that the semi-formal description is relegated to "non-normative" status. The difference between XML proper and "Namespaces in XML" is that while XML describes a code, for which the BNF suffices, the namespace spec deals exactly with the relation between the encoded representation and another domain. If there were a formal description of this relation, then it would be easier to understand, easier to check for completeness and correctness and, ultimately, easier to produce a conforming implementation. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From uche.ogbuji at fourthought.com Mon Feb 8 13:53:38 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun 7 17:08:50 2004
Subject: Announcement: 4DOM 0.7.0
Message-ID: <199902081355.GAA02899@malatesta.local>

FourThought LLC (http://FourThought.com) announces the release of

4DOM 0.7.0
-----------------------
A CORBA-aware implementation of the W3C's Document Object Model in Python

4DOM is a close implementation of the DOM, including DOM Core level 1, DOM HTML level 1, Node Iterator and Node Filter from DOM Level 2, and a few utility and helper components. 4DOM was designed from the start to work in a CORBA environment, although an ORB is no longer required. For using 4DOM in an ORB environment, Fnorb and ILU are supported. 4DOM is designed to allow developers to rapidly design applications that read, write or manipulate HTML and XML.

New in 4DOM 0.7.0
-----------------
- Added support for "orbless" configuration. Now neither ILU nor Fnorb is required and 4DOM can be run purely locally, but still with a consistent interface. Naturally, the orbless config is much faster than the ILU or Fnorb configs.
- Many fixes to improve consistency over an ORB interface (an example using an ORB has been added to demos).
- Fixes to NodeList and NamedNodeMap
- Added an Ext package for DOM extensions, and moved many of the existing extensions there. See docs/Extensions.html.
- Added to Ext an extensive factory interface for creation of nodes, consistent for local and ORB use.
- Added to Ext a ReleaseNode helper function to reclaim unused nodes, necessary for ORB usage, and also for local usage because of circular references.
- Added NodeIterators and Node Filters from DOM Level 2
- Added a visitor and walker system (to Ext). These generalize the NodeIterator concept for cases where pre-order traversal is not suitable: for instance printing.
- Removed the repr functions from Node interfaces in favor of print walker/visitors.
- Added Print and PrettyPrint helper functions to Ext for printing and pretty-printing node trees.
- Added Strip helper function to Ext to strip all ignorable white-space text nodes from a node tree.
- Moved all tools to construct a DOM tree from XML and HTML text to a Builder module in Ext, with two functions: FromXML and FromHTML.
- Added options to FromXML that allow specification of whether to keep ignorable whitespace in the resultant node tree, and options on whether to validate.
- Innumerable minor and miscellaneous fixes

But what about PyDOM?
---------------------
Please note that the XML-SIG is working on a separate DOM implementation, and there is currently discussion regarding the relative roles of 4DOM and PyDOM. PyDOM follows a more Python-like interface, returning a dictionary of nodes, for instance, where the DOM spec specifies an object with NamedNodeMap interface. This was a deliberate choice for the convenience of Python programmers. Also, PyDOM can build and write HTML, but only supports HTML nodes through the DOM core interface. 4DOM strictly follows the DOM interface specs and supports all HTML element capabilities. However, 4DOM is a bit more heavyweight.

More info and Obtaining 4DOM
----------------------------
Please see http://OpenTechnology.org/projects/4DOM

4DOM is distributed under the terms of the GNU Library Public License (LGPL).
http://www.gnu.org/copyleft/lgpl.html -- Uche Ogbuji uche.ogbuji@fourthought.com Consulting Member, FourThought LLC http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Feb 8 14:03:35 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:50 2004 Subject: "Clean Specs" In-Reply-To: <36BEE79E.D85840DF@mecomnet.de> References: <3.0.32.19990207171050.00c2c870@pop.intergate.bc.ca> <36BEE79E.D85840DF@mecomnet.de> Message-ID: <14014.60860.129927.264822@localhost.localdomain> james anderson writes: > The difference between XML proper and "Namespaces in XML" is that > while XMLdescribes a code, for which the BNF suffices, the > namespace spec deals exactly with the relation between the encoded > representation and another domain. If there were a formal > description of this relation, then it would easier to understand, > easier to check for completeness and correctness and, ultimately, > easier to produce a conforming implementation. I'm not certain that this is exactly right. "Namespaces in XML" describes a naming scheme that can enable a relation between the encoded representation and another domain, but it specifies neither the relation nor the other domain. I'm going to try a very simple exposition in a separate message. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From skshirsa at nortelnetworks.com Mon Feb 8 14:25:25 1999 From: skshirsa at nortelnetworks.com (Shekhar Kshirsagar) Date: Mon Jun 7 17:08:50 2004 Subject: problem Message-ID: <3.0.32.19990208092032.006a147c@bl-mail2.corpeast.baynetworks.com> Hi, In your XSL style sheet attribute name (SRC) was missing. Here is the correct XSL rule : <img> <xsl:attribute name="SRC"> <xsl:value-of select="picture"/> </xsl:attribute> </img> Thanks & Regards, Shekhar Kshirsagar Nortel Networks. At 06:21 PM 2/8/99 +0530, Jayadeva Babu Gali wrote: >hi, > >how can i call the image (.gif) from XML file through xsl i have >written the like below but xml file whe i called from MSIEBeta5 it is >not displaying the images. All files is in the same directory with gif's >also. > >/****** xml file *****/ ><?xml version="1.0"?> ><?xml:stylesheet type="text/xsl" href="test.xsl"?> ><items> > > <item> > <param>picture1</param> > <picture>green-ball.gif</picture> > </item> > > <item> > <param>picture2</param> > <picture>yellow-ball.gif</picture> > </item> > ></items> > > >/************** xsl file ***********/ ><?xml version="1.0"?> ><xsl:stylesheet > xmlns:xsl="http://www.w3.org/TR/WD-xsl" > xmlns="http://www.w3.org/TR/REC-html40" > result-ns=""> ><xsl:template match="/"> ><table> ><xsl:for-each select="items/item"> > <tr> > <td> > <xsl:value-of select="param"/> > </td> > <td> > <img> > <xsl:attribute> <xsl:value-of select="picture"/> ></xsl:attribute> > </img> > </td> > </tr> ></xsl:for-each> ></table> ></xsl:template> ></xsl:stylesheet> > > >regds.....jayadev > > >xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Mon Feb 8 14:28:25 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:08:50 2004 Subject: CORBA's not boring yet. / XML in an OS? Message-ID: <005b01be536f$869eec70$5ff96d8c@NT.JELLIFFE.COM.AU> From: Robb Shecter <shecter@darmstadt.gmd.de> >James Tauber wrote: >> For a while now, I've been thinking what an OS (or more likely shell) would >> look like if it took Unix's "everything as a file" to "everything as an XML >> element". > >Now this is interesting. This is not so-far fetched. The idea for the XML Encoding PI comes from a University of Hong Kong (or was it the Chinese University of HK) project called HANZIX: a version of UNIX which would accept Chinese text streams in multiple encodings. They came up with the idea "Codeset Announcement"; XML uses an (improved) version of this. I suppose the sgrep tool is an example too: but do you really want to be parsing and serializing XML, unless it is very large text with localized processing? Maybe it would be better to generalize the pipe mechanism so that it connects either text or a DOM object too. Rick Jelliffe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Mon Feb 8 14:40:57 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:50 2004
Subject: 19 Short Questions about Namespaces (with Answers)
Message-ID: <14014.60982.658099.349181@localhost.localdomain>

19 SHORT QUESTIONS ABOUT NAMESPACES (WITH ANSWERS)
by David Megginson
Monday 8 February 1999

BACKGROUND
----------
For the full specification of XML 1.0, see [1]; for the full specification of Namespaces in XML, see [2].

This brief review uses James Clark's notation for writing names that contain both a URI part and a local part. For example, if the URI part of a name were "http://www.foo.com/" and the local part were "a", the name would be written

  {http://www.foo.com/}a

This is purely a convenience notation for the sake of documentation; it is not defined by any known specification, and is unlikely to be recognised by any processor.

CHAPTER ONE: The XML 1.0 Perspective
------------------------------------
[Example]
  <a b="x" c="y"/>

[Q] What is the name of the element in the example above?
[A] The name is "a".

[Q] What is the name of the first attribute in the example above?
[A] The name is "b".

[Q] What is the name of the second attribute in the example above?
[A] The name is "c".

[Q] What do the names mean?
[A] The application determines the meaning of the names.

[Q] How do you write a DTD declaration describing the structure of this element?
[A] <!ELEMENT a EMPTY>
    <!ATTLIST a
      b CDATA #IMPLIED
      c CDATA #IMPLIED>

CHAPTER TWO: The Namespaces Perspective
---------------------------------------
[Example 2a]
  <z:a z:b="x" c="y" xmlns:z="http://www.foo.com/"/>

[Q] What is the name of the element in the example above?
[A] The name is "z:a" from the XML 1.0 perspective, or "{http://www.foo.com/}a" from the Namespaces perspective.

[Q] What is the name of the first attribute in the example above?
[A] The name is "z:b" from the XML 1.0 perspective, or "{http://www.foo.com/}b" from the Namespaces perspective.

[Q] What is the name of the second attribute in the example above?
[A] The name is "c" from both the XML 1.0 and the Namespaces perspectives.

[Q] What is the name of the third attribute in the example above?
[A] The name is "xmlns:z" from the XML 1.0 perspective; from the Namespaces perspective, this attribute is a declaration.

[Q] What do the names mean?
[A] The application determines the meaning of the names.

[Q] What does the namespace URI "http://www.foo.com/" mean?
[A] It has no defined meaning.

[Q] How do you write a DTD declaration describing the structure of this element?
[A] DTDs use the XML 1.0 perspective:
    <!ELEMENT z:a EMPTY>
    <!ATTLIST z:a
      z:b CDATA #IMPLIED
      c CDATA #IMPLIED
      xmlns:z CDATA #FIXED "http://www.foo.com/">

[Example 2b]
  <a b="x" c="y" xmlns="http://www.foo.com/"/>

[Q] What is the name of the element in the example above?
[A] The name is "a" from the XML 1.0 perspective, or "{http://www.foo.com/}a" from the Namespaces perspective.

[Q] What is the name of the first attribute in the example above?
[A] The name is "b" from both the XML 1.0 and the Namespaces perspectives.

[Q] What is the name of the second attribute in the example above?
[A] The name is "c" from both the XML 1.0 and the Namespaces perspectives.

[Q] What is the name of the third attribute in the example above?
[A] The name is "xmlns" from the XML 1.0 perspective; from the Namespaces perspective, this attribute is a declaration.

[Q] What do the names mean?
[A] The application determines the meaning of the names.

[Q] What does the namespace URI "http://www.foo.com/" mean?
[A] It has no defined meaning.

[Q] How do you write a DTD declaration describing the structure of this element?
[A] DTDs use the XML 1.0 perspective:
    <!ELEMENT a EMPTY>
    <!ATTLIST a
      b CDATA #IMPLIED
      c CDATA #IMPLIED
      xmlns CDATA #FIXED "http://www.foo.com/">

REFERENCES
----------
[1] http://www.w3.org/TR/REC-xml
[2] http://www.w3.org/TR/REC-xml-names

All the best,

David

--
David Megginson david@megginson.com http://www.megginson.com/

From simpson at polaris.net Mon Feb 8 15:20:51 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:08:50 2004
Subject: Clear specs: suggestions
Message-ID: <3.0.32.19990208101713.007bf670@polaris.net>

It's been educational, not to say entertaining, to have read the firestorm of opinion over the last week. (Ron Bourret's recent posts on suggestions for clarifying the Namespaces in XML spec, and contrasting it with XML 1.0's, were especially well done and much appreciated.)

I thought of a couple things the W3C might do to help ensure the quality of standards *documents* (vs. the quality of the standards-behind-the-documents). Both of these suggestions would apply only to documents at the WD or later stage -- NOTEs would be exempt.
(1) Templates (or at least, guidelines): This seems so obvious that I can't believe it's not already being done; a few comments in the last week imply, however, that each WG sort of goes about the preparation of the document with its own ideas, more or less, of what a spec should look like and how much detail it should contain. I'm thinking here of DTDs for WDs, PRs, and RECs. The content model for the WD level might contain elements like "openissue" that would be absent from the latter two levels -- or relegated to "for future consideration" appendices. I really liked the approach suggested by Ron (I think; apologies if I'm either misattributing the idea or misrepresenting Ron) -- formalism, narrative, examples. That suggests the main content model for each topic and sub-topic in a spec. (2) Establish editorial-review committees at least at the level of the W3C's four "domains": User Interface, Technology & Society, Architecture, and the Web Accessibility Initiative. (Depending on resource requirements and availability, this might better be pushed down to the level of individual activities within those domains.) Because these committees would be familiar with the broad issues as well as, perhaps, some of the details, but not involved at the nitty-gritty level of thinking of the spec writers, I'd think they'd be good stand-ins for the target audience(s). In order not to bog down the drafting process, a given spec's editorial review might be required no sooner than the transition to PR, but definitely before becoming a REC. My apologies if I'm speaking out of turn here. I'm not a member (nor is my employer) of the W3C; these just seemed to be two reasonable, non-onerous approaches to ensuring clarity and consistency in published specs. Best, JES ============================================================= John E. Simpson | It's no disgrace t'be poor, simpson@polaris.net | but it might as well be. | -- "Kin" Hubbard xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From RDaniel at DATAFUSION.net Mon Feb 8 15:37:22 1999 From: RDaniel at DATAFUSION.net (Ron Daniel) Date: Mon Jun 7 17:08:50 2004 Subject: Colonialism, SAX, Java, and Namespaces Message-ID: <0D611E39F997D0119F9100A0C931315C4123B5@datafusionnt1> Tyler Baker says: > The > fact that no one is using this supplement to XML called "Namespaces in > XML" ... I'm sorry Tyler, but that is just not true. I, for one, am using namespaces in a commercial product. Dave Megginson has stated several times on this list that he is using namespaces in things he is selling his customers. Specifications that are getting ready to come out, such as RDF and XSL, are using them. This is because they solve a particular problem - groups can work independently on their specifications without having to establish a centralized registry of element and attribute names. You will see more use of namespaces over time, especially as groups define sets of elements with the intent that people take and reuse them. The basic ideas of the namespaces spec are really simple: 1) In order to prevent name collisions we need a way to uniquify the names, associating a URI with them is one way that fits in well with the web ethos. 2) Simply concatenating the URI and the name to be qualified would be one way to do it, but the names would be very long and they wouldn't be legal XML 1.0. 3) The xmlns: attribute lets us define abbreviations for the URIs (the prefixes) so now we can get pretty short unique names that will be legal. Things get a bit more involved with the scoping and defaulting rules, but not that much. Regards, Ron Daniel Jr. 
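The three points above can be tried out with any namespace-aware parser. What follows is only a minimal sketch, assuming Python's standard xml.etree.ElementTree module (nothing in this thread uses it): the parser expands each prefix to its URI and reports names in a {URI}local form, so the prefix itself is just an abbreviation. The helper name expanded_names and the second prefix "q" are made up for illustration.

```python
import xml.etree.ElementTree as ET


def expanded_names(xml_text):
    """Return (element name, sorted attribute names) as a namespace-aware parser sees them.

    Hypothetical helper, for illustration only.
    """
    elem = ET.fromstring(xml_text)
    # ElementTree expands each prefix to its URI and writes the result as
    # "{uri}local". The xmlns declarations themselves are consumed by the
    # parser and are not reported as attributes.
    return elem.tag, sorted(elem.attrib)


# The same element written with two different prefixes bound to one URI.
with_z = '<z:a z:b="x" c="y" xmlns:z="http://www.foo.com/"/>'
with_q = '<q:a q:b="x" c="y" xmlns:q="http://www.foo.com/"/>'

print(expanded_names(with_z))
# The prefix is only an abbreviation: both spellings yield the same names.
print(expanded_names(with_z) == expanded_names(with_q))
```

Under these assumptions the unprefixed attribute "c" stays a plain local name, while the element and the prefixed attribute pick up the URI, whichever prefix was used to abbreviate it.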
DATAFUSION, Inc. 139 Townsend Street, Ste. 100 San Francisco, CA 94107 415.222.0100 fax 415.222.0150 rdaniel@datafusion.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Feb 8 15:42:20 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:50 2004 Subject: [corrected] 19 Short Questions about Namespaces (with Answers) In-Reply-To: <14014.60982.658099.349181@localhost.localdomain> References: <14014.60982.658099.349181@localhost.localdomain> Message-ID: <14015.1272.25782.171309@localhost.localdomain> [The inevitable typo showed up -- thanks to Adam Donahue and John Simpson for pointing it out, and even just for reading far enough to find it.] 19 SHORT QUESTIONS ABOUT NAMESPACES (WITH ANSWERS) by David Megginson Monday 8 February 1999 (v.2) BACKGROUND ---------- For the full specification of XML 1.0, see [1]; for the full specification of Namespaces in XML, see [2]. This brief review uses James Clark's notation for writing names that contain both a URI part and a local part. For example, if the URI part of a name were "http://www.foo.com/" and the local part were "a", the name would be written {http://www.foo.com/}a This is purely a convenience notation for the sake of documentation; it is not defined by any known specification, and is unlikely to be recognised by any processor. CHAPTER ONE: The XML 1.0 Perspective ------------------------------------ [Example] <a b="x" c="y"/> [Q] What is the name of the element in the example above? [A] The name is "a". [Q] What is the name of the first attribute in the example above? [A] The name is "b". 
[Q] What is the name of the second attribute in the example above?
[A] The name is "c".

[Q] What do the names mean?
[A] The application determines the meaning of the names.

[Q] How do you write a DTD declaration describing the structure of this element?
[A] <!ELEMENT a EMPTY>
    <!ATTLIST a
      b CDATA #IMPLIED
      c CDATA #IMPLIED>

CHAPTER TWO: The Namespaces Perspective
---------------------------------------

[Example 2a]

<z:a z:b="x" c="y" xmlns:z="http://www.foo.com/"/>

[Q] What is the name of the element in the example above?
[A] The name is "z:a" from the XML 1.0 perspective, or "{http://www.foo.com/}a" from the Namespaces perspective.

[Q] What is the name of the first attribute in the example above?
[A] The name is "z:b" from the XML 1.0 perspective, or "{http://www.foo.com/}b" from the Namespaces perspective.

[Q] What is the name of the second attribute in the example above?
[A] The name is "c" from both the XML 1.0 and the Namespaces perspectives.

[Q] What is the name of the third attribute in the example above?
[A] The name is "xmlns:z" from the XML 1.0 perspective; from the Namespaces perspective, this attribute is a declaration.

[Q] What do the names mean?
[A] The application determines the meaning of the names.

[Q] What does the namespace URI "http://www.foo.com/" mean?
[A] It has no defined meaning.

[Q] How do you write a DTD declaration describing the structure of this element?
[A] DTDs use the XML 1.0 perspective:

    <!ELEMENT z:a EMPTY>
    <!ATTLIST z:a
      z:b CDATA #IMPLIED
      c CDATA #IMPLIED
      xmlns:z CDATA #FIXED "http://www.foo.com/">

[Example 2b]

<a b="x" c="y" xmlns="http://www.foo.com/"/>

[Q] What is the name of the element in the example above?
[A] The name is "a" from the XML 1.0 perspective, or "{http://www.foo.com/}a" from the Namespaces perspective.

[Q] What is the name of the first attribute in the example above?
[A] The name is "b" from both the XML 1.0 and the Namespaces perspectives.

[Q] What is the name of the second attribute in the example above?
[A] The name is "c" from both the XML 1.0 and the Namespaces perspectives.

[Q] What is the name of the third attribute in the example above?
[A] The name is "xmlns" from the XML 1.0 perspective; from the Namespaces perspective, this attribute is a declaration.

[Q] What do the names mean?
[A] The application determines the meaning of the names.

[Q] What does the namespace URI "http://www.foo.com/" mean?
[A] It has no defined meaning.

[Q] How do you write a DTD declaration describing the structure of this element?
[A] DTDs use the XML 1.0 perspective:

    <!ELEMENT a EMPTY>
    <!ATTLIST a
      b CDATA #IMPLIED
      c CDATA #IMPLIED
      xmlns CDATA #FIXED "http://www.foo.com/">

REFERENCES
----------

[1] http://www.w3.org/TR/REC-xml
[2] http://www.w3.org/TR/REC-xml-names

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From JEROME.YUROW at hq.doe.gov Mon Feb 8 15:55:00 1999
From: JEROME.YUROW at hq.doe.gov (JEROME.YUROW@hq.doe.gov)
Date: Mon Jun 7 17:08:50 2004
Subject: "Clean Specs"
Message-ID: <M4286305985.002.uexi1.1.990208155605Z.CC-MAIL*/O=HQ/PRMD=USDOE/ADMD=ATTMAIL/C=US/@MHS>

Message authorized by: : paul@prescod.net_at_INTERNET at X400PO
-------------- next part --------------

Paul Prescod wrote:
>What is a "clean spec?"
>People in this discussion are mixing together a variety of things
>that I do not consider the same.
>Uche Ogbuji wrote:
>> I have worked on teams implementing DOM, XSL and parts of XLL. Some users
>> might grant that we have been "succeeding", but let me assure you that if so,
>> it is despite the W3C specs, not because of them. DOM, especially is
>> unforgivably inconsistent, incomplete, and unclear for a production-ready
>> (1.0) specification.
>There is not a SINGLE person on this mailing list that would say that
>it is right to create specifications that are inconsistent, incomplete
>or unclear. It is beyond doubt that specifications in the XML family,
>including XML itself, have these problems. The question is how to
>avoid that?
>* some people say that what the spec needs is "more English". But much of
>the problem with the namespaces specification comes from ambiguity in
>the English.
>* some people say that we need "more non-normative text" but once again,
>it is a non-normative appendix that is confusing people.
>What almost nobody has suggested is that we need more formal
>notation. Now if we look at James Clark's "clarifying" document what
>we see is *less* English text and *more* notation.
>As David Megginson has pointed out, when a spec. invents a notation
>to explain its abstract concepts the spec. becomes temporarily harder
>to read because you have to learn the notation before the
>specification. This will turn people off. I remember that algebraic
>notation turned me off in my first year math classes. But in the
>*long run* it makes life easier for everyone. Implementors have
>precise definitions of what the hell they are supposed to implement.
>End-users get software they can use. People reading the specification
>for their own education and edification will understand it better --
>if they persevere through the task of learning the notation.
>I know that W3C spec. writers are under pressure to use more
>normative English and less normative notation. This is not a vote for
>"clean" specifications. It is a vote for messy, hard-to-implement
>ones. Where is Dan Connolly when you need him?
Let's face it, folks, people do their best work when they're allowed to do what they do best. With this in mind, I would like to make a modest proposal to future standards writing committees:

(1) The job of the people actually writing the standard is to concentrate on precision and elegant simplicity, even if this means using the notation of mathematics or symbolic logic.

(2) All standards writing committees must have one or more staff members who are technical writers. The staff members' job is to be "intelligent laymen (lay persons?)", to listen to the proceedings, ask questions to help with their own understanding, and to explain and interpret the standards in operational terms to the rest of us.

(3) All standards will be published as "annotated standards" with the annotations published in alternating blocks of text, perhaps in a different type font.

The object of this proposal is that it allows each group of persons to do what it does best and ensures that there will be no lag time between the publication of the standard and the publication(s) of interpretations.

______________________________ Forward Header __________________________________
Subject: "Clean Specs"
Author: owner-xml-dev@ic.ac.uk_at_INTERNET at X400PO
Date: 2/7/99 10:42 AM

What is a "clean spec?" People in this discussion are mixing together a variety of things that I do not consider the same.

Uche Ogbuji wrote:
> I have worked on teams implementing DOM, XSL and parts of XLL. Some users
> might grant that we have been "succeeding", but let me assure you that if so,
> it is despite the W3C specs, not because of them. DOM, especially is
> unforgivably inconsistent, incomplete, and unclear for a production-ready
> (1.0) specification.

There is not a SINGLE person on this mailing list that would say that it is right to create specifications that are inconsistent, incomplete or unclear. It is beyond doubt that specifications in the XML family, including XML itself, have these problems.
The question is how to avoid that?

* some people say that what the spec needs is "more English". But much of the problem with the namespaces specification comes from ambiguity in the English.

* some people say that we need "more non-normative text" but once again, it is a non-normative appendix that is confusing people.

What almost nobody has suggested is that we need more formal notation. Now if we look at James Clark's "clarifying" document what we see is *less* English text and *more* notation.

As David Megginson has pointed out, when a spec. invents a notation to explain its abstract concepts the spec. becomes temporarily harder to read because you have to learn the notation before the specification. This will turn people off. I remember that algebraic notation turned me off in my first year math classes. But in the *long run* it makes life easier for everyone. Implementors have precise definitions of what the hell they are supposed to implement. End-users get software they can use. People reading the specification for their own education and edification will understand it better -- if they persevere through the task of learning the notation.

I know that W3C spec. writers are under pressure to use more normative English and less normative notation. This is not a vote for "clean" specifications. It is a vote for messy, hard-to-implement ones. Where is Dan Connolly when you need him?

Let me point out: the CSS and HTML specifications are easy to read because everyone who wants to read them already understands the basic concepts of web pages and layout. They are concrete implementations of ideas we already understand. The same goes for Java. (and yes, I read the Java specification, but AFTER I already knew Java)

I wonder if anyone on this list has ever learned a language that was radically different from what they already knew through the language specification: Scheme, Prolog, APL?

Technical writing is damn hard.
When you do it right, you must make certain decisions about ordering of concepts and revelations that are the exact opposite of what you do in writing a specification. When you write a spec., you need to present things in the order of fundamental building blocks to high level concepts. In technical writing you will put your students to sleep if you zoom in on details before explaining the general framework. I've told people who want to read the DSSSL specification to start at the back and work forwards. Of course once they become implementors then they read it the opposite way around.

Life is hard for implementors. I'd rather be forced to implement the entire suite of XML specifications over HTML/CSS 2. That isn't a slight against HTML/CSS, it's just an attempt to put things in perspective.

Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey

From tyler at infinet.com Mon Feb 8 15:55:16 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:51 2004
Subject: CORBA's not boring yet. / XML in an OS?
References: <4EB4281B.662222A5@darmstadt.gmd.de> <36B9FA0C.11FBFBC@infinet.com> <36BEBEA7.DDB98BFB@darmstadt.gmd.de>
Message-ID: <36BF0850.D0E15914@infinet.com>

Robb Shecter wrote: > Tyler Baker wrote: > > > I wrote: > > > >...this month's Linux Journal - > > > it describes how -both- up and coming desktop environments are basing > > > major parts of their architectures on CORBA.
KDE's so cool it makes me > > > want to learn C++. :) > > > > > Prediction: In 3 years, half the people on this list will be using a > > > corba-based desktop environment. > > > > Not likely. My biggest problem with CORBA was that it was too huge for the client and > > consumed too many resources. > > Actually, I was implying that half of us will be using Linux, and therefore Corba, because it's > now in the desktop environments. I -was- exaggerating, but I think it's not extreme to predict > that in 2-3 years Linux (with KDE or GNOME) will have broken out of the hacker-only world, and > onto the desktop.

After not using Linux since my college days, I bought a copy of RedHat 5.2 last weekend. I would be using Linux already except that the JDK for Linux (Kaffe) has not been up to speed with Java 2 (which I think is mostly SUN's fault). Kaffe is a great VM and the work that TransVirtual is doing with Java has been very impressive.

I am not so sure about your latter statement though. That is more a function of marketing than anything. Linux is a lot easier to install and is a lot easier to use, but it still is not what I would consider "dumb" enough for the masses who use Windows 98 and the iMac.

Tyler

From ricko at allette.com.au Mon Feb 8 16:01:08 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:08:51 2004
Subject: "Clean Specs"
Message-ID: <000001be537c$742ed930$4ff96d8c@NT.JELLIFFE.COM.AU>

From: Ronald Bourret <rbourret@ito.tu-darmstadt.de>

Tim Bray wrote:
>> I think that (namespace) spec is *way* better than the XML spec.
I think the XML Spec is pretty good, actually. Tim and the others did a great job.

>I've quoted this out of order because I think it is a very important point
>and one that has bothered me during this whole discussion. Tim is
>absolutely correct here -- the namespaces spec is *way* better than the XML
>spec.

The first draft-parts of the namespace spec (Appendix A) are lousy. And I think they are incorrect. I am attaching the comment I sent in to the namespace effort (alas too late), in the hope that some people might find it interesting or useful. I don't want to put it on a public website, because, having had my chance and having had my opinion not taken up, I think it is poor sportsmanship to continue whinging.

I pushed hard early on for the PI approach. But I changed my mind for one reason only: the need to support HTML-in-XML and XML-in-HTML. The major application of namespaces may well be embedding things in HTML: the PI option is not realistic for a couple of years. To be honest, I don't think Namespaces would have been acceptable to HTML users with the PI option. The need to support HTML developed as a goal during the namespace discussions, and I consider it the key tradeoff factor.

>> This group is notably and vocally dissatisfied with the specs, I
>> am watching with attention for concrete suggestions as to how
>> to make future specs better - the one premise that seems to get
>> consensus, in this group at least, is "more examples

I am attaching my comment. Appendices A.2 and A.3 are poor in thought, and close off nice doors that should be kept open.

>> Having said all that, people who write specs always have to try to
>> do a better job next time, so this recent discourse is very very useful.
>
>Thanks for listening. I hope this has been helpful.

At ISO now, you have to have a user model for who you are writing the spec for. Having a target education and technical background for your readers is a great discipline.
Perhaps specs should clearly include at their head a notice stating the intended readers.

Rick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990208/21e122a9/namespacecomment.html

From shecter at darmstadt.gmd.de Mon Feb 8 16:11:13 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:08:51 2004
Subject: CORBA's not boring yet. / XML in an OS?
References: <4EB4281B.662222A5@darmstadt.gmd.de> <36B9FA0C.11FBFBC@infinet.com> <36BEBEA7.DDB98BFB@darmstadt.gmd.de> <36BF0850.D0E15914@infinet.com>
Message-ID: <36BF0C18.3B27D6D1@darmstadt.gmd.de>

Tyler Baker wrote: > Robb Shecter wrote: > > ... I think it's not extreme to predict > > that in 2-3 years Linux (with KDE or GNOME) will have broken out of the hacker-only world, and > > onto the desktop. > > ... I am not so sure about your latter statement though. That is > more a function of marketing than anything. Linux is a lot easier to install and is a lot easier > to use, but it still is not what I would consider "dumb" enough for the masses who use Windows 98 > and the iMac. >

I can see this. I think that the number of "dumb" masses is diminishing, though. There was an -excellent- article a couple of years back in CACM: "The Anti-Mac Interface". The authors' premise was that when the Mac was first made, it was the perfect interface. It was developed for people who had never used a computer before, and who would manage several applications and a few dozen documents.

However, that doesn't work anymore for the new generation of computer users: They are growing up with a Nintendo in their hands, and want to use a computer to manipulate dozens of applications, and thousands of documents. In this environment, WYSIWYG breaks down to "what you see is all you get". It essentially sends us back to pre-civilization when, instead of using language, we had to just point with a finger.
In this new context, and with KDE getting easier to use, the Linux era will be starting sooner rather than later. I also think that this implies that we shouldn't be afraid to make "complex" solutions. But anyhow, I do share some of your scepticism, and will be -extremely- happy if Linux can go where only Windows and Mac are today. - Robb xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lauren at sqwest.bc.ca Mon Feb 8 16:14:36 1999 From: lauren at sqwest.bc.ca (Lauren Wood) Date: Mon Jun 7 17:08:51 2004 Subject: "Clean Specs" In-Reply-To: <4.1.19990208091225.00bbbca0@steptwo.com.au> References: <36BDAF1D.A908F8EF@prescod.net> Message-ID: <199902081614.IAA02859@sqwest.bc.ca> On 8 Feb 99, at 9:14, James Robertson wrote: > Well, has anyone considered employing real, professional technical > authors to write the specifications? As chair of the DOM WG, I (and I think the editors of the specs) would be overjoyed were someone to volunteer the services of a real, professional technical author who could help in the process of getting good specs out the door. However, as has been pointed out by others on this list, this support is difficult to find, as W3C seldom has these resources available. Lauren xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Mon Feb 8 16:29:02 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:51 2004 Subject: "Clean Specs" References: <01BE5360.62756860@grappa.ito.tu-darmstadt.de> Message-ID: <36BF1055.CBB1D9EF@infinet.com> Ronald Bourret wrote: > Tim Bray wrote: > > > And as regards the namespace spec, I think that some people on this > > list are substantially full of shit, and are wilfully refusing to see > > how simple it is because it does not meet their own design prejudices. > > I think that spec is *way* better than the XML spec. > > I've quoted this out of order because I think it is a very important point > and one that has bothered me during this whole discussion. Tim is > absolutely correct here -- the namespaces spec is *way* better than the XML > spec. There is absolutely no comparison, both from an organization and a > writing (sentence-level) point of view. > > I think there are three other things to remember with respect to the > namespaces spec. First, no matter how you cut them, namespaces turned out > to be far more difficult than any of us could have imagined. We didn't, as > is the case in most programming languages, simply get them for free based > on where we declared our variables -- we had to decorate variables > ourselves. Thus, I think the discussion has often confused frustration > about technology with frustration about spec writing. > > Second, watching a spec evolve is not always the best way to judge how good > it is. 
Throughout this discussion, I have wondered how I would have felt > about the namespaces spec had I seen it for the first time in its completed > form. No doubt I would have had some confusion, but I also would not have > been carrying pre-conceived notions from one version to the next, which is > where a lot of my confusion about the relation between attributes and > namespaces came from. > > Finally, we have to remember that namespaces appear to work -- I have yet > to hear any examples on this list where they don't. Until such examples > arise, I think we have more to gain by agreeing to use namespaces than by > going off in our own directions.

Well, that depends on how you define the word "work". If you define working as being able to run some pregenerated example, then I guess they work. If you define working as being manageable at the application level then "Namespaces in XML" is totally broken.

Take for example some of the current DOM implementations, for instance Oracle and SUN's. Both allow you to build a DOM Document from a file using namespaces, but if you want to mutate the DOM tree, you are in a quandary because the node name (which in namespaces parlance is a QName) has no context to resolve the prefix. The same problem arises if you copy a node and insert it somewhere else in the document. The DOM Element and Attr interface would need to have a method such as:

void setNodeName(String prefix, String namespace, String localPart)

as a hack just to make things barely work.

In effect "Namespaces in XML" makes using the DOM completely useless. In fact, in order for the DOM to be made useful in the presence of "Namespaces in XML" you would have to make a lot of changes that are not backwards compatible with the Level 1 recommendation. This in practical terms would make using the DOM in an XSL Processor pretty much pointless (all XSL Processors I know of other than XT use the DOM as the source tree, and some even use it as the stylesheet tree as well).
If you mutate the source tree, then everything is hosed.

Beyond the DOM, in application frameworks which have some serialization to XML mechanism for components or whatever, you now will have output with random prefixes which makes XML about as attractive to use as EDI transactions. Just to figure out what the heck you are working with requires hunting the entire document for instances of "xmlns:". Some may shrug this off as "so what" but if someone is using someone else's data and some problem is encountered, manually mapping these prefixes to something tangible like a namespace is quite a chore. It is almost the same as hunting down memory leaks when using languages with pointers. Java does not include pointers for the main purpose of removing memory management from the programmer. "Namespaces in XML" now gives us prefix management or even namespaces management depending on how you look at it.

As someone else pointed out earlier, we'll see if anyone (that is, end-users and web sites) will ever bother making their lives more difficult by using "Namespaces in XML" for developing web-site content. My odds-on bet is everyone will be marching in lock-step to support namespaces, but no one will ever actually use it (kind of like what happened with the "push" and "channels" craze between MS and Netscape in 1996).

Tyler

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Mon Feb 8 16:32:15 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:08:51 2004 Subject: Clear specs: suggestions Message-ID: <014101be5380$3ceae960$a5addccf@ix.netcom.com> John wrote: <<(2) Establish editorial-review committees at least at the level of the W3C's four "domains":>> I think we need to distinguish between authors, technical editors, and English editors. Usually a committee is a complete disaster for the last group, and produces the kind of compromised documents that we have been complaining of. Frank Frank Boumphrey XML and style sheet info at Http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (March) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From f.lindgren at upright.se Mon Feb 8 16:55:20 1999 From: f.lindgren at upright.se (Fredrik Lindgren) Date: Mon Jun 7 17:08:51 2004 Subject: CORBA's not boring yet. / XML in an OS? 
References: <4EB4281B.662222A5@darmstadt.gmd.de> <36B9FA0C.11FBFBC@infinet.com> <36BEBEA7.DDB98BFB@darmstadt.gmd.de> <36BF0850.D0E15914@infinet.com> <36BF0C18.3B27D6D1@darmstadt.gmd.de>
Message-ID: <36BF15A9.AD87CEEB@upright.se>

Robb Shecter wrote:
> [snip]
>
> I think that the number of "dumb" masses is diminishing, though. There was an -excellent- article a
> couple of years back in CACM: "The Anti-Mac Interface". The authors' premise was that when the Mac
> was first made, it was the perfect interface. It was developed for people who had never used a
> computer before, and who would manage several applications and a few dozen documents.
>

The article I guess you are referring to was written by Don Gentner and Jakob Nielsen and can be found at:

http://www.acm.org/cacm/AUG96/antimac.htm

It's been a while since I last read it, but I remember liking it.

/Fredrik Lindgren
Upright Engineering AB.

From avirr at LanMinds.Com Mon Feb 8 16:57:46 1999
From: avirr at LanMinds.Com (Avi Rappoport)
Date: Mon Jun 7 17:08:51 2004
Subject: "Clean Specs"
In-Reply-To: <3.0.5.32.19990207174048.007c69c0@amati.techno.com>
References: <4.1.19990208092325.00c4bc30@steptwo.com.au> <3.0.32.19990207124243.00ba7eb0@pop.intergate.bc.ca>
Message-ID: <v04104603b2e4c5fd3f49@[207.33.50.55]>

Perhaps in this modern world, some of the rather large fees charged by W3C for membership could go towards hiring some technical writers to address this issue.
IMNSHO, the amount of time that we've all spent thrashing about with namespaces is an example of intelligence, time and energy that could have been saved by a standard that addressed some of the issues better.

If standards are the way we'll do business (and I'm all for that!) then why not invest in the best possible standards up front? Just because IETF and other traditions made do without doesn't mean that we should be penny wise and pound foolish now. Clarity is a net gain for W3C members, and for the larger community, as the cost of incompatible implementations is significant.

Avi

At 5:40 PM -0600 2/7/99, W. Eliot Kimber wrote:
> The XML WG was an all-volunteer project, as are most standards efforts.
> Those of us who participated did so primarily as a personal commitment, not
> as something our employers (those of us who have them) pay us to do.
>
> Standards development is not a commercial process--there is no budget from
> which technical writers might be hired. The W3C only administers, it does
> not fund. Same for ISO. Some national bodies do fund some standards
> development (BSI, the British Standards Institute), but that funding will
> tend to be used to support the technologists developing the standard and
> not writers crafting the words.
>
> So while it's true that most, if not all, specifications could benefit from
> professional writers, it usually isn't an option for standards developers.

________________________________________________________________
Avi Rappoport, Search Tools Maven: <mailto:avirr@lanminds.com>
Guide to Site Indexing and Local Search Engines: <http://www.searchtools.com>

From simpson at polaris.net Mon Feb 8 17:00:54 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun 7 17:08:51 2004
Subject: Clear specs: suggestions
Message-ID: <3.0.32.19990208115621.007bf2c0@polaris.net>

At 11:29 AM 2/8/99 -0500, Frank Boumphrey wrote:
>[I] wrote:
>
><<(2) Establish editorial-review committees at least at the level of the
>W3C's four "domains":>>
>I think we need to distinguish between authors, technical editors, and
>English editors.

That's a good distinction to make.

>Usually a committee is a complete disaster for the last group, and produces
>the kind of compromised documents that we have been complaining of.

"Committee" really wasn't a good word choice. (Sheesh, where's an editor when I need one...?) What I'm thinking of is actually more like a *pool* of individuals for each of the problem domains/activities who could be drawn upon, one for each spec, to clarify and smooth out the rough-hewn bits in the verbiage.

Best,
JES

=============================================================
John E. Simpson     | It's no disgrace t'be poor,
simpson@polaris.net | but it might as well be.
                    | -- "Kin" Hubbard

From clark.evans at manhattanproject.com Mon Feb 8 17:24:16 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:51 2004
Subject: Clear specs: Audience
References: <014101be5380$3ceae960$a5addccf@ix.netcom.com>
Message-ID: <36BF1CC1.5ABA71A8@manhattanproject.com>

If it's of any value, I've always seen 4 types of documentation (partitioned by audience):

a) Business Process Documentation

This type of documentation describes what the business purpose for the system/specification is. This is, essentially, a white paper that introduces the primary ideas in an understandable way to a general audience. It serves as a good introductory document for newbies. For XML, this type of documentation should describe the goals. What XML is intended to be, and what it is _NOT_ intended to be.

b) User Documentation

This type of document presents the material in a way that is understandable by a user of the software system / specification. This type of documentation tends to be more of a tutorial guiding the user through actual practice exercises. This documentation usually also has a reference summary. AKA, teach via example.

c) System Administration Documentation

This type of documentation is aimed at people who administer the usage of the software/spec by a body of people. It is concerned with concurrency, collaboration, maintenance, standards, scalability, configuration, etc.

d) Programmer's Documentation

This type of documentation discusses the design of the system and discusses how the/a given implementation would/does work.
Assume that this type of reader has a good understanding of formal systems... and leverage the power that comes from formal language. Use predicate logic, pre/post conditions, petri-nets, state transition diagrams, whatever helps. To be nice, footnote the language with a book/url to help the reader get up to speed. Categorically ignore any complaints by "programmers" that can't understand the formal language. By dumbing down this type of documentation you strip away the essence of the field and further lower the quality standards in the industry.

--

I have found that dividing my documentation into these four audience categories has _always_ helped a great deal. When you mix the audience for the documentation, you end up with something that is painful to read for all parties.

With SGML, you could *even* store everything in a single file, and then extract the parts to create the various documents. I haven't tried this, but I'm theorizing that it would help to improve consistency among the documents... which is always a problem.

My $.02,

:) Clark
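
The single-source idea in the message above (one file, one extracted document per audience) can be sketched roughly as follows. This is only an illustration in Python using XML rather than SGML; the `<spec>` and `<section>` element names, the `audience` attribute, and the helper names are all made up for the example and come from nothing in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source document: every <section> is tagged with the
# audience it is written for. Element and attribute names are assumptions.
SOURCE = """\
<spec>
  <section audience="business"><title>Goals</title></section>
  <section audience="user"><title>Tutorial</title></section>
  <section audience="admin"><title>Deployment</title></section>
  <section audience="programmer"><title>Formal grammar</title></section>
  <section audience="user"><title>Reference summary</title></section>
</spec>
"""


def extract_for(audience, source=SOURCE):
    """Return a new <spec> tree holding only the sections for one audience."""
    root = ET.fromstring(source)
    out = ET.Element(root.tag, {"audience": audience})
    # ElementTree's limited XPath supports [@attrib='value'] predicates.
    for section in root.findall("section[@audience='%s']" % audience):
        out.append(section)
    return out


def titles(tree):
    """List the section titles in document order."""
    return [t.text for t in tree.iter("title")]


if __name__ == "__main__":
    for who in ("business", "user", "admin", "programmer"):
        print(who, titles(extract_for(who)))
```

Because every audience-specific document is cut from the same file, a correction made once in the source shows up in each extract, which is the consistency benefit the message speculates about.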

From jtauber at jtauber.com Mon Feb 8 17:47:41 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:08:51 2004
Subject: an initial idea for an XML überdocument shell (was Re: CORBA's not boring yet. / XML in an OS?)
Message-ID: <008801be538b$3326e860$0300000a@othniel.cygnus.uwa.edu.au>

[Isn't it funny the way you can carry around a crazy idea in your head for ages and then, out of nowhere comes just the discussion to trigger externalisation]

AN INITIAL IDEA FOR AN XML ÜBERDOCUMENT SHELL

What I would like to see initially is a shell-like application with an interactive command line that takes shell-like notions such as a working directory (and the ability to change same), starting of applications, redirecting of input/output to/from files, and piping to other applications, and applies them to an XML überdocument.

So this "shell" would have the notion of a working element (command 'pwe' (=pwd) will tell you what the working element is). You can change working element with the command 'ce' (=cd) followed by an XPointer.

Elements contain XML content *or* they could reference an unparsed entity (for the issue of whether by ENTITY attribute or XLink see below). Some unparsed entities (perhaps with an appropriate NOTATION) are applications that can be "run". Instead of files, these applications work on nodes in the überdocument element tree.

I imagine that applications would be a lot more modular as most of them would be working on exposed data structures. Rather than a monolithic email/PIM application, you'd have simple applications (applets?
no; how about application elements => "applements"). One applement would POP your mail and graft it on to an element in the überdocument. Another (perhaps just XT running an XSL stylesheet) would list the subject headings. Another would enable you to read email. An editor applement would let you compose a reply message and then a final applement would send the mail via SMTP.

A GUI can come later, but for now, I'd love to see an implementation of what I've just described. In something like Python it should take no time at all to do.

<SideBar>
Is the überdocument a single XML document with multiple entities or more than one XML document? At first I thought that entities would provide the perfect mechanism for an XML überdocument to be spread over multiple files. For at least two reasons, I now suspect XLink might be the way to go:

1) you can give the links semantics which might prove to be very useful
2) you avoid the document entity != legal external parsed entity problem I raised in an earlier post

That having been said, it is important to note that the whole point of the "überdocument" notion is that it is logically treated (perhaps not at the XML parser level but at a level not too much higher up) as a single document. Changing working element involves giving an XPointer *not* URI+XPointer.
</SideBar>

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net
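
As a rough illustration of the "working element" notion described above, here is a toy sketch in Python. It is not an implementation of the proposal: `ce` takes a single child element name (or `..` or `/`) where the message calls for an XPointer, there are no applements, and the `UberShell` class, its `ls` method, and the sample document are all invented for the example.

```python
import xml.etree.ElementTree as ET


class UberShell:
    """Toy 'working element' shell over one in-memory XML tree."""

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)
        # Chain of elements from the root down to the working element.
        # ElementTree has no parent pointers, so the chain is kept here.
        self.path = [self.root]

    def pwe(self):
        """Report the working element (the analogue of pwd)."""
        return "/" + "/".join(e.tag for e in self.path)

    def ce(self, step):
        """Change working element (the analogue of cd).

        '/' returns to the root, '..' moves up one level, and any other
        value is taken as the tag name of a direct child (first match).
        """
        if step == "/":
            self.path = [self.root]
        elif step == "..":
            if len(self.path) > 1:
                self.path.pop()
        else:
            for child in self.path[-1]:
                if child.tag == step:
                    self.path.append(child)
                    return
            raise KeyError("no such child element: %s" % step)

    def ls(self):
        """List the tag names of the working element's children."""
        return [child.tag for child in self.path[-1]]


if __name__ == "__main__":
    doc = "<uber><mail><inbox><message/><message/></inbox></mail><apps/></uber>"
    sh = UberShell(doc)
    sh.ce("mail")
    sh.ce("inbox")
    print(sh.pwe())
    print(sh.ls())
```

A fuller version would resolve real XPointers and would treat some elements as runnable applications; both are left out here because the message does not pin down how they would work.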

From paul at prescod.net Mon Feb 8 17:49:15 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:51 2004
Subject: document vs non-document entity (was Re: CORBA's not boring yet. / XML in an OS?)
References: <002101be52be$e92480a0$0300000a@othniel.cygnus.uwa.edu.au> <36BEE1B6.8108B0C9@mecomnet.de>
Message-ID: <370CD761.81C6D890@prescod.net>

james anderson wrote:
>
> This raises the question: "What is the extent of an entity binding?"
>
> Assume that it is possible to qualify entity names. Since these qualifications
> are already lexically scoped, why would one need to introduce local entity
> definitions? The effective entity is already determined by a binding with a
> lexical scope.

The reason for local definitions is simple. Because of maintenance, visibility and usability concerns it makes sense to have an entity declaration as close as possible to the logical use of that declaration. If a "chapter entity" is to be independently authored then its entity declarations should travel with it. This is MUCH more convenient with local entities.

> Oops! Yea, I forgot, there is no way to bind a prefix in the DTD (ie "the
> point of definition"). Got to wait for schemas for this.

I don't believe that entity declarations have any place in schemas. It seems to me that the identification of resources is a separate issue from the validation of structure.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey

From spreitze at parc.xerox.com Mon Feb 8 18:04:49 1999
From: spreitze at parc.xerox.com (spreitze@parc.xerox.com)
Date: Mon Jun 7 17:08:51 2004
Subject: A New Hope (was Re: Storing Lots of Fiddly Bits (was Re: What is XML for?))
In-Reply-To: ""Rick Jelliffe" <ricko@allette.com.au>'s message of Fri, 5 Feb 1999 08:00:22 PST"
Message-ID: <99Feb8.100345pst."834439"@idea.parc.xerox.com>

> HyTime lets you label edges too... The
> question is, should such labelling be part of the language at the
> lexical level (which XML deals with) or a further layer.

The nature of XML instances is fairly fixed now; the nature of XML schemas is being designed as we speak. I am willing to conceive of an XML schema language that can express schemas against a higher-level data model, plus a mapping of that data model into XML instances.

> It is the old
> tradeoff that a general purpose system will (probably) be worse at any
> specific task than a specific system.

I didn't think we were debating whether XML instances or schemas will be general purpose --- of course they will! That's beside the point of what level(s) of abstraction and representation will be addressed by those formalisms.

> But Mike's comments do betray a wish that XML operated on
> some other level than the strictly lexical: but it doesn't, except by
> chance.

I think I've heard a number of people use the term "XML" to describe a data model independent of textual expression (and others tell them they're wrong). It's the richness of that data model (and/or its textual expressions) that I'm addressing.
We've already lost the battle for XML instances; for schemas, the outcome is as yet undetermined.

From paul at prescod.net Mon Feb 8 18:47:08 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:52 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
References: <19990208104624.A4862@io.mds.rmit.edu.au>
Message-ID: <370CE4A4.76D43491@prescod.net>

Marcelo Cantos wrote:
>
> ... [best of both worlds] ...
> You get a nice object oriented layer on
> top to talk to, and an industrial strength, robust repository
> underneath.
>
> Your comments give me the impression that this is unacceptable to you
> in the XML/hierarchical universe. You don't want DOM at any level.
> You insist on going straight to objects. It is not even good enough
> to build an object layer on top of the DOM layer. I find this a
> little implausible and hence am certain that you had something else in
> mind. Is it rather that you simply don't care what the underlying API
> is, that you are only interested in what happens at the object level?

If I had evidence that a bottom-level XML/"DOM" layer would "buy me" an industrial strength, robust repository then I would go for it. As you have pointed out, I can cover up the ugliness with objects. But to me, an industrial strength, robust repository implies sophisticated tree-smart *and* link-smart ad hoc query support. The DOM isn't a query language and doesn't (AFAIK) have a query interface. It might be okay as an API to the results of a query but even there I'm leery...
Since trees can be built as a special case of links, I tend to look for such a beast to come out of the OO world (where links are usually primary) instead of the text processing world (where the tree is usually primary). Maybe you guys at rmit.edu can surprise me though.

But note that a DOM-on-the-bottom is the opposite of the architecture that I am speaking out against. I'm concerned about people who want to layer the DOM on "top" of things that do not look substantially like XML. In that case you are covering up an optimized, purpose-built abstraction with a homogenized "dumb tree" layer. That's a step backwards. Note that even the DOM creators do not view an XML-DOM as a "universal tree API." That's why there are several variants of the DOM -- for XML, HTML, CSS etc.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey

From oren at capella.co.il Mon Feb 8 18:55:20 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:08:52 2004
Subject: Fw: "Clean Specs"
Message-ID: <002101be5384$04a31e20$5402a8c0@oren.capella.co.il>

Tyler Baker <tyler@infinet.com> wrote:

>In effect "Namespaces in XML" makes using the DOM completely useless. In fact, in
>order for the DOM to be made useful in the presence of "Namespaces in XML" you would have to
>make a lot of changes that are not backwards compatible with the Level 1 recommendation.
This
>in practical terms would make using the DOM in an XSL Processor pretty much pointless (all XSL
>Processors I know of other than XT use the DOM as the source tree, and some even use it as the
>stylesheet tree as well). If you mutate the source tree, then everything is hosed.

What's wrong with doing the '^' expansion when building the DOM? Names would become context independent, but still unique, using the current interfaces. Then add a bit of magic to the output module: (i) keep track of 'xmlns' attributes and emit names accordingly, and (ii) either throw an exception or invent a prefix on the fly if you encounter a namespace which wasn't declared yet.

Is re-working the DOM _really_ necessary?

I _really_ wish this whole namespace recommendation was specified this way from the start.

Have fun,

Oren Ben-Kiki

From clark.evans at manhattanproject.com Mon Feb 8 19:10:55 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:52 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net>
Message-ID: <36BF35E7.D6FF0E5E@manhattanproject.com>

Paul Prescod wrote:
> But note that a DOM-on-the-bottom is the opposite of the architecture that
> I am speaking out against. I'm concerned about people who want to layer
> the DOM on "top" of things that do not look substantially like XML. In
> that case you are covering up an optimized, purpose-built abstraction with
> a homogenized "dumb tree" layer. That's a step backwards.
Note that even
> the DOM creators do not view an XML-DOM as a "universal tree API." That's
> why there are several variants of the DOM -- for XML, HTML, CSS etc.

The only time you may want to take this step backwards is if you are providing a generic drill down tool for database navigation...

Once you have the primary-key/oid for the object in question, you would most likely want to switch to a smarter, class specific interface.

:) Clark

From paul at prescod.net Mon Feb 8 19:25:08 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:52 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
References: <199902080247.TAA01750@malatesta.local>
Message-ID: <370CE837.DB8E5EFC@prescod.net>

uche.ogbuji@fourthought.com wrote:
>
> I hope I'm not mis-representing Paul here, but as I've always read him (and
> agreed), his point is that XML, and the various ancillary technologies such as
> DOM and XML Schema, are more appropriate for content-exchange than for core
> business-object modeling.

I agree!

> I don't think it makes sense to build a business-object model on top of DOM,
> but I do think it makes sense to define an exchange protocol that serializes
> objects to XML representations using DOM as a programmatic interface.

I agree. I'll point out, however, that it is REALLY EASY to generate XML directly. In your opinion does the DOM actually make it easier? If you use a "reverse SAX" interface (instead of a DOM-building interface) then you could pipe together data consumers and if any of them ever needed a DOM, it could build it.
> I think it also makes sense to use the DOM to develop a user-interface layer
> for such objects, possibly using the same WDDX or XML-RPC mappings in
> association with a set of style-sheets (although this is just one of many
> possible mechanisms).

Yes, it makes sense to use XML as an "interchange language" between your business objects and your user interface. On the other hand, if that interface is meant to be editable the information loss associated with "dumbing down" to XML may not be acceptable.

Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey

From peter at weblogic.com Mon Feb 8 19:38:54 1999
From: peter at weblogic.com (Peter Seibel)
Date: Mon Jun 7 17:08:52 2004
Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces
In-Reply-To: <A26F84C9D8EDD111A102006097C4CD0D054976@SOHOS002>
Message-ID: <19990208194627350.AAA267@ashbury.weblogic.com@lawton>

At 09:53 AM 2/6/99 , you wrote:
>Bill la Forge wrote:
>> One of the big advantages of Java is that a small shop can
>> tackle significant projects. With clean specs, the same will
>> be true for XML.
>>
>
>Hands up, who has read the Java spec (and that's not the same as reading
>the nice clear instructions given to you by the people who wrote the
>compiler)?

I don't know if that was rhetorical or not, but I have. Language Spec and VM Spec.
And I don't develop compilers or VMs for a living -- I'm just a random Java hacker. FWIW, they are quite readable, with most of the problems of interpretation coming in places that were tacked on as part of the 1.1 release and not part of the spec proper. I'd encourage would-be spec writers to read the language spec as an example of good spec writing. And I'd encourage developers -- "average" or otherwise -- to read the specs of the technologies they use on a daily basis.

-Peter

--
Peter Seibel
Perl/Java/English Hacker
peter@weblogic.com

Is Windows98 Y2K compliant?

From andrewl at microsoft.com Mon Feb 8 19:41:26 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:08:52 2004
Subject: When to use attributes vs. elements
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAEF4E@RED-MSG-08>

Dan Brickley asks several questions in a mail of 1999-02-08 having to do with serializing graphs of data per the "canonical format" recommendations in http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html. Since his mail was lengthy, I have not copied it here.

Let me take another stab at explaining the idea. XML has two principal ways to explicitly express a relationship among elements: containment and idrefs. Idrefs always express a directed, labeled relationship between two elements; they always have this meaning and they never have any other meaning.
If elements all have ids, and the relationships between the elements in a document are all expressed via idrefs, then the document -- per normal XML rules -- corresponds to a graph in which elements match nodes and attributes match edges.

Given this, one can make the suggestion that graphs _should_ be serialized in this way, nodes as elements and edges as idrefs. A reader, knowing no more conventions than the ordinary meaning of idrefs, will observe the correct graph structure.

Of course, XML permits a great deal more flexibility than this. One can, for example, take advantage of contextual knowledge and use containment to imply certain kinds of edges. If one does this, then a naive reader will only observe the explicit edges, and will not be able to reconstruct the implied ones. But -- to answer Dan's second question -- this does not mean that a reader needs to have complete knowledge of the implications of the abbreviations employed. Even a naive reader will decode the graph correctly to whatever extent it is explicit, that is, to whatever extent it uses the conventions advocated in the "canonical format."

The same point stated differently: If an XML instance uses a different set of conventions, a naive reader will find some elements whose relationship is to him unknown. But he will not find relationships that he interprets incorrectly. This is the main point of the paper.

The paper addresses another point, and perhaps this has led Dan to some confusion. The paper notes that many XML documents will reflect graphs that could have been rendered into the canonical format but were not, even though there is a deterministic mapping from the document's syntax to canonical syntax. It goes on to note that such mapping could work well in practice, and we have a range of options for implementing it, from simple declarations in schema, to architectural forms, to XSL.
But the main point of the paper was to observe that the facilities needed to express graphs already exist in XML if they are used properly.

From paul at prescod.net Mon Feb 8 20:01:00 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:52 2004
Subject: DOM API
References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <36BF35E7.D6FF0E5E@manhattanproject.com>
Message-ID: <370CF891.21C0BDE9@prescod.net>

Clark Evans wrote:
>
> The only time you may want to take this step backwards is if you are
> providing a generic drill down tool for database navigation...
>
> Once you have the primary-key/oid for the object in
> question, you would most likely want to switch to a
> smarter, class specific interface.

Note that the object oriented model already provides a feature that allows you to treat data "generically" or "specifically". Subtyping (or interface inheritance)! So what we really need is a "tree node" base class or interface.

That's also the answer to Simon's question about what should replace the DOM as a generalized interface for trees. All we need is a simple "tree node" interface. The grove model supports that. In fact I would argue that that is probably the most important thing that the grove model DOES support.

The DOM "node class" also supports this. If you use that class along with NodeList and NamedNodeMap and ignore the rest of the DOM then you can get all of the benefit that you are going to get out of the DOM API as an abstraction. There is no need to dumb down your data to elements and attributes.
Just use nodes.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels."
--Faith Whittlesey

From paul at prescod.net Mon Feb 8 20:03:06 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:08:52 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
References: <001501be532c$ffeed060$d3228018@jabr.ne.mediaone.net>
Message-ID: <370CF210.759B26AB@prescod.net>

"Borden, Jonathan" wrote:
>
> > 3. Therefore we should pretend that relational databases are really DOM
> > trees.
>
> no. if the data is tabular then use a recordset. in the specific cases when
> 1) we are storing data which is naturally hierarchical. 2) when the data
> needs to interface with systems which for other reasons employ DOM
> interfaces

Okay. We can probably all agree with this. If you have software that is expecting a DOM and you need to connect it to data that is not XML, you need to build a DOM interface. This is a different point of view from those who say: "let's build new client software using only the DOM served by data with only a DOM interface. The fact that the DOM is standardized will just make all of my interoperability problems go away." No way. If your client software and your server software had an impedance mismatch, slapping a DOM interface on both sides makes it *worse* not better.

> e.g.
my XSL processor us built on a DOM interface and I wish to > query the database using XQL (which happens to be built into my XSL > processor in this example), it is more convenient to interface to the data > using DOM interfaces than it is using recordsets (i.e. tabular data). It's more convenient but it's probably going to run as slow as hell. Nobody implements SQL or OQL on top of an industry-standard interface. They put it right in the core engine of their database. > Arguably, when using an ODBMS this example would be more straightforward > (but you picked RDBMS). The problem is that there is no standard, language > independent interface onto ODBMS's. ********** Yes there is! ************* It isn't as widely hyped as XML/DOM. I haven't written a book about it (and hardly has anyone else). But the standards *do* exist. Check http://www.odmg.org. There are well defined APIs, bindings in a few languages, a solid object model and a query language. It's all in there. My fear it that these technologies will get lost in the XML hype. > The DOM, while not the perfect interface > *is* standard, and this is the big utility. The DOM is a standard for accessing XML, HTML and CSS information. It isn't for modelling arbitrary business objects. It wasn't designed for that and it isn't good at that. > For example, I get to say (using 'extended DOM'): > > NodeList anotherSet = airplanes.selectNodes("airplane[@color='red' and > .//screw/thread/@pitch = 64]"); > > to select all red airplanes with screws having a pitch=64... The DOM is doing essentially nothing here. This imaginery XML query language is doing all of the work. But even the XML query language is going to make solving your problem harder than OQL would. For instance OQL can be statically type checked. XQL cannot, in general, for many subtle reasons. OQL can handle mathematical range constraints. OQL has a concept of a "stored query" that allows some level of abstraction. 
OQL has "local variables" also for abstraction.

I don't completely follow your examples:

> XMOP for example (http://jabr.ne.mediaone.net/documents/xmop.htm) is a way
> to serialize arbitrary COM objects using their typeinfo metadata. XMOP is a
> layer that can persist objects into either a) a stream (serialization) b)
> direct-to-DOM. When I attempted to design a direct-to-Recordset persistence
> interface on XMOP I found that I had to essentially develop a
> DOM<->Relational mapping. This is because arbitrary objects can be modelled
> in a hierarchical fashion (e.g. serialized to XML).

This seems like a serialization problem. We all agree that XML is great for serialization. If your only goal was to get the data into a "database of some kind" then an OO database would have been easier than an XML database.

> In another example, using the medical imaging DICOM protocol (a complex
> property based protocol) I have developed a mapping to the Microsoft
> PropertySet format (used with Index Server). This mapping is not clean (at
> all given the inability to represent certain DICOM structures as
> PROPVARIANTs). This causes similar problems in mapping the protocol to a
> relational database (the workaround is to use binary data). Using XML and
> the DOM was a piece of cake to solve this difficult problem.

I'm not at all clear on how the DOM solved this impedance mismatch.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"Remember, Ginger Rogers did everything that Fred Astaire did,
but she did it backwards and in high heels."
--Faith Whittlesey

From tyler at infinet.com Mon Feb 8 20:30:38 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:52 2004
Subject: Fw: "Clean Specs"
References: <002101be5384$04a31e20$5402a8c0@oren.capella.co.il>
Message-ID: <36BF4917.7C9B6B35@infinet.com>

Oren Ben-Kiki wrote:
> Tyler Baker <tyler@infinet.com> wrote:
> >In effect "Namespaces in XML" makes using the DOM completely useless. In fact, in
> >order for the DOM to be made useful in the presence of "Namespaces in XML" you would have to
> >make a lot of changes that are not backwards compatible with the Level 1 recommendation. This
> >in practical terms would make using the DOM in an XSL Processor pretty much pointless (all XSL
> >Processors I know of other than XT use the DOM as the source tree, and some even use it as the
> >stylesheet tree as well). If you mutate the source tree, then everything is hosed.
>
> What's wrong with doing the '^' expansion when building the DOM? Names would
> become context independent, but still unique, using the current interfaces.

What I am saying here is that if you make a call to:

    Document.createElement(String name);

What are you to do here? Do you throw in an expanded name, or do you throw in a QName which should be expanded by the DOM?
These are not clarified: For example, if I wanted to create an xsl:text element for a stylesheet, would I do something like:

    String prefix = "xsl";
    String localPart = "text";
    Document.createElement(prefix + ":" + localPart);

or else would I do something like:

    String namespace = "http://www.w3.org/TR/WD-xsl";
    String localPart = "text";
    Document.createElement(namespace + ":" + localPart);

Of course the forward slash character is not a valid name character so you are pretty much screwed as far as this is concerned. You would need to change the Document interface to have createElement() be of the form:

    Document.createElement(String prefix, String namespace, String localPart);

The prefix would be there for backwards compatibility. Namespaces are given as attribute values and can in essence be any character string you want. This is not the case with the Name production in the XML 1.0 spec.

> Then a bit of magic to the output module: (i) keep track of 'xmlns'
> attributes, and emit names accordingly and (ii) either throw an exception or
> invent a prefix on the fly if you encounter a namespace which wasn't
> declared yet. Is re-working the DOM _really_ necessary?

Reading in XML and writing out XML is simple enough from the XML library developer's perspective. Dealing with namespaces at the application level is a totally different ballgame. The fact that some of the more relevant people here cannot see the obvious complexities of "Namespaces in XML" when dealing with it at the application level leads me to believe they either don't give a hoot about end-users using XML, or that they are out of touch with the great majority of the developer community and web users in general.

> I _really_ wish this whole namespace recommendation was specified this way
> from the start.
The namespaces recommendation is relatively simple to understand from my perspective as an XML tools developer (well not that simple as I had to reread the August draft about 10 times to digest it all), however it opens many new cans of worms for all sorts of applications that wish to use and support XML. It is that middle area between when XML is parsed into some useful data structure and when it is reserialized back to XML (i.e. the application) that really concerns me here.

The way I see it, the current recommendation does not even truly solve the unique naming problem in XML content, but instead just adds an extra layer of indirection and fragmentation to XML. In the end, all this ever does is make the task of using XML in web and e-commerce applications much more difficult in the long run.

Tyler

From tbray at textuality.com Mon Feb 8 20:42:10 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:52 2004
Subject: Fw: "Clean Specs"
Message-ID: <3.0.32.19990208124157.00c0bd60@pop.intergate.bc.ca>

At 03:29 PM 2/8/99 -0500, Tyler Baker wrote:
>What I am saying here is that if you make a call to:
>
>Document.createElement(String name);
>
>What are you to do here? Do you throw in an expanded name, or do you throw in a QName which
>should be expanded by the DOM? These are not clarified:

The current rev of the DOM spec has no support for namespaces. Tyler Baker is correct in his statements that until it does, wrangling namespace-containing documents via the DOM is going to be a pain in the butt.
I kind of suspect the DOM folks are hearing about this from several directions. -Tim

From oren at capella.co.il Mon Feb 8 20:46:06 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:08:52 2004
Subject: Fw: Fw: "Clean Specs"
Message-ID: <007d01be53a3$494bed80$5402a8c0@oren.capella.co.il>

Tyler Baker <tyler@infinet.com> wrote:
>What I am saying here is that if you make a call to:
>
>Document.createElement(String name);
>
>What are you to do here? Do you throw in an expanded name,

That's what I had in mind.

>or do you throw in a QName which
>should be expanded by the DOM?

Not using the current API.

>or else would I do something like:
>
>String namespace = "http://www.w3.org/TR/WD-xsl";
>String localPart = "text";
>Document.createElement(namespace + ":" + localPart);

Using '^' instead of ':', that's exactly what I had in mind.

>Of course the forward slash character is not a valid name character so you are pretty much
>screwed as far as this is concerned.

If the only change we'd be requiring in the DOM is allowing '/' in names before a '^', then we are in pretty good shape :-)

>You would need to change the Document interface to have
>createElement() be of the form:
>
>Document.createElement(String prefix, String namespace, String localPart);

I wouldn't mind such a change - or other extensions to the API to _better_ support namespaces. I'm just not convinced that you couldn't _make do_ with the current API in the meantime (barring a small relaxation in what is valid in a name).

>The prefix would be there for backwards compatibility.
Just don't delete the 'xmlns' attributes when extending the names. The output XML writer would be able to use them as a guide to how to generate the output XML. Future API calls may use these attributes to provide more convenient namespace specific functionality. So?

>> I _really_ wish this whole namespace recommendation was specified this way
>> from the start.

To clarify: If it was specified as being equivalent to a purely textual transformation, then only very slight modifications or none at all would have been required for current APIs and standards. Yes, XSL would need namespace matching patterns. Yes, _in-memory_ names could include anything until a '^' character, and only from then would be limited by XML 1.0 rules. These seem relatively minor changes. Otherwise, things would just go on working.

Why make this more complex than it has to be?

Share & Enjoy,

Oren Ben-Kiki

From rschoening at unforgettable.com Mon Feb 8 20:49:34 1999
From: rschoening at unforgettable.com (Rob Schoening)
Date: Mon Jun 7 17:08:52 2004
Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces
In-Reply-To: <19990208194627350.AAA267@ashbury.weblogic.com@lawton>
References: <19990208194627350.AAA267@ashbury.weblogic.com@lawton>
Message-ID: <0003436060bae2e9_mailit@mail.ptld.uswest.net>

>>Bill la Forge wrote:
>>> One of the big advantages of Java is that a small shop can
>>> tackle significant projects. With clean specs, the same will
>>> be true for XML.
>>>
>>
>>Hands up, who has read the Java spec (and that's not the same as reading
>>the nice clear instructions given to you by the people who wrote the
>>compiler)?

This spec debate is missing the point. Languages like C and Java have clean specs because the language design is also clean. The fact that their specs are clean follows from the fact that the people who wrote the clean language will write a clean spec!

XML's problems do not turn on its specs. The central problem is that the question "What is XML?" has no definite answer. Between XML proper, Namespaces, DOM, SAX, XSL, et al, it seems pretty obvious to me that there is nothing unifying this chaos.

Java has its core classes. C has the C library. C++ has the C library and STL. Unix has unix tools! Perl has its standard modules.

XML has nothing of the kind. I hate to sound pessimistic, but if things are left to evolve this way, XML is going to become just another open-standard *file format*.

XML desperately needs an XDK. Moreover, the XDK's component parts need to be developed as if they were part of a whole.

Rob

From ti64877 at imcnam.sbi.com Mon Feb 8 20:58:20 1999
From: ti64877 at imcnam.sbi.com (Ingargiola, Tito)
Date: Mon Jun 7 17:08:52 2004
Subject: XML-RPC & HTTP-NG (was A critique of XML-RPC)
Message-ID: <3994C79D0211D211A99F00805FE6DEE249BF84@exchny15.corp.smb.com>

Hi,

XML-RPC sounds nice to me: simple, light-weight and immediately applicable to a good set of problems.
I looked over some of the W3C's pages on HTTP-NG, however, and it looks like there's a good deal of overlapping functionality between these two proposed standards. Why is this, and what's the intended relation between the two (if any)? Are you investigating HTTP-NG support in Frontier?

Thanks for any insights.

Regards,

Tito.

From cowan at locke.ccil.org Mon Feb 8 21:02:01 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:08:52 2004
Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces
In-Reply-To: <0003436060bae2e9_mailit@mail.ptld.uswest.net> from "Rob Schoening" at Feb 8, 99 12:35:48 pm
Message-ID: <199902082152.QAA23565@locke.ccil.org>

Rob Schoening scripsit:
> XML has nothing of the kind. I hate to sound pessimistic, but if things
> are left to evolve this way, XML is going to become just another
> open-standard *file format*.

Yes. That's what it is! That's *exactly* what XML is!

--
John Cowan cowan@ccil.org
e'osai ko sarji la lojban.

From tbray at textuality.com Mon Feb 8 21:11:05 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:53 2004
Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces
Message-ID: <3.0.32.19990208131027.00c0d7c0@pop.intergate.bc.ca>

At 12:35 PM 2/8/99 -0800, Rob Schoening wrote:
> XML has nothing of the kind. I hate to sound pessimistic, but if things
>are left to evolve this way, XML is going to become just another
>open-standard *file format*.

Uh, and your point is? -T.

From clark.evans at manhattanproject.com Mon Feb 8 21:13:48 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:53 2004
Subject: XDK
References: <19990208194627350.AAA267@ashbury.weblogic.com@lawton> <0003436060bae2e9_mailit@mail.ptld.uswest.net>
Message-ID: <36BF522D.599E009D@manhattanproject.com>

Rob Schoening wrote:
>
> XML desperately needs an XDK. Moreover, the XDK's component parts
> need to be developed as if they were part of a whole.

This would be _wonderful_. For example, SAX and DOM should be designed/written so that they are complementary.
So that:

I could do:
     SAX -> DOM -> SAX'  Where Events(SAX') = Events(SAX)
and  DOM -> SAX -> DOM'  Where State(DOM') = State(DOM)

:) Clark

From clark.evans at manhattanproject.com Mon Feb 8 21:18:04 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:53 2004
Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces
References: <199902082152.QAA23565@locke.ccil.org>
Message-ID: <36BF5368.D709DEE3@manhattanproject.com>

John Cowan wrote:
> Rob Schoening scripsit:
> > XML has nothing of the kind. I hate to sound pessimistic, but if things
> > are left to evolve this way, XML is going to become just another
> > open-standard *file format*.
>
> Yes. That's what it is! That's *exactly* what XML is!
>

I agree. It's a hierarchical file syntax. It's much more flexible than CSV, INI, DB3, etc. However, it would be nice to have a standard API for dealing with streams using this syntax. *smile*

:) Clark

From tyler at infinet.com Mon Feb 8 21:25:02 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:53 2004
Subject: Fw: Fw: "Clean Specs"
References: <007d01be53a3$494bed80$5402a8c0@oren.capella.co.il>
Message-ID: <36BF55D7.3C196F2D@infinet.com>

Oren Ben-Kiki wrote:
> Tyler Baker <tyler@infinet.com> wrote:
> >or else would I do something like:
> >
> >String namespace = "http://www.w3.org/TR/WD-xsl";
> >String localPart = "text";
> >Document.createElement(namespace + ":" + localPart);
>
> Using '^' instead of ':', that's exactly what I had in mind.

Fair enough (but not compatible with XML 1.0).

> >Of course the forward slash character is not a valid name character so you are pretty much
> >screwed as far as this is concerned.
>
> If the only change we'd be requiring in the DOM is allowing '/' in names
> before a '^', then we are in pretty good shape :-)

This is only one of many characters. A namespace can be anything you want it to be. It could be just about any sequence of unicode characters you can think of. You would have to restrict a namespace to be only valid NCName characters.

> >You would need to change the Document interface to have
> >createElement() be of the form:
> >
> >Document.createElement(String prefix, String namespace, String localPart);
>
> I wouldn't mind such a change - or other extensions to the API to _better_
> support namespaces. I'm just not convinced that you couldn't _make do_ with
> the current API in the meantime (barring a small relaxation in what is valid
> in a name).

The example above explains the need.
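To make the three-part name in this exchange concrete, here is a small sketch. It is not part of the DOM or of any proposal on the list: ExpandedName and its methods are hypothetical names, and the '^' separator is simply the one suggested earlier in the thread. It shows why the prefix is only a serialization detail while the namespace and local part together identify the name:

```java
import java.util.Objects;

/** Hypothetical holder for the three parts of a namespaced name. */
final class ExpandedName {
    final String prefix;        // empty string: no prefix when serialized
    final String namespaceUri;  // empty string: the name is in no namespace
    final String localPart;

    ExpandedName(String prefix, String namespaceUri, String localPart) {
        this.prefix = Objects.requireNonNull(prefix);
        this.namespaceUri = Objects.requireNonNull(namespaceUri);
        this.localPart = Objects.requireNonNull(localPart);
    }

    /** Serialized form such as "xsl:text"; meaningful only where the prefix is declared. */
    String qName() {
        return prefix.isEmpty() ? localPart : prefix + ":" + localPart;
    }

    /** Context-independent form using the '^' separator suggested in the thread. */
    String expanded() {
        return namespaceUri.isEmpty() ? localPart : namespaceUri + "^" + localPart;
    }

    /** Names match on namespace and local part; the prefix is ignored. */
    boolean sameName(ExpandedName other) {
        return namespaceUri.equals(other.namespaceUri) && localPart.equals(other.localPart);
    }

    public static void main(String[] args) {
        ExpandedName a = new ExpandedName("xsl", "http://www.w3.org/TR/WD-xsl", "text");
        ExpandedName b = new ExpandedName("x", "http://www.w3.org/TR/WD-xsl", "text");
        System.out.println(a.qName());     // xsl:text
        System.out.println(a.expanded());  // http://www.w3.org/TR/WD-xsl^text
        System.out.println(a.sameName(b)); // true
    }
}
```

Whether such a value is passed to a factory method as three strings or folded into one expanded string is exactly the point under dispute; the sketch takes no side.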
> >The prefix would be there for backwards compatibility.
>
> Just don't delete the 'xmlns' attributes when extending the names. The
> output XML writer would be able to use them as a guide to how to generate the
> output XML. Future API calls may use these attributes to provide more
> convenient namespace specific functionality. So?

xmlns: attributes are inherited. When you copy and clone nodes all over the place (one application of XSL I know of does this when constructing the source tree programmatically) you totally lose track of all of this stuff. Things in effect become unmanageable.

> >> I _really_ wish this whole namespace recommendation was specified this way
> >> from the start.
>
> To clarify: If it was specified as being equivalent to a purely textual
> transformation, then only very slight modifications or none at all would
> have been required for current APIs and standards. Yes, XSL would need
> namespace matching patterns. Yes, _in-memory_ names could include anything
> until a '^' character, and only from then would be limited by XML 1.0 rules.
> These seem relatively minor changes. Otherwise, things would just go on
> working.
>
> Why make this more complex than it has to be?

I agree, however "Namespaces in XML" introduces so many problems that avoiding complexity at the application level is practically impossible as things currently stand. This is very unfortunate.

Tyler

From tyler at infinet.com Mon Feb 8 21:28:56 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:53 2004
Subject: Fw: "Clean Specs"
References: <3.0.32.19990208124157.00c0bd60@pop.intergate.bc.ca>
Message-ID: <36BF5664.C0100DC5@infinet.com>

Tim Bray wrote:
> At 03:29 PM 2/8/99 -0500, Tyler Baker wrote:
> >What I am saying here is that if you make a call to:
> >
> >Document.createElement(String name);
> >
> >What are you to do here? Do you throw in an expanded name, or do you throw in a QName which
> >should be expanded by the DOM? These are not clarified:
>
> The current rev of the DOM spec has no support for namespaces.
> Tyler Baker is correct in his statements that until it does,
> wrangling namespace-containing documents via the DOM is going
> to be a pain in the butt.
>
> I kind of suspect the DOM folks are hearing about this from
> several directions. -Tim

Can anyone within the W3C verify whether there are any plans for taking DOM Level 1 past the current recommendation? DOM Level 1 did not exactly have a 1.0 status as far as I remember.

Tyler

From davec at progress.com Mon Feb 8 21:32:22 1999
From: davec at progress.com (David E.
Cleary)
Date: Mon Jun 7 17:08:53 2004
Subject: XML Parser with DOM in C
Message-ID: <000301be53a9$9ca7e5a0$426712ac@cleary400.bedford.progress.com>

I'm looking for a C based XML parser with the DOM API to license. Doesn't have to be free or open source. Or is anybody working on putting the DOM on top of expat besides Mozilla?

David Cleary

From donpark at quake.net Mon Feb 8 21:35:09 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:08:53 2004
Subject: XDK
Message-ID: <004801be53aa$323bfb10$2ee044c6@arcot-main>

>I could do:
>     SAX -> DOM -> SAX'  Where Events(SAX') = Events(SAX)
>and  DOM -> SAX -> DOM'  Where State(DOM') = State(DOM)

Actually most of this can already be done. Docuverse DOM SDK comes with both SAXReader and SAXWriter where SAXReader handles SAX -> DOM and SAXWriter handles DOM -> SAX. I believe IBM also has something similar in their XML utility kit.

Best,

Don Park
Docuverse

From jes at kuantech.com Mon Feb 8 21:35:56 1999
From: jes at kuantech.com (Jeffrey E.
Sussna)
Date: Mon Jun 7 17:08:53 2004
Subject: XDK
In-Reply-To: <36BF522D.599E009D@manhattanproject.com>
Message-ID: <000701be53aa$2c2dd720$5118a8c0@kuantech1.quokka.com>

Hear, hear! I started a similar thread a while ago. Include the whole family in the mix (XML, namespaces, schemas, RDF, SAX, DOM, ad nauseam), and you've got something. I think the XML universe really points more to a "programming language"/SDK model than a "markup" model.

Jeff

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Clark Evans
Sent: Monday, February 08, 1999 1:08 PM
To: Rob Schoening
Cc: xml-dev@ic.ac.uk
Subject: XDK

Rob Schoening wrote:
>
> XML desperately needs an XDK. Moreover, the XDK's component parts
> need to be developed as if they were part of a whole.

This would be _wonderful_. For example, SAX and DOM should be designed/written so that they are complementary. So that:

I could do:
     SAX -> DOM -> SAX'  Where Events(SAX') = Events(SAX)
and  DOM -> SAX -> DOM'  Where State(DOM') = State(DOM)

:) Clark

From Mark.Birbeck at iedigital.net Mon Feb 8 21:40:13 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:08:53 2004
Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054990@SOHOS002>

Rob Schoening wrote:
> > XML has nothing of the kind. I hate to sound pessimistic, but if things
> >are left to evolve this way, XML is going to become just another
> >open-standard *file format*.

Hey, great! Is someone going to invent an open-standard file format? About time too.

(Seriously, I'm looking for the pessimistic bit.)

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From spreitze at parc.xerox.com Mon Feb 8 21:45:56 1999
From: spreitze at parc.xerox.com (spreitze@parc.xerox.com)
Date: Mon Jun 7 17:08:53 2004
Subject: A New Hope (was Re: Storing Lots of Fiddly Bits (was Re: What is XML for?))
In-Reply-To: "spreitze's message of Thu, 4 Feb 1999 00:32:43 PST"
Message-ID: <99Feb8.134316pst."834439"@idea.parc.xerox.com>

Aw heck, I made an important typo in my message: I wrote "XML's entity structure" where I meant to refer to XML's *element* structure.

In short, I'm advocating making it possible to write XML schemas in a (logically, probably not physically) factored way: one part expresses a schema in terms of a fully labelled graph, and the other part expresses a particular mapping of the fully labelled graph data model into XML instances.

I brought this up here because the discussion of storing XML in databases touched on the topic of the difference in data models between databases and XML instances. If XML schemas were factored as I suggest, they could describe data in a data model (fully labelled graphs) that is much closer, if not identical, to that of many general-purpose databases (as well as the way many people --- though demonstrably not all --- prefer to think). Such a factored schema, augmented by a small mapping-to-DB's-datamodel part if necessary, could be used to facilitate automatic and efficient translation between (1) XML instances and (2) application data in general-purpose databases.

From oren at capella.co.il Mon Feb 8 21:49:50 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:08:53 2004
Subject: Fw: "Clean Specs"
Message-ID: <00b201be53ab$e5a150a0$5402a8c0@oren.capella.co.il>

Tyler Baker <tyler@infinet.com> wrote:
>Of course the forward slash character is not a valid name character so you
>are pretty much screwed as far as this is concerned....
...
>This is only one of many characters. A namespace can be anything you want it to be. It could
>be just about any sequence of unicode characters you can think of. You would have to restrict
>a namespace to be only valid NCName characters.

Sigh. OK, let's get formal: A valid extended name would be one of:

- A valid XML 1.0 name without any ':' character; or
- A prefix containing any character at all except for '^', followed by a
  single '^', followed by a valid XML 1.0 name without any ':' characters.

>> >You would need to change the Document interface to have
>> >createElement() be of the form:
>> >
>> >Document.createElement(String prefix, String namespace, String localPart);
>>
>> I wouldn't mind such a change - or other extensions to the API to _better_
>> support namespaces. I'm just not convinced that you couldn't _make do_ with
>> the current API in the meantime (barring a small relaxation in what is valid
>> in a name).
>
>Above example explains the need.

Sorry, I still don't see why.

>> >The prefix would be there for backwards compatibility.
>>
>> Just don't delete the 'xmlns' attributes when extending the names. The
>> output XML writer would be able to use them as a guide to how to generate the
>> output XML.
Future API calls may use these attributes to provide more >> convenient namespace specific functionality. So? > >xmlns: attributes are inherited. When you copy and clone nodes all over the place (one >application of XSL I know of does this when constructing the source tree programmatically) you >totally lose track of all of this stuff. Things in effect become unmanageable. Sorry? Since all the in-memory names are extended, you can cut and paste to your heart's content without affecting validity. Prefixes only become interesting when you parse/emit the nodes. The output module would use 'xmlns' attributes left over from the input or added during the mutations as a guide to which prefixes to use, and, failing that, would invent some prefixes of its own. This way, if the prefixes in the output matter a lot to you, just add 'xmlns' attributes where needed. Again, future APIs could help attach these 'xmlns' attributes where appropriate. But existing non-namespace-aware programs _will go on working_. How is that unmanageable? >> Why make this more complex then it has to be? > >I agree, however the "Namespaces in XML" introduces so many problems that avoiding complexity >at the application level is practically unavoidable as things currently stand. This is very >unfortunate. I guess we'll just have to agree to disagree on this one - unless you can come up with a concrete example. Share & Enjoy, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Feb 8 22:07:53 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:53 2004 Subject: URL: 19 Short Questions about Namespaces (with Answers) Message-ID: <14015.24464.658458.854635@localhost.localdomain> In response to several private requests, I've recast my posting into a simple HTML page at the following location: http://www.megginson.com/docs/namespace-questions.html All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Feb 8 22:12:00 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:08:53 2004 Subject: Fw: "Clean Specs" In-Reply-To: <00b201be53ab$e5a150a0$5402a8c0@oren.capella.co.il> from "Oren Ben-Kiki" at Feb 8, 99 11:42:16 pm Message-ID: <199902082302.SAA26795@locke.ccil.org> Oren Ben-Kiki scripsit: > - A prefix containing any character at all except for '^', followed by a > single '^', followed by a valid XML 1.0 name without any ':' characters. Actually the prefix has to be a valid URI, which means it can contain only [A-Za-z0-9$_.+!*'(),;/?:@&=-] plus hex escapes "%[0-9a-fA-F][0-9a-fA-F]". -- John Cowan cowan@ccil.org e'osai ko sarji la lojban. 
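[Illustration] The extended-name rule proposed in this thread, combined with the URI character set listed in the reply above, could be checked roughly as follows. This is only a sketch: the function name split_extended_name is hypothetical, and the local-name pattern is an ASCII-only simplification of an XML 1.0 name without ':' (real XML names allow many more characters).

```python
import re

# Prefix characters as listed in the reply above, plus %XX hex escapes.
URI_PREFIX = re.compile(r"(?:[A-Za-z0-9$_.+!*'(),;/?:@&=-]|%[0-9a-fA-F]{2})+")

# ASSUMPTION: ASCII-only stand-in for "a valid XML 1.0 name without ':'".
LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def split_extended_name(name):
    """Return (prefix, local) for a valid extended name.

    prefix is None for a plain name with no '^'.
    Raises ValueError if the name does not fit either form.
    """
    prefix, caret, local = name.partition("^")
    if not caret:
        # Form 1: a plain name without ':' and without a prefix.
        if LOCAL_NAME.fullmatch(name) is None:
            raise ValueError("not a valid plain name: %r" % name)
        return None, name
    # Form 2: prefix, a single '^', then a local name without ':'.
    if URI_PREFIX.fullmatch(prefix) is None:
        raise ValueError("prefix is not built from URI characters: %r" % prefix)
    if LOCAL_NAME.fullmatch(local) is None:
        raise ValueError("not a valid local name: %r" % local)
    return prefix, local
```

Because '^' is in neither character set, a second '^' anywhere in the name is rejected by the local-name check, which matches the "single '^'" requirement.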
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jabuss at cessna.textron.com Mon Feb 8 22:36:11 1999 From: jabuss at cessna.textron.com (Buss, Jason A) Date: Mon Jun 7 17:08:53 2004 Subject: "Clean Specs" Message-ID: <F7E1775C1C27D211881F00A024B2853046A01E@CESS01AMX03> >Perhaps in this modern world, some of the rather large fees charged >by W3C for membership could go towards hiring some technical writers >to address this issue. IMNSHO, the amount of time that we've all >spent thrashing about with namespaces is an example of intelligence, >time and energy that could have been avoided by a standard that >addressed some of the issues better. >If standards are the way we'll do business (and I'm all for that!) >then why not invest in the best possible standards up front? Just >because IETF and other traditions made do without, doesn't mean that >we should be penny wise and pound foolish now. Clarity is a net gain >for W3C members, and for the larger community, as the cost of >incompatible implementations is significant. Avi At 5:40 PM -0600 2/7/99, W. Eliot Kimber wrote: > The XML WG was an all-volunteer project, as are most standards efforts. > Those of us who participated did so primarily as a personal commitment, not > as something our employers (those of us who have them) pay us to do. > > Standards development is not a commercial process--there is no budget from > which technical writers might be hired. The W3C only administers, it does > not fund. Same for ISO. 
Some national bodies do fund some standards > development (BSI, the British Standards Institute), but that funding will > tend to be used to support the technologists developing the standard and > not writers crafting the words. > > So while it's true that most, if not all, specifications could benefit from > professional writers, it usually isn't an option for standards developers > > Well, has anyone considered employing real, professional technical > > authors to write the specifications? > >As chair of the DOM WG, I (and I think the editors of the specs) >would be overjoyed were someone to volunteer the services of a >real, professional technical author who could help in the process of >getting good specs out the door. However, as has been pointed out >by others on this list, this support is difficult to find, as W3C >seldom has these resources available. Maybe it is time for some of us who have been "put off" by the way the Namespaces recommendation reads to offer our services, under the auspices of the WG for XML and XML related standards, to go through and annotate the drafts and recommendations as they come up for the vote. I didn't have trouble with the XML recommendation or the XSL working draft. The DOM took me a couple of reads, and I have read the namespaces recommendation 3 times and still have some questions, but I am looking here and other places to find the answers before I climb up in here and get all surly with the spec writers. I know there are a number of people who have read the spec and are upset with the concept of namespaces. I am still trying to grasp parts of it myself. But I think a lot of this is because I am a technical writer by trade. I prepare documents for the end-user. I am conditioned to write things from the perspective of the person actually utilizing the documents; I still wince at typos. If I hadn't had the background in SGML that I have, I would have been lucky to get past the XML spec itself. 
IMHO, if the working groups would like to see the services of technical writers utilized, they should probably just come forward and ask. I imagine through the W3C site or something. I think I have seen postings from Paul saying he had been working on annotated versions of the recommendations. If tech writers would like to see this, and it appears that the WGs would appreciate the help, I don't see why efforts couldn't be made towards this. I know I would probably take up the opportunity to do such work, even if it is on a voluntary basis. Even if some don't have the time, surely someone would have even a small amount of time to analyze and make some notes, so if someone becomes available, they could come in with something to start from. Even if it took a series of writers throughout the development process, the outcome would likely justify the effort. Any suggestions? Comments? Jason A. Buss Single Engine Technical Publications Cessna Aircraft Co. jabuss@cessna.textron.com "I don't have your solution, but I do admire your problem..." From rschoening at unforgettable.com Mon Feb 8 22:44:39 1999 From: rschoening at unforgettable.com (Rob Schoening) Date: Mon Jun 7 17:08:53 2004 Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces In-Reply-To: <3.0.32.19990208131027.00c0d7c0@pop.intergate.bc.ca> References: <3.0.32.19990208131027.00c0d7c0@pop.intergate.bc.ca> Message-ID: <00034361ff55743d_mailit@mail.ptld.uswest.net> >At 12:35 PM 2/8/99 -0800, Rob Schoening wrote: >> XML has nothing of the kind. 
I hate to sound pessimistic, but if things >>are left to evolve this way, XML is going to become just another >>open-standard *file format*. > >Uh, and your point is? -T. My point is that we have no name for XML-related technologies other than...XML. This is a misnomer, and I think it's creating problems. Perhaps I have gotten ahead of myself, but I thought the cornerstone of XML was the structured document, not its representation. I thought that there was an implicit distinction being made between content and structure. The English language sentence "Grass is green" is an expression of the proposition that grass is green. Similarly, I thought that: <plant> Grass </plant> is <color> green </color> was simply an XML/English sentence expressing not only the above proposition, but also some explicit markup relations, namely that "Grass" is in a plant context and "green" is in a color context. So, while XML (the language) may be just a file format, XML (in the colloquial sense) is much more! Unfortunately it seems that the W3C is focusing on the XML language, not XML in general. The consequence of this is that DOM ends up with a lesser status than XML. A DOM tree is the in-memory representation of the structured XML *content*, not the XML document itself. This distinction is critical! I fear that the XML language has become more important than the data that it represents. XML technologies have the potential to allow us to have a common representation of data on-disk, in-memory and across-the-wire. This is really powerful. But until we get unstuck from the linguistic details of the first, we will have nothing more than yet another file format. My proposal is that we invent a name for all these related technologies so that the relations between them will be clearer. Java is a good example. Sun is pretty clear about how the language, VM, specs, and APIs fit under the Java(tm) label. 
Bill Gates can acknowledge whichever portions he chooses, but Sun has made the official picture quite clear. Until the relationships between all these XML technologies are laid out, I'm afraid that the focus on the XML language as the centerpiece will skew the whole effort and limit its potential. Rob From tbray at textuality.com Mon Feb 8 22:58:15 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:53 2004 Subject: URL: 19 Short Questions about Namespaces (with Answers) Message-ID: <3.0.32.19990208145800.00b99390@pop.intergate.bc.ca> At 05:06 PM 2/8/99 -0500, David Megginson wrote: >In response to several private requests, I've recast my posting into a >simple HTML page at the following location: David, this is a terrific piece of work. One of your answers makes me nervous: [Q] What is the name of the third attribute in the example above? [A] The name is xmlns:z from the XML 1.0 perspective; from the Namespaces perspective, this attribute is a declaration. Somehow the part beginning "... from the Namespaces perspective" feels a bit like the question is being dodged: I'm spinning my wheels a bit in trying to suggest something better. The point that's not being made is that in a namespace-aware system, the xmlns:z attribute is possibly just not there. For example (I just checked) if you use expat and enable namespace processing, you just don't see the xmlns attributes - I don't think this is compulsory, but it also isn't surprising. I'll try to think this over some more, but somehow [Q] What's its name? [A] It's a declaration. 
doesn't quite feel satisfying. -Tim From mrc at allette.com.au Mon Feb 8 23:13:36 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:08:54 2004 Subject: "Clean Specs" References: <36BDAF1D.A908F8EF@prescod.net> <199902081614.IAA02859@sqwest.bc.ca> Message-ID: <36BF6F7C.34F197CA@allette.com.au> Lauren Wood wrote: > On 8 Feb 99, at 9:14, James Robertson wrote: > > > Well, has anyone considered employing real, professional technical > > authors to write the specifications? > > As chair of the DOM WG, I (and I think the editors of the specs) > would be overjoyed were someone to volunteer the services of a > real, professional technical author who could help in the process of > getting good specs out the door. However, as has been pointed out > by others on this list, this support is difficult to find, as W3C > seldom has these resources available. Should we assume from this that TBL's day ends with emptying the rubbish bins and vacuuming the office? James' suggestion is right on the money - this shouldn't be considered to be a luxury, it should be an integral part of the process. To me, this whole debate is taking on the feel of a programmer being thrown to the mercy of the client while the project manager stands by watching. Even if Tim Bray did have doubts about the namespace recommendation (and I have no reason to believe that this is the case), he (like anyone else) feels that he has no option but to defend what we consider to be at least partly "his work". 
We shouldn't be snarling at Tim and he shouldn't be snarling back - the whole process should be elevated to discussion between the wider community (the client) and the W3C. Obviously Tim's input would be valued (as would that of the programmer), but any dissatisfaction needs to be directed at the organisation. I'm hardly an expert on the workings of the W3C, but it appears that this form of interface doesn't exist. The developer community aren't the only victims of the W3C process - it appears that the lack of support results in those who write the recommendations also joining this hallowed group. If the W3C wants to retain ownership (or whatever you want to call it) of the initiatives, then they need to provide support for the creators and accept responsibility for the results. This would result in a) discussions related to phrasing and construction of the recommendation being directed to one trained in the creation of spec documents, and b) technical discussion being directed to an organisation, not an individual who might (theoretically) have difficulty abstracting best practice from satisfaction in the finished product. Where's the project manager? -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Mon Feb 8 23:44:41 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:54 2004 Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054992@SOHOS002> Rob Schoening wrote: > The english language sentence "Grass is green" is an expression of the > proposition that grass is green. Similarly, I thought that: > > <plant> Grass </plant> is <color> green </color> > > was simply an XML/english sentence expressing not only the above > proposition, but also some explicit markup relations, namely > that "Grass" is > in a plant context and "green" is in a color context. Not sure why this should be the case. Surely it expresses whatever the application interpreting the data says it expresses - which may or may not be the proposition "Grass is green". This may for example be a description of the fact that the atomic power plant, code-named 'Grass' is no longer in emergency status, and has returned to 'green'. All XML has done is saved you from devising your own file structure, tagging format, or whatever. Anyone who has ever worked on streams of data, say like MarketLine in the Stock Exchange, or wherever, will know that you spend half your life trying to come up with clever ways of packaging up data: SOH 0xFF EOH SOM 0xFE G R A S S 0xFD 0x20 I S 0x20 0xFC G R E E N 0xFB EOM SOH is start-of-header, SOM is start-of-message, 0xFE means here comes a 'plant' and 0xFD means 'that's the end of your plant', and so on. 
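[Illustration] To make the hand-rolled framing above concrete, here is a minimal sketch of a decoder for that one example message. The byte values chosen for SOH, EOH, SOM and EOM are hypothetical (the post names the markers but gives no values), the function name parse_message is made up for this sketch, and 0xFC/0xFB are read as colour start/end only by analogy with the plant markers.

```python
# ASSUMPTION: marker values are invented for illustration only.
SOH, EOH, SOM, EOM = 0x01, 0x02, 0x03, 0x04

# From the post: 0xFE opens a plant field and 0xFD closes it.
# ASSUMPTION: 0xFC/0xFB play the same role for a colour field.
FIELD_MARKERS = {
    0xFE: ("plant", 0xFD),
    0xFC: ("colour", 0xFB),
}


def parse_message(frame):
    """Split the body between SOM and EOM into (field, text) pairs.

    Bytes outside any tagged field are reported as ("text", ...).
    """
    start = frame.index(SOM) + 1
    end = frame.index(EOM, start)
    body = frame[start:end]

    fields = []
    i = 0
    while i < len(body):
        if body[i] in FIELD_MARKERS:
            name, close = FIELD_MARKERS[body[i]]
            j = body.index(close, i + 1)      # find the matching end marker
            fields.append((name, body[i + 1:j].decode("ascii")))
            i = j + 1
        else:
            j = i
            while j < len(body) and body[j] not in FIELD_MARKERS:
                j += 1
            fields.append(("text", body[i:j].decode("ascii")))
            i = j
    return fields


# The example message from the post, using the assumed marker values.
frame = (bytes([SOH, 0xFF, EOH, SOM, 0xFE]) + b"GRASS" + bytes([0xFD])
         + b" IS " + bytes([0xFC]) + b"GREEN" + bytes([0xFB, EOM]))
print(parse_message(frame))
```

The sketch has no escaping at all, so a payload that happens to contain one of the marker bytes would be misparsed.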
And then of course some smart-arse would come along and say, I need a 0xFE byte in my 'plant' field, so you'd have to invent a way of escaping it so that it didn't mean 'here is a plant' in certain situations. And then someone would decide that it was a serious error if a colour field contained a plant field, or there should only be one colour per plant in any given message, and ... whoa! You spent most of your coding time writing parsers! The same goes for file formats - every time you started a new project you devised a new file format. > So, while XML (the language) may be just a file format, XML (in the > colloquial sense) is much more! As I'm saying - 'just' a file format is a miracle, be grateful! Try coming at it from another direction - if it is such a trivial thing, then how come it took so long to get here? > Unfortunatey it seems that that W3C is focusing on the XML > language, not XML > in general. And THAT is their major contribution. If everyone sat around saying let's devise the standard for video, and let's devise another standard for sound, and another for the height of basketball players, and no-one said, "hang on, let's standardise the standards", we would have nothing to argue about on this list! > XML technologies have to potential to allow us to have a common > representation of data on-disk, in-memory and > across-the-wire. This is > really powerful. But until we get unstuck from the > linguistic details of > the first, we will have nothing more than yet another file format. No - it has the potential to have a standard *interface* between these things. If I want to store my data in reverse-Polish Hamming-coded object structures, using the eyelids of lizards to represent base-4 numbers (two lizards per byte), then that's up to me. I can exchange the data with you though, by hopping out to XML and sending you my document and a DTD. And you don't need to write a JDBC-to-lizard interface to send the data back to me, either. 
You don't care what I use, you just send me an XML document, knowing that because the language is well-defined, I can at least parse it. > Until the relationships between all these XML technologies > are laid out, I'm > afraid that the focus on the XML language as the centerpiece > will skew the > whole effort and limit its potential. But the technologies are only just beginning to be developed - thanks to the standardisation of standards - so how can you possibly lay these relationships out without being prescriptive. RDF, for example, is not 'part' of XML, it is a standard that *uses* XML to provide an easily understandable interface to the information it can represent. And new interfaces will be developed at an increasingly rapid rate. As they say, you ain't seen nothing yet! Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Tue Feb 9 00:04:28 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:08:54 2004 Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Namespaces Message-ID: <3.0.32.19990208160244.00a60680@pop.intergate.bc.ca> At 11:51 PM 2/8/99 -0000, Mark Birbeck wrote: >If I want to store my data in reverse-Polish Hamming-coded >object structures, using the eyelids of lizards to represent base-4 >numbers (two lizards per byte), then that's up to me. You'd better not do that; it's covered by US Patent 4,234,611. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From daniela at cnet.com Tue Feb 9 00:14:32 1999 From: daniela at cnet.com (Daniel Austin) Date: Mon Jun 7 17:08:54 2004 Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Na mespaces Message-ID: <77A952A6B467D211855D00805F9521F1149251@cnet10.cnet.com> > -----Original Message----- > From: Tim Bray [mailto:tbray@textuality.com] > Sent: Monday, February 08, 1999 4:04 PM > To: Mark Birbeck; xml-dev@ic.ac.uk > Subject: RE: What Clean Specs Achieve, WAS: Colonialism, SAX, > Java, and > Namespaces > > > At 11:51 PM 2/8/99 -0000, Mark Birbeck wrote: > >If I want to store my data in reverse-Polish Hamming-coded > >object structures, using the eyelids of lizards to represent base-4 > >numbers (two lizards per byte), then that's up to me. > > You'd better not do that; it's covered by US Patent 4,234,611. -Tim > I checked; this one is patented by Microsoft too. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Tue Feb 9 00:15:28 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:08:54 2004 Subject: What Clean Specs Achieve, WAS: Colonialism, SAX, Java, and Na mespaces Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054993@SOHOS002> Tim Bray wrote: > Mark Birbeck wrote: > >If I want to store my data in reverse-Polish Hamming-coded > >object structures, using the eyelids of lizards to represent base-4 > >numbers (two lizards per byte), then that's up to me. > > You'd better not do that; it's covered by US Patent 4,234,611. -Tim > Ha! I was sitting here waiting to see who was going to take the trouble to pull me up on miscounting the number of lizards that make up a byte, and I thought this was going to be it. Well done everyone for resisting the temptation! Regards, Mark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Tue Feb 9 00:41:01 1999 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:08:54 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) 
In-Reply-To: <199902080533.WAA02032@malatesta.local> (uche.ogbuji@fourthought.com) References: <199902080533.WAA02032@malatesta.local> Message-ID: <199902090035.SAA01108@bruno.techno.com> [ Eliot Kimber: ] > > Or said another way: there's no magic in the DOM (or groves or XML) that > > will make storing and managing business objects easier. [ Uche Ogbuji: ] > I think this is the crux of the matter, and exactly what the "XML is not a > universal hammer" folks (myself included) have been trying to get across. Basically, I agree with these sentiments. However, I would not want the casual reader to take away from Eliot's excellent rant the idea that the relationship between objects and their XML serializations need be entirely arbitrary. The grove paradigm permits interchangeable information and ready-to-use objects to be quite precise, rigorous, and non-arbitrary reflections of one another. The exact nature of the relationship between the two kinds of information can be expressed in an Architecture Definition Document. A rigorous Architecture Definition Document ideally contains: (1) a DTD (the formal description of the interchangeable form of the information), (2) a Property Set (the formal description of the ready-to-use objects and their relationships to one another), and (3) natural language text that explains the nature of the relationship between the two, if necessary including algorithms that describe the transformation between them. In effect, a Property Set can describe a specialized DOM to the specialized meanings of a particular class of information assets, while a DTD can describe the interchangeable form of such assets. 
This means that, in those applications of XML in which absolute precision and uniformity is required in an open, multivendor environment, and when the high cost of thinking carefully about the exact nature of the information set being interchanged can be afforded, there is a very good answer, and there is an internationally standard way to express and publish the necessary constraints. -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Tue Feb 9 01:08:46 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:08:54 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <370CE4A4.76D43491@prescod.net>; from Paul Prescod on Thu, Apr 08, 1999 at 12:17:24PM -0500 References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> Message-ID: <19990209120827.A8517@io.mds.rmit.edu.au> On Thu, Apr 08, 1999 at 12:17:24PM -0500, Paul Prescod wrote: > Marcelo Cantos wrote: > > > > ... [best of both worlds] ... You get a nice object oriented > > layer on top to talk to, and an industrial strength, robust > > repository underneath. > > > > Your comments give me the impression that this is unacceptable to > > you in the XML/heirarchical universe. You don't want DOM at any > > level. You insist on going straight to objects. 
It is not even > > good enough to build an object layer on top of the DOM layer. I > > find this a little implausible and hence am certain that you had > > something else in mind. Is it rather that you simply don't care > > what the underlying API is, that you are only interested in what > > happens at the object level? > > If I had evidence that a bottom-level XML/"DOM" layer would "buy me" > an industrial strength, robust repository then I would go for it. As > you have pointed out, I can cover up the ugliness with objects. But > to me, an industrial strength, robust repository implies > sophisticated tree-smart *and* link-smart ad hoc query support. The > DOM isn't a query language and doesn't (AFAIK) have a query > interface. It might be okay as an API to the results of a query but > even there I'm leery... I agree with all this. If you're dealing with objects, go with OODB. I think, however, that the situation is far less clear when we are dealing with pure data structures as opposed to first-class objects with behaviour. When it comes to maintaining and querying a large database of _data_ (not objects), I believe a text retrieval engine will generally outperform an object database and often by several orders of magnitude (witness Eliot Kimber's anecdotal post). If scalability and performance are an issue (and, judging by recent discussions, they often are) then text retrieval technology becomes much more attractive. Object databases excel in the area of expressiveness which enables them to support much more complex queries than we can. At present, our product (SIM) doesn't support ad hoc queries. It is more like a relational database in that you define fields, which can be physical fields or calculated fields (this means we support arbitrarily complex structure, but have to decide in advance which set of queries to support, a compromise that has kept our customers happy so far). We are, however, looking at full structure queries in the near future. 
So while the IR community is closing the gap in the area of expressiveness, I wonder if the Object community can catch up in the area of performance (or maybe it's already there and I just don't know it). > Since trees can be built as a special case of links, I tend to look > for such a beast to come out of the OO world (where links are > usually primary) instead of the text processing world (where the > tree is usually primary). Maybe you guys at rmit.edu can surprise > me though. We certainly hope so. Our customers constantly praise the performance of SIM. However, we definitely see a strong need to beef our product up in the standards area. We are looking into support for XQL and DOM (we have the framework to incorporate both without too much effort. In fact DOM is almost in since it is quite similar to our existing model. XQL is somewhat more effort, but the path indexing required to support multi-gigabyte queries would require little effort--the hard part is query evaluation and, more importantly, optimisation). > But note that a DOM-on-the-bottom is the opposite of the > architecture that I am speaking out against. I'm concerned about > people who want to layer the DOM on "top" of things that do not look > substantially like XML. In that case you are covering up an > optimized, purpose-built abstaction with a homogenized "dumb tree" > layer. That's a step backwards. Note that even the DOM creators do > not view an XML-DOM as a "universal tree API." That's why there are > several variants of the DOM -- for XML, HTML, CSS etc. I must conclude from this that we have little to disagree about in terms of the uses for DOM. I had misunderstood you to mean that DOM is _never_ appropriate for the bottom layer, and I, coming from the document repository universe, would have disagreed. Having said that, however, we tend to view DOM more as a box ticking exercise, since it doesn't really give SIM anything it doesn't already have, albeit in a non-standard way. 
My views on Object databases are ambivalent. Their highly expressive nature seems unfortunately coupled with poor performance. However, my opinion may be skewed by the very few attempts I've personally seen at piggy-backing a text retrieval engine on an Object database (or, for that matter, on a relational). Cheers, Marcelo Cantos -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Tue Feb 9 01:13:07 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:54 2004 Subject: "Namespaces in XML" idea? References: <3.0.32.19990208145800.00b99390@pop.intergate.bc.ca> Message-ID: <36BF8AC0.EFDF60BA@infinet.com> Tim Bray wrote: > At 05:06 PM 2/8/99 -0500, David Megginson wrote: > >In response to several private requests, I've recast my posting into a > >simple HTML page at the following location: > > David, this is a terrific piece of work. One of your answers makes > me nervous: > > [Q] What is the name of the third attribute in the example above? > [A] The name is xmlns:z from the XML 1.0 perspective; from the Namespaces > perspective, this attribute is a declaration. > > Somehow the part beginning "... from the Namespaces perspective" feels > a bit like the question is being dodged: I'm spinning my wheels a bit > in trying to suggest something better. The point that's not being made > is that in a namespace-aware system, the xmlns:z attribute is > possibly just not there. 
For example (I just checked) if you use
> expat and enable namespace processing, you just don't see the
> xmlns attributes - I don't think this is compulsory, but it also
> isn't surprising. I'll try to think this over some more, but somehow
>
> [Q] What's it's name?
> [A] It's a declaration.
>
> doesn't quite feel satisfying. -Tim

It would be nice if, when the namespaces recommendation is updated, it were made a formal requirement that a conforming namespace-aware processor strips namespace nodes from the document model before presenting the parsed content to the application. Even though I don't think namespace declarations should be attributes in the first place (I think a PI acting as a pseudo-element would be a better choice, for a lot of reasons I won't delve into right now), I think this would make dealing with namespaces a lot easier in XML.

Another idea I recently had is to split XML into two types: XML 1.0 and XML with namespaces. As a convention, XML files without namespaces would have a suffix of "xml" and a MIME content type of text/xml or application/xml. XML files with namespaces would have a suffix of "nxml" and a MIME content type of text/nxml or application/nxml. This would allow applications to know what they are dealing with in the first place. In terms of XSL, it would be nice because it would allow 90% or more of the web to use XML 1.0 and not have to worry about namespaces, and the 10% who need namespaces can use NXML.

If this were formalized, I would no longer have any reason to complain, as the users I am targeting could use plain old XML, and people whose applications need namespaces could support NXML. In this sense "Namespaces in XML" would be defined as an extension of XML 1.0. We could keep the DOM as is and create an NDOM (or something like that) which handles the "Namespaces in XML" processing model natively. Last but not least, you could have SAX and NSAX.
We would not have to worry about messing too much with the core SAX interfaces that everyone seems happy with, yet we could define some new interfaces for SAX that are better suited to the namespaces processing model.

Would this be a terrible compromise? I think this might be the sort of compromise that would make everyone happy here, as it would:

- Not require changes to XML 1.0
- Not force XML architectures to handle the namespaces processing model if they don't want to
- Allow those who need XML namespaces to use them at their discretion
- Provide a framework for building extensions to XML in the future
- Not require recommendations and standards such as the DOM and SAX to change radically, but rather allow them to be extended or redefined with a clean set of interfaces.

If namespaces take off (personally I doubt this will ever happen), then users can gracefully abandon support for XML 1.0 in their applications if they choose to. If namespaces never catch on, then no one is hurt by having to deal with namespaces.

I think a clean separation of XML 1.0 and "Namespaces in XML" is the only way to bring consensus to this issue as well as provide two clean frameworks for application developers to use for different needs. Plain and simple, comparing XML 1.0 and "Namespaces in XML" is like comparing apples to oranges. Yes, they are both fruits and taste sweet, and some people prefer apple juice to orange juice, but few people like apple juice mixed with orange juice.

Is this too radical an idea for the W3C to pursue?

Tyler

xml-dev: A list for W3C XML Developers.
From bckman at ix.netcom.com Tue Feb 9 01:13:37 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun 7 17:08:54 2004
Subject: "Clean Specs"
Message-ID: <00c001be53c8$97915000$08addccf@ix.netcom.com>

This may be a little off topic, but here is something I wrote for a magazine some time ago. I hasten to add that this draws on several years' experience and bears no relation whatsoever to any specs I may have edited for the W3C.

"How specifications get written"
================================

"This is a true story, only the facts have been changed to protect the innocent"

Axiom 1. "A camel is a horse designed by a committee"
Axiom 2. "Those who can, do; those who can't, write specs; and the remainder criticise them."

20 people sit in a room. After a lot of discussion, some of it pretty heated, something approaching a consensus emerges. The chairman has already separated 'Bob' and 'Matt', who at one time appeared to be on the verge of coming to 'fisticuffs', and they both agree that they can live with the current compromise. It should be noted that 'Fred' has been quiet throughout the whole of the meeting.

This is put into a document by Karl, who is fairly experienced at writing specs. At this stage it is a coherent, logical, well-written document. The deadline to go public is 10 days hence. The document is posted for internal review.

d-day minus 10 to d-day minus 5
Not a single suggestion or comment is made.

d-day minus 5 to d-day minus 2
Several people suggest minor alterations, mainly of a textual nature. These Karl incorporates into the draft.
d-day minus 2
Fred posts a long, rambling e-mail to the mailing list which can fairly be described as a mixture of gobbledygook and vitriol. He re-opens all the old wounds as well as making three suggestions, described by him as 'vital', that have never been mentioned before.

d-day minus 1
Bob and Matt post rancorous messages saying how they completely agree with Fred and disagree with each other. They appear to have entirely reversed their positions from the previous meeting. It is obvious to the Chair that a tele-conference needs to take place, so this is organized for the next day.

The tele-conference takes place, lasting an hour and a half. Nobody gets anywhere, and the Chair makes a whole series of executive decisions. Karl is asked to incorporate everyone's suggestions, plus the 'executive decisions', into the draft. In the last 30 seconds of the tele-conference he tries to clarify what everyone's position is. He posts a revised draft, uncomfortably aware that it is no longer the coherent document that it once was.

Karl is immediately inundated with outraged e-mails from the chief protagonists, Bob, Matt, and Fred, asking him why he has deliberately distorted their views. He also gets pointed questions from the rest of the group asking on what authority he has altered material that had already been firmly agreed on by the whole group.

This is just before the weekend. Karl, having spent most of his free time in the last month trying to get the spec in order, has been threatened by his wife that unless he takes her away for the weekend, she will strongly consider divorcing him. Karl passes the buck to the Chair. The Chair and the Staffer, neither of whom are technical writers, cobble something together as best they can by blurring over the more controversial parts of the spec and adding all the comments from the most vocal of the protesters. This is then published as a 'working document'.
----- Original Message ----- From: Marcus Carr <mrc@allette.com.au> To: <xml-dev@ic.ac.uk> Sent: Monday, February 08, 1999 6:13 PM Subject: Re: "Clean Specs" > >Lauren Wood wrote: > >> On 8 Feb 99, at 9:14, James Robertson wrote: >> >> > Well, has anyone considered employing real, professional technical >> > authors to write the specifications? >> >> As chair of the DOM WG, I (and I think the editors of the specs) >> would be overjoyed were someone to volunteer the services of a >> real, professional technical author who could help in the process of >> getting good specs out the door. However, as has been pointed out >> by others on this list, this support is difficult to find, as W3C >> seldom has these resources available. > >Should we assume from this that TBL's day ends with emptying the rubbish bins and vacuuming >the office? James' suggestion is right on the money - this shouldn't be considered to be a >luxury, it should be an integral part of the process. > >To me, this whole debate taking on the feel of a programmer being thrown to the mercy of the >client while the project manager stands by watching. Even if Tim Bray did have doubts about >the namespace recommendation (and I have no reason to believe that this is the case), he (like >anyone else) feels that he has no option but to defend what we consider to be at least partly >"his work". We shouldn't be snarling at Tim and he shouldn't be snarling back - the whole >process should be elevated to discussion between the wider community (the client) and the W3C. >Obviously Tim's input would be valued (as would that of the programmer), but any >dissatisfaction needs to be directed at the organisation. I'm hardly an expert on the workings >of the W3C, but it appears that this form of interface doesn't exist. > >The developer community aren't the only victims of the W3C process - it appears that the lack >of support results in those who write the recommendations also joining this hallowed group. 
If >the W3C wants to retain ownership (or whatever you want to call it) of the initiatives, then >they need to provide support for the creators and accept responsibility for the results. This >would result in a) discussions related to phrasing and construction of the recommendation >being directed to one trained in the creation of spec documents, and b) technical discussion >being directed to an organisation, not an individual who might (theoretically) have difficulty >abstracting best practice from satisfaction in the finished product. > >Where's the project manager? > > >-- >Regards, > >Marcus Carr email: mrc@allette.com.au >___________________________________________________________________ >Allette Systems (Australia) www: http://www.allette.com.au >___________________________________________________________________ >"Everything should be made as simple as possible, but not simpler." > - Einstein > > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Tue Feb 9 01:20:49 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:08:54 2004 Subject: Fw: "Clean Specs" Message-ID: <001d01be53ca$177008b0$2ee044c6@arcot-main> >Can anyone within the W3C verify that there are there any plans for taking DOM Level 1 past the >current recommendation. DOM Level 1 did not exactly have a 1.0 status as far as I remember. Tyler, Future work on DOM Level 1 will probably be limited to the Errata section. This means that the namespace issues will have to be dealt with in DOM Level 2. I would like to suggest that you bring up the DOM-related namespace issues in the DOM mailing list so that we can focus on it specifically from the DOM perspective. Best, Don Park Docuverse xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Tue Feb 9 01:38:18 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:08:55 2004 Subject: URL: 19 Short Questions about Namespaces (with Answers) References: <3.0.32.19990208145800.00b99390@pop.intergate.bc.ca> Message-ID: <36BF8404.260AF76A@jclark.com> Tim Bray wrote: > > At 05:06 PM 2/8/99 -0500, David Megginson wrote: > >In response to several private requests, I've recast my posting into a > >simple HTML page at the following location: > > David, this is a terrific piece of work. One of your answers makes > me nervous: > > [Q] What is the name of the third attribute in the example above? > [A] The name is xmlns:z from the XML 1.0 perspective; from the Namespaces > perspective, this attribute is a declaration. > > Somehow the part beginning "... from the Namespaces perspective" feels > a bit like the question is being dodged: I'm spinning my wheels a bit > in trying to suggest something better. The point that's not being made > is that in a namespace-aware system, the xmlns:z attribute is > possibly just not there. For example (I just checked) if you use > expat and enable namespace processing, you just don't see the > xmlns attributes expat should probably have a start/endNamespaceDeclaration callback: some applications (such as XSL) need to know what declarations are in effect. > - I don't think this is compulsory, but it also > isn't surprising. I'll try to think this over some more, but somehow > > [Q] What's it's name? > [A] It's a declaration. > > doesn't quite feel satisfying. Seems OK to me. 
Compare: [Q] What is the target name in <?xml version="1.0"?> processing instruction? [A] From the XML 1.0 perspective, it's not a processing instruction, it's a declaration. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Tue Feb 9 02:34:05 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:08:55 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <370CF210.759B26AB@prescod.net> Message-ID: <000901be53d3$fd211080$d3228018@jabr.ne.mediaone.net> Paul Prescod wrote: > > > "Borden, Jonathan" wrote: > if the data is tabular then use a recordset. in the > specific cases when > > 1) we are storing data which is naturally hierarchical. 2) when the data > > needs to interface with systems which for other reasons employ DOM > > interfaces > > Okay. We can probably all agree with this. If you have software that is > expecting a DOM and you need to connect it to data that is not XML, you > need to build a DOM interface. This is a different point of view from > those who say: "let's build new client software using only the DOM served > by data with only a DOM interface. The fact that the DOM is standardized > will just make all of my interoperability problems go away." No way. If > your client software and your server software had an impedence mismatch, > slapping a DOM interface on both sides makes it *worse* not better. Yes we can agree (wonders never cease :-) hope you don't mind my dragging this debate out, but I believe it is an important issue. 
Some estimates suggest that corporate programmers spend up to 60-70% of their time dealing with database interface issues.

>
> > e.g. my XSL processor is built on a DOM interface and I wish to
> > query the database using XQL (which happens to be built into my XSL
> > processor in this example), it is more convenient to interface to the data
> > using DOM interfaces than it is using recordsets (i.e. tabular data).
>
> It's more convenient but it's probably going to run as slow as hell.
> Nobody implements SQL or OQL on top of an industry-standard interface.
> They put it right in the core engine of their database.

Performance will depend on implementation. Just as SQL or OQL sits in the core of the database engine, XQL can be placed there too. Query optimization strategies apply. For example, Oracle 8i is reportedly placing an XML DOM parser in the database engine. What this will mean for performance I don't know, but I suspect that the XML people at Oracle are considering these issues. XQL being a Microsoft et al. proposal suggests that Microsoft is also considering these issues. I would hazard that IBM is considering this as well. I suppose we will at some point see what they come up with. If it works well, I'm correct; if not, you're correct.

>
> > Arguably, when using an ODBMS this example would be more straightforward
> > (but you picked RDBMS). The problem is that there is no standard, language
> > independent interface onto ODBMS's.
>
> ********** Yes there is! *************
>

Whoops! I miswrote. I meant to say that there is no standard, *widely implemented*, language-independent interface to ODBMSs. Certainly the OQL and ODMG standards exist. However, the ODBMSs I've looked at require vendor-specific interfaces/code. In contrast, I can generally plug in an ODBC driver and switch from Oracle to DB2 to SQL Server (etc.) without changes to the client code. This is a true interface.
True, the syntax of OQL is standardized, but the mechanism of interaction with ODBMSs is generally vendor dependent. That is to say, SQL is the language standard but ODBC (or JDBC) is the binary interface.

What I want is a simple way for a server-based script (e.g. JavaScript, Python) to interface with a database. ADO and OLE-DB provide an effective recordset object layer which allows access to tabular data from server scripts (e.g. ASP). The use of the SHAPE command allows hierarchical recordsets. These recordsets can be transmitted back to the client via a custom binary marshalling interface, or as XML with just a tad of work. How can I do this using ODBMSs and OQL (last I checked there was no scripting interface)? This is a basic e-Commerce function.

>
> > For example, I get to say (using 'extended DOM'):
> >
> > NodeList anotherSet = airplanes.selectNodes("airplane[@color='red' and .//screw/thread/@pitch = 64]");
> >
> > to select all red airplanes with screws having a pitch=64...
>
> The DOM is doing essentially nothing here. This imaginery XML query
> language is doing all of the work. But even the XML query language is
> going to make solving your problem harder than OQL would. For instance OQL
> can be statically type checked. XQL cannot, in general, for many subtle
> reasons. OQL can handle mathematical range constraints. OQL has a concept
> of a "stored query" that allows some level of abstraction. OQL has "local
> variables" also for abstraction.

I wish to make a distinction between DOM Level 1, which IMHO lacks some notable features (particularly the ability to generate trees, etc.), and vendor-specific extensions. The latter, notably Microsoft's, include the ability to run XSL queries against DOM trees. I employ this as a lightweight, browser-based, local, in-memory database (scriptable). In IE5b2 "selectNodes" is part of the IXMLDOMElement interface (a Microsoft extension), not the IDOMElement interface.
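
As a rough illustration of the selection discussed above (red airplanes that contain a screw whose thread pitch is 64), here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not the Microsoft selectNodes extension and not XQL. The sample document, the id attribute, and the helper name select_red_with_pitch are assumptions invented for the example. ElementTree supports only a subset of XPath, so the descendant test is written in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element and attribute names follow the query
# quoted in the message, the "id" attribute is added for the example.
SAMPLE = """
<airplanes>
  <airplane id="a1" color="red">
    <wing><screw><thread pitch="64"/></screw></wing>
  </airplane>
  <airplane id="a2" color="red">
    <wing><screw><thread pitch="32"/></screw></wing>
  </airplane>
  <airplane id="a3" color="blue">
    <tail><screw><thread pitch="64"/></screw></tail>
  </airplane>
</airplanes>
"""


def select_red_with_pitch(root, pitch):
    """Return red airplane elements containing a screw/thread with this pitch."""
    wanted = str(pitch)
    matches = []
    # The attribute test on airplane is expressed in ElementTree's XPath subset.
    for plane in root.findall("airplane[@color='red']"):
        # The descendant test (.//screw/thread/@pitch) is done in Python.
        if any(t.get("pitch") == wanted for t in plane.iterfind(".//screw/thread")):
            matches.append(plane)
    return matches


root = ET.fromstring(SAMPLE)
selected_ids = [p.get("id") for p in select_red_with_pitch(root, 64)]
print(selected_ids)
```

The point of the sketch is only that the query is a filter over a tree held in memory; a query engine inside a database could answer the same question from an index instead of walking the nodes.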
> > I don't completely follow your examples: > > > XMOP for example > (http://jabr.ne.mediaone.net/documents/xmop.htm) is a way > > to serialize arbitrary COM objects using their typeinfo > metadata. XMOP is a > > layer that can persist objects into either a) a stream > (serialization) b) > > direct-to-DOM. When I attempted to design a direct-to-Recordset > persistence > > interface on XMOP I found that I had to essentially develop a > > DOM<->Relational mapping. This is because arbitrary objects can > be modelled > > in a hierarchical fashion (e.g. serialized to XML). > > This seems like a serialization problem. We all agree that XML is great > for serialization. If your only goal was to get the data into a "database > of some kind" then an OO database would have been easier than an XML > database. Yes a serialization problem and an example of how XML can be used to represent arbitrary object data. COM unlike CORBA does not provide an automatic persistence function. XMOP is a persistence interface for COM objects which does not require the object to know about serialization (it need only have a typelibrary). > > > In another example, using the medical imaging DICOM > protocol (a complex > > property based protocol) I have developed a mapping to the Microsoft > > PropertySet format (used with Index Server). This mapping is > not clean (at > > all given the inability to represent certain DICOM structures as > > PROPVARIANTs). This causes similar problems in mapping the protocol to a > > relational database (the workaround is to use binary data). > Using XML and > > the DOM was a piece of cake to solve this difficult problem. > > I'm not at all clear on how the DOM solved this impedence mismatch. Just to point out that the XML (and DOM) data model can solve certain problems which aren't trivially solved in the relational mode. No big deal. This is also easily solved by OODBM's. > Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Tue Feb 9 02:56:44 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:08:55 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: Your message of "Thu, 08 Apr 1999 12:32:39 EST." <370CE837.DB8E5EFC@prescod.net> Message-ID: <199902090258.TAA03761@malatesta.local> > > I don't think it makes sense to build a business-object model on top of DOM, > > but I do think it makes sense to define an exchange protocol that selializes > > objects to XML representations using DOM as a programmatic interface. > > I agree. I'll point out, however, that it is REALLY EASY to generate XML > directly. In your opinion does the DOM actually make it easier? > > If you use a "reverse SAX" interface (instead of a DOM-building interface) > then you could pipe together data consumers and if any of them ever needed > a DOM, it could build it. I think it depends on several things: 1) The language in which you're implementing the serialization. In Python, with its rich string handling and dynamic programming features, I might prefer to generate XML directly, but in Java or C++, I might prefer to go through DOM. 2) Your environment. I have often ended up using DOM because I'm working in a distributed environment, using CORBA, and it makes it very easy and natural to just call DOM interfaces across the ORB as a sort of serialization. 4DOM actually came about because we already had a CORBA-ready API for manipulating HTML-based views of an object across an ORB, and we wanted to expand this so we could take advantage of XML. 
The W3C's work provided a natural spring-board. Of course, on the same machine, it's probably easier and faster to use an events-based approach (SAX and your "reverse SAX", something like which I remember having cobbled together, in fact). 3) Your object structure. Some of the advocates of DOM as the universal object-model note that they are working in domains that fall into natural tree structures. In this case, there is clearly less impedance mis-match in using the DOM interface, and one can even smartly use the Builder pattern to connect abstraction to serialization interface if he's _very_ confident in the quality of the design. Alas the reality is that such natural tree-representations are not as common in real-life as some would have us believe. Business object models, as you rightly pointed out, more often take on the pattern of a graph (bi-directional, cyclic, and all those other tree-killers). > > I think it also makes sense to use the DOM to develop a user-interface layer > > for such objects, possibly using the same WDDX or XML-RPC mappings in > > association with a set of style-sheets (although this is just one of many > > possible mechanisms). > > Yes, it makes sense to use XML as an "interchange language" between your > business objects and your user interface. On the other hand, if that > interface is meant to be editable the information loss associated with > "dumbing down" to XML may not be acceptable. I agree with this, but in my opinion, user-interface hasn't caught up with object-modeling practice in any case. XML does cause "dumbing-down", but not much more than other user-interface options. How does one go from a typical object model, with as many degrees of freedom as most object models entail, to presenting a linear form as an effective editing interface? I don't claim to have any visionary ideas here, but I get the sense that the next big breakthrough (or hype-engine) in the object community will have to be at the fundamental UI level. 
Or did it already pass us by in the (over-wrought) forms of OpenDoc or Pink?

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From jborden at mediaone.net Tue Feb 9 03:50:16 1999
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:08:55 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
In-Reply-To: <199902090258.TAA03761@malatesta.local>
Message-ID: <000b01be53de$9f4d21f0$d3228018@jabr.ne.mediaone.net>

Uche Ogbuji wrote:

first:

> > I don't think it makes sense to build a business-object model on top of DOM,

and then:

> I have often ended up using DOM because I'm working in a
> distributed environment, using CORBA, and it makes it very easy and natural to
> just call DOM interfaces across the ORB

I too work in a distributed environment, until the last year in DCOM but increasingly using HTTP. In the multi-tier model, the client communicates with the middleware or business-object tier using an RPC or distributed object protocol. For the sake of discussion, let's assume that the DCOM wire protocol and CORBA IIOP have roughly similar functionality as object RPC protocols.
Good practice in such systems dictates that round trips between the client and business-object layers be minimized. (In fact, round trips between tiers in general should be minimized, except that when objects exist in the same process, or memory space, an object call is as efficient as any other in-process call: in C++ this is a vtable call; in Java, the difference is that between a normal method invocation and an RMI invocation.) Failure to minimize round trips is often the single biggest performance drain on otherwise well-designed systems. This is a tenet of distributed object systems design.

While I have advocated judicious application of the DOM interfaces in object systems, the DOM is best not employed as a business object itself (i.e. the client ought not to communicate directly with a distributed DOM), for the reasons outlined by Paul Prescod, as well as the fact that this use of the DOM will *maximize* client-server round trips. For example, when a NodeList is returned from a distributed DOM call, the client obtains a *proxy* to the NodeList, and iteration through this proxy results in a server round trip for each call.

It is far more efficient to select a document or document fragment using a distributed call and return this as XML directly. The returned XML can be parsed and iterated on using local client calls. My studies on files as large as 100 MB demonstrate that it is usually more efficient to download the entire file to the client for processing than it is to iterate over a file using distributed object calls. By defining a document fragment using a distributed call, returning the document or fragment in serial fashion (i.e. XML), and processing it locally, performance may increase by an order of magnitude.

>
> > Yes, it makes sense to use XML as an "interchange language" between your
> > business objects and your user interface.
> > On the other hand, if that interface is meant to be editable the
> > information loss associated with "dumbing down" to XML may not be acceptable.
>

Another way of saying this: extra work to smarten up the XML interchange format is well worth it (the problem is not with XML as an interchange format).

Jonathan Borden
http://jabr.ne.mediaone.net

From lauren at sqwest.bc.ca Tue Feb 9 04:46:07 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:08:55 2004
Subject: Fw: "Clean Specs"
In-Reply-To: <36BF5664.C0100DC5@infinet.com>
Message-ID: <199902090445.UAA09620@sqwest.bc.ca>

On 8 Feb 99, at 16:25, Tyler Baker wrote:

> Can anyone within the W3C verify that there are there any plans for taking
> DOM Level 1 past the current recommendation. DOM Level 1 did not exactly
> have a 1.0 status as far as I remember.

I chair the DOM WG, so I guess that qualifies me to answer this ;-)

There will be a DOM Level 2 (note we're not currently working on a version 2.0, i.e. we're not revising DOM Level 1, we're adding functionality). For more details on what this might mean, see the W3C DOM pages at http://www.w3.org/DOM or subscribe to the public DOM mailing list (instructions on the DOM pages).

cheers,

Lauren

xml-dev: A list for W3C XML Developers.
From lauren at sqwest.bc.ca Tue Feb 9 04:46:07 1999 From: lauren at sqwest.bc.ca (Lauren Wood) Date: Mon Jun 7 17:08:55 2004 Subject: "Clean Specs" In-Reply-To: <F7E1775C1C27D211881F00A024B2853046A01E@CESS01AMX03> Message-ID: <199902090445.UAA09618@sqwest.bc.ca> On 8 Feb 99, at 16:35, Buss, Jason A wrote: > Maybe it is time some of us who have been "put off" by the way the > Namespaces recommendation to offer our services, under the auspices of the > WG for XML and XML related standards, to go through and annotate the > drafts and recommendations, as they come up for the vote. The DOM WG would really appreciate anyone who took the time to go through the specs as they're prepared and point out inconsistencies, misleading wordings or errors, particularly if they can propose better wording or say what alternative meanings can be read into the prose we have. DOM Level 1 benefited a lot from those few people who did go through the drafts and say when we were missing exceptions, or when we needed to add something, or when the prose wasn't easy to understand, and I would like to see this continue. This is why W3C specs always have an email address to send comments to; the more detailed the comments the more it helps us fix the specs. I'd suggest sending concrete comments on the DOM specs to the public DOM mailing list rather than this one (www-dom@w3.org; to subscribe send email to www-dom-request@w3.org with the subject subscribe). And yes, the email is read, even if it isn't always answered directly. DOM Level 1 changed a lot because of public feedback and comments.
cheers, Lauren xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jackpark at thinkalong.com Tue Feb 9 04:50:44 1999 From: jackpark at thinkalong.com (Jack Park) Date: Mon Jun 7 17:08:55 2004 Subject: OS-XML was: Re: CORBA's not boring yet. / XML in an OS? In-Reply-To: <36BEC149.AB389419@darmstadt.gmd.de> References: <0d2701be5270$824dbca0$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <Version.32.19990208194456.00fc1430@thinkalong.com> Now, THIS seems worth talking about, even trying. Jack Park At 11:49 AM 2/8/99 +0100, you wrote: >James Tauber wrote: > >> >Anyhow, this naturally makes me wonder - could XML and related ideas >> >like XSL have a place in an operating system? Where would they fit in? >> >KDE and Gnome could be great playgrounds for trying something like this >> >out. >> >> For a while now, I've been thinking what an OS (or more likely shell) would >> look like if it took Unix's "everything as a file" to "everything as an XML >> element". > >Now this is interesting. > >> >> >> A system would be a single XML "uberdocument"... Applications ... would operate on other >> nodes in the element tree. >> >> There would be an application, for example, that got mail via POP or IMAP, >> represented it in XML and then attached it a particular point in the >> uberdocument. XSL could be used to sort the mail. XSL would also be used to >> view the mail. >> > >Great ideas. 
I can see that this would just follow the unix philosophy, and could actually be >useful: Just like how today, anyone can use the unix command line tools to pipe together >small apps to form "new" app/filters, in this XML/OS, someone could use XML/XSL parsers/apps >to connect and filter XML to create new apps and filters. > >The myriad of programs that operate and manipulate XML could manipulate any OS object, program >or data. For example, the IBM Alphaworks "Tree Diff" (a Java program that generates "diff" >info between two XML documents) can be applied to anything stored in the OS, in the same way >that the conventional diff can operate on any text file. > >> >> It's XML for the sake of it, but I think it would be fun to try out. >> > >Absolutely, but I'd also bet that some convincing arguments could be made for real advantages >of this. > >- Robb > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jackpark at thinkalong.com Tue Feb 9 04:50:50 1999 From: jackpark at thinkalong.com (Jack Park) Date: Mon Jun 7 17:08:55 2004 Subject: "Clean Specs" In-Reply-To: <36BE830B.CBBE6552@infinet.com> References: <3.0.32.19990207205824.00bed660@pop.intergate.bc.ca> Message-ID: <E10A55i-0005Ya-00@punch.ic.ac.uk> I confess, I am learning a lot, or perhaps becoming vastly more confused than ever. Not sure which. I cannot internalize all the venom regarding namespaces. You see, I'm pondering a task that I suspect is a lot larger than modeling business objects. I'm vastly more interested in modeling living objects, and using XML in the process. In the business world, it is reasonable to see an enterprise as a feedback network of interactive, and interacting objects. If I were to model the enterprise, not just, say, accounts receivable, I would be forced to model a variety of levels. I happen to imagine that the use of namespaces would make that a lot easier. I also happen to imagine that I'll need something a damn sight stronger than just an individual tree. Probably something along the lines of a massive relational graph, with each node itself a tree with nodes some of which are yet other relational graphs. The occasional mention of groves on this list prompts me to wonder if XML shouldn't include something like that, along with trees. Consider the notion of modeling a living cell (an enterprise unto itself :-) Starting out, I'd need a biological namespace, perhaps one dealing with cellular things. 
Then, there's the molecular level (already being done in XML), and below that, the chemical reactions and all that, perhaps Peter's CML. Again, when I think like that, it's hard to accept the venom. Is my line of thinking worth pursuing on this list? Cheers, Jack Park At 01:24 AM 2/8/99 -0500, you wrote: >Tim Bray wrote: > >> At 11:25 PM 2/7/99 -0500, Murray Maloney wrote: >> >I can claim that it is a ramshackle compromise because I was >> >witness to its creation. The process stunk to high heaven. >> >The result is an awful compromise, and not because I don't >> >like it. >> >> In fact, Murray disagrees so strongly with what the spec *says* >> (often, and on the record) that he is probably not the best judge >> of how well it says it. -Tim > >Well, who is the best judge then? I thought that standards bodies were largely in existence to >promote consensus on matters which companies and organizations disagree upon. Rather than >bring everyone together, this entire "Namespaces in XML" recommendation has splintered the >entire XML community. By that fact alone, the W3C is not doing a good job as a standards body >for the internet. > >I am a forgiving person when it comes to making one, maybe two complete blunders (such as the >case with "Namespaces in XML"), but many people are not as forgiving as I. Most of these >people don't post to this list or even subscribe to it. They would just look at "Namespaces >in XML" and then quietly go back to their current vendor specific solution for their >web-publishing and e-commerce needs and forget the draft ever existed. The same goes for >recommendations like XSL which are polluted with "Namespaces in XML" as well. They are the >real "silent majority" that the W3C seems to have complete disdain for.
> >The simple truth is that if the W3C does not behave more sensitive to criticism in the future >and conduct itself in a more utilitarian manner, or at least make a change in the leadership >of the organization, people like me and many others will clamor for creating another internet >standards body that is not as slow in adopting standards as ISO or ANSI, but is not as >obstinate as the W3C. > >Of course "Namespaces in XML" has to me been the only super-major screwup in the W3C's short >life as a budding internet standards organization. I guess the question now is whether or not >the W3C and its members have the courage to make the necessary changes. > >Tyler > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From larsga at ifi.uio.no Tue Feb 9 08:31:36 1999 From: larsga at ifi.uio.no (Lars Marius Garshol) Date: Mon Jun 7 17:08:55 2004 Subject: "Namespaces in XML" idea? 
In-Reply-To: <36BF8AC0.EFDF60BA@infinet.com> References: <3.0.32.19990208145800.00b99390@pop.intergate.bc.ca> <36BF8AC0.EFDF60BA@infinet.com> Message-ID: <wkr9s07zcx.fsf@ifi.uio.no> * Tyler Baker | | It would be nice if and when the namespaces recommendation is | updated if it mas made a formality that a conforming namespace aware | processor strips namespace nodes from the document model before | presenting the parsed content to the application. Do you think this is really necessary? I think in some cases it would be desirable to see them, and in most cases not. So even if this change is made I think it should be "at user option". Anyway, it's a rather obvious user option, so methinks there isn't all that much need to mention it in the spec. The xmlproc namespace code currently has an option that lets the user decide whether the declaration attribute should be reported or not. --Lars M. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Tue Feb 9 09:27:27 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:08:56 2004 Subject: "Namespaces in XML" idea? Message-ID: <01BE5415.DE5F3360@grappa.ito.tu-darmstadt.de> Lars Marius Garshol wrote: > | It would be nice if and when the namespaces recommendation is > | updated if it mas made a formality that a conforming namespace aware > | processor strips namespace nodes from the document model before > | presenting the parsed content to the application. > > Do you think this is really necessary? I think in some cases it would > be desirable to see them, and in most cases not. 
So even if this > change is made I think it should be "at user option". > > Anyway, it's a rather obvious user option, so methinks there isn't all > that much need to mention it in the spec. The xmlproc namespace code > currently has an option that lets the user decide whether the > declaration attribute should be reported or not. This is exactly the kind of thing that should be mentioned in this or any other spec -- one person's obvious is another person's obscure and the point of a spec is to sort it all out. By the way, I was a little surprised to read that expat does not pass xmlns attributes to the application. I had always assumed that attributes were part of the document's data and therefore the processor was required to pass them to the application. On re-reading the XML spec, I am not so sure. Section 2.4 defines markup as "start-tags, end-tags, empty-element tags, ..." Are attributes part of start-tags? If so, this would certainly influence the question of whether to use elements or attributes, although I find it hard to believe any parser wouldn't pass attributes to the application. If not, why isn't expat's behavior illegal? -- Ron Bourret xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Tue Feb 9 09:43:08 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:08:56 2004 Subject: "Namespaces in XML" idea? 
In-Reply-To: <01BE5415.DE5F3360@grappa.ito.tu-darmstadt.de> Message-ID: <Pine.GHP.4.02A.9902090936260.17997-100000@mail.ilrt.bris.ac.uk> On Tue, 9 Feb 1999, Ronald Bourret wrote: > Lars Marius Garshol wrote: > > By the way, I was a little surprised to read that expat does not pass xmlns > attributes to the application. I had always assumed that attributes were > part of the document's data and therefore the processor was required to > pass them to the application. [...] A pragmatic reason for passing on the xmlns:xyz information is that some applications might want to go beyond what 'Namespaces in XML' itself provides and use similarly abbreviated values _inside_ data attributes instead of full URIs (eg. for data types), and make use of the XMLNS info when doing so. eg. <ABC dt:dt="http://xmlschemas.org/useful/datatypes/Float">22.2232</ABC> versus <ABC dt:dt="USEFUL:Float">22.2232</ABC> (where USEFUL has an xmlns:USEFUL declaration somewhere). By cutting out this information at such a low level, applications won't be able to consider doing this sort of thing. Dan From david at megginson.com Tue Feb 9 12:38:32 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:56 2004 Subject: "Namespaces in XML" idea?
In-Reply-To: <Pine.GHP.4.02A.9902090936260.17997-100000@mail.ilrt.bris.ac.uk> References: <01BE5415.DE5F3360@grappa.ito.tu-darmstadt.de> <Pine.GHP.4.02A.9902090936260.17997-100000@mail.ilrt.bris.ac.uk> Message-ID: <14016.9017.974839.985439@localhost.localdomain> Dan Brickley writes: > > By the way, I was a little surprised to read that expat does not > > pass xmlns attributes to the application. I had always assumed > > that attributes were part of the document's data and therefore > > the processor was required to pass them to the application. > > [...] > > A pragmatic reason for passing on the xmlns:xyz information is that some > applications might want to go beyond what 'Namespaces in XML' itself > provides and use similarly abbreviated values _inside_ data attributes > instead of full URIs (eg. for data types), and make use of the XMLNS > info when doing so. The easiest way to resolve this question would be to decide that 1. Namespace declarations do not appear as attributes in the Namespaces view; and 2. some applications may require information from both the XML 1.0 and the Namespaces view concurrently. Note that neither is normative -- I'm proposing them only as rules of thumb. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Tue Feb 9 12:42:23 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:08:56 2004 Subject: "Namespaces in XML" idea? 
References: <01BE5415.DE5F3360@grappa.ito.tu-darmstadt.de> Message-ID: <36C028C9.DFBA6C14@jclark.com> Ronald Bourret wrote: > I was a little surprised to read that expat does not pass xmlns > attributes to the application. If you don't enable namespace processing, then expat of course does pass xmlns attributes to the application. However, if you do enable namespace processing, which means that prefixes get expanded into URIs, then expat does indeed not pass xmlns attributes to the application: firstly, because from the perspective of namespaces they are declarations, not attributes, and secondly, because there is no URI defined to which the xmlns prefix could be expanded. However, it will, once I get a chance to implement it, report namespace declarations made using xmlns attributes to the application using an additional callback. James From deke at tallent.com Tue Feb 9 14:51:22 1999 From: deke at tallent.com (Deke Smith) Date: Mon Jun 7 17:08:56 2004 Subject: HTML, XML, XML-RPC in one net app Message-ID: <1293550645-143917605@server2.tallent.com> Dave Winer, dave@userland.com said on 2/5/99 8:35 AM: >***HTML interface > >First, here's the HTML interface. > >http://www.mailtothefuture.com/ > >Please log on, get a password, create a message or two, become familiar >with how it works from a user's point of view. You'll definitely want to >have a couple of messages in your queue to try out the other examples. David, I am trying to do this sort of thing for a client of mine. Is this app part of the Nirvana release?
Deke ----------------------------------------------------------------- Deke Smith Tallent Communications Group, Brentwood TN deke@tallent.com, 615-661-9878 "Somebody has to do something, and it's just incredibly pathetic that it has to be us." - Jerry Garcia (of the Grateful Dead) ----------------------------------------------------------------- xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Tue Feb 9 16:03:39 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:08:56 2004 Subject: URL: 19 Short Questions about Namespaces (with Answers) In-Reply-To: <3.0.32.19990208145800.00b99390@pop.intergate.bc.ca> Message-ID: <199902091600.LAA22781@hesketh.net> At 02:58 PM 2/8/99 -0800, Tim Bray wrote: >For example (I just checked) if you use >expat and enable namespace processing, you just don't see the >xmlns attributes - I don't think this is compulsory, but it also >isn't surprising. This worries me - I'd like to think namespaces (and even their prefixes) should be able to survive a round trip through a parser and back to a file. Dropping the xmlns: attributes makes that a lot harder. Given that some documents may have to go through both namespace-aware and namespace-blind processing, that could lead to some serious pile-ups. Simon St.Laurent XML: A Primer / Building XML Applications (March) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
From clark.evans at manhattanproject.com Tue Feb 9 18:07:54 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:56 2004 Subject: Data Model Patterns (Was: Groves and Architectural Forms) References: <Pine.GSO.3.96.990209113541.10151F-100000@grind> Message-ID: <36C078B3.E70B3082@manhattanproject.com> Has anyone on the list read David Hay's book "Data Model Patterns"? His book describes patterns that arise when modeling systems using a relational database. It is *very* good. I've been doing corporate database work for 6 years at companies like Ford, GM (transitively), Gartner Group, as well as many small corporations. Anyway, his patterns hit home hard -- the models he has published are very close to those that I came up with after painful corrections. Anyway, towards the end of the book he gets a bit philosophical and proposes a "universal pattern". It is strikingly similar to Groves. I guess this should be no surprise... :) Clark Evans From paul at prescod.net Tue Feb 9 18:36:38 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:08:56 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <19990209120827.A8517@io.mds.rmit.edu.au> Message-ID: <36C07862.EA6B692B@prescod.net> Marcelo Cantos wrote: > > Object databases excel in the area of expressiveness which enables > them to support much more complex queries than we can. At present, > our product (SIM) doesn't support ad hoc queries. It is more like a > relational database in that you define fields, which can be physical > fields or calculated fields (this means we support arbitrarily complex > structure, but have to decide in advance which set of queries to > support, a compromise that has kept our customers happy so far). But what does a query consist of? Can it cross links? Links between documents? Multiended XLinks or HyTime links? Also, you say that SIM is good at "data, not objects." By data do you mean only textual data? Could you characterize the data it is best at handling? > So while the IR community is closing the gap in the area of > expressiveness, I wonder if the Object community can catch up in the > area of performance (or maybe it's already there and I just don't know > it). I don't think ODB performance is there yet, but the text storage people aren't the only ones who need it so I expect it will be someday. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Remember, Ginger Rogers did everything that Fred Astaire did, but she did it backwards and in high heels." --Faith Whittlesey xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Tue Feb 9 20:53:54 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:56 2004 Subject: "Namespaces in XML" idea? References: <3.0.32.19990208145800.00b99390@pop.intergate.bc.ca> <36BF8AC0.EFDF60BA@infinet.com> <wkr9s07zcx.fsf@ifi.uio.no> Message-ID: <36C09E54.BD19944C@infinet.com> Lars Marius Garshol wrote: > * Tyler Baker > | > | It would be nice if and when the namespaces recommendation is > | updated if it mas made a formality that a conforming namespace aware > | processor strips namespace nodes from the document model before > | presenting the parsed content to the application. > > Do you think this is really necessary? I think in some cases it would > be desirable to see them, and in most cases not. So even if this > change is made I think it should be "at user option". At least for interfaces like SAX I think this default "user option" should be to strip namespace nodes. > Anyway, it's a rather obvious user option, so methinks there isn't all > that much need to mention it in the spec. The xmlproc namespace code > currently has an option that lets the user decide whether the > declaration attribute should be reported or not. Yah that is true. I would just argue that the default should be to strip for the sake of consistency among XML namespaces processors or else the option not to strip should be a proprietary feature of an XML namespaces processor and that formal conformance should be for namespace processors to strip namespace nodes before presenting them to the application. Tyler xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Tue Feb 9 21:03:45 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:56 2004 Subject: "Namespaces in XML" idea? References: <01BE5415.DE5F3360@grappa.ito.tu-darmstadt.de> <Pine.GHP.4.02A.9902090936260.17997-100000@mail.ilrt.bris.ac.uk> <14016.9017.974839.985439@localhost.localdomain> Message-ID: <36C0A264.19CE4793@infinet.com> David Megginson wrote: > Dan Brickley writes: > > > > By the way, I was a little surprised to read that expat does not > > > pass xmlns attributes to the application. I had always assumed > > > that attributes were part of the document's data and therefore > > > the processor was required to pass them to the application. > > > [...] > > > > A pragmatic reason for passing on the xmlns:xyz information is that some > > applications might want to go beyond what 'Namespaces in XML' itself > > provides and use similarly abbreviated values _inside_ data attributes > > instead of full URIs (eg. for data types), and make use of the XMLNS > > info when doing so. > > The easiest way to resolve this question would be to decide that > > 1. Namespace declarations do not appear as attributes in the > Namespaces view; and > > 2. some applications may require information from both the XML 1.0 and > the Namespaces view concurrently. This is one of the chief reasons why maybe splitting XML 1.0 and XML 1.0 with "Namespaces in XML" into two separate beasts may not be such a bad idea. IMHO XML 1.0 without namespaces and XML 1.0 with namespaces are two totally different beasts. 
How you actually deal with processing XML 1.0 and XML 1.0 with "Namespaces in XML" will be a lot different at the parser level and application level (not to mention the parser interfaces you use) than I think a lot of people would like to believe. So as not to confuse end-users with having to dynamically configure the right parser interfaces to handle both plain old XML 1.0 and XML 1.0 with "Namespaces in XML", I would like to think of "Namespaces in XML" as an extension to XML 1.0 and not a "change" to XML 1.0. This way we don't have to pollute the XML 1.0 parser interfaces and application frameworks with "Namespaces in XML" (something I favor), as well as the fact that people can use clean parser interfaces and application frameworks to handle the "Namespaces in XML" flavor of XML. For the DOM itself, the current Level 1 Recommendation would not have to be changed; rather, you would just create an NDOM whose processing model is based completely on "Namespaces in XML" being the current data environment. Most XML Parser packages provide both a validating parser and a non-validating parser. What would be so bad about adding a non-validating namespaces parser, where validation would be done not by DTDs but by a more powerful schema language like DDML? Tyler
From tyler at infinet.com Tue Feb 9 21:16:30 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:56 2004 Subject: URL: 19 Short Questions about Namespaces (with Answers) References: <199902091600.LAA22781@hesketh.net> Message-ID: <36C0A564.7E7EFEF7@infinet.com> "Simon St.Laurent" wrote: > At 02:58 PM 2/8/99 -0800, Tim Bray wrote: > >For example (I just checked) if you use > >expat and enable namespace processing, you just don't see the > >xmlns attributes - I don't think this is compulsory, but it also > >isn't surprising. > > This worries me - I'd like to think namespaces (and even their prefixes) > should be able to survive a round trip through a parser and back to a file. > Dropping the xmlns: attributes makes that a lot harder. Given that some > documents may have to go through both namespace-aware and namespace-blind > processing, that could lead to some serious pile-ups. The clean way to handle this would be to formally split XML 1.0 and XML 1.0 with "Namespaces in XML" into two totally separate recommendations (in a sense they are that way, but the "Namespaces in XML" recommendation does not say this one way or the other). This way you know what you are dealing with when you get someone else's data over the internet. Just like XML has the "standalone" value, I think it would be appropriate either to define an NXML content-type or, at the very minimum, to require that in order to use namespaces, the XML declaration of a document must carry the value namespaces="yes".
I still think they should be totally separated into different content-types, but this is another idea that I had for making things easier for applications to deal with "Namespaces in XML". For example: <?xml version="1.0" namespaces="yes"?> Any ideas on this here? Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Tue Feb 9 21:33:26 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:08:56 2004 Subject: "Namespaces in XML" idea? In-Reply-To: <36C0A264.19CE4793@infinet.com> References: <01BE5415.DE5F3360@grappa.ito.tu-darmstadt.de> <Pine.GHP.4.02A.9902090936260.17997-100000@mail.ilrt.bris.ac.uk> <14016.9017.974839.985439@localhost.localdomain> <36C0A264.19CE4793@infinet.com> Message-ID: <14016.42840.862493.555860@localhost.localdomain> Tyler Baker writes: > I would like to think of "Namespaces in XML" as an extension to XML > 1.0 and not a "change" to XML 1.0. This way we don't have to > pollute the XML 1.0 parser interfaces and application frameworks > with "Namespaces in XML" (something I favor), as well as the fact > that people can use clean parser interfaces and application > frameworks to handle the "Namespaces in XML" flavor of XML. You think correctly -- right now, XML 1.0 is the only approved XML specification, and it does not mention namespaces (except to warn people off using the ':' character). Support for namespaces is not required for XML 1.0 conformance. 
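As a rough illustration of the two views discussed in this thread (the plain XML 1.0 view and the Namespaces view), here is a minimal sketch using Python's standard xml.parsers.expat binding to expat. It is not taken from any message on the list: the sample document, the http://example.org/a URI and the parse() helper are made up for the example, and it assumes a current Python whose bundled expat reports namespace declarations through StartNamespaceDeclHandler.

```python
import xml.parsers.expat

# Hypothetical sample document; the URI is a placeholder, not a real vocabulary.
DOC = '<a:doc xmlns:a="http://example.org/a" id="1"><a:item/></a:doc>'


def parse(namespace_aware):
    """Parse DOC and return (start-element events, namespace declarations)."""
    elements = []
    declarations = []
    if namespace_aware:
        # Passing a separator switches on expat's namespace processing:
        # element names arrive as "URI<separator>local-name".
        parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
        # Declarations are reported through their own callback,
        # not as attributes of the element that carries them.
        parser.StartNamespaceDeclHandler = (
            lambda prefix, uri: declarations.append((prefix, uri)))
    else:
        # Plain XML 1.0 view: xmlns:a is just another attribute.
        parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = (
        lambda name, attrs: elements.append((name, dict(attrs))))
    parser.Parse(DOC, True)
    return elements, declarations


if __name__ == "__main__":
    for aware in (False, True):
        elements, declarations = parse(aware)
        print("namespace processing:", aware)
        print("  elements:    ", elements)
        print("  declarations:", declarations)
```

With namespace processing off, the xmlns:a attribute shows up in the attribute dictionary of a:doc; with it on, the attribute is gone from the dictionary and the (prefix, URI) pair arrives through the separate callback instead. An application that needs both views, for instance to resolve prefixes used inside attribute values, would have to collect the declarations itself.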
Several related specs like RDF and XSL have chosen to use namespaces because they provide good solutions for otherwise-difficult problems, and the DOM will need to support namespaces so that it can support RDF and XSL (among others). Right now, no one is working on XML 1.1 or XML 2.0; when they do, however (some time in the next millenium probably), the rapid technical and market success of Namespaces so far suggests that they belong the next-generation core spec. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jatkins at Bluestone.com Tue Feb 9 22:27:58 1999 From: jatkins at Bluestone.com (Atkins, Jon) Date: Mon Jun 7 17:08:56 2004 Subject: Bluestone XML Product Family Message-ID: <9A4DF69E3C5ED211B86400A0C9D17760150E03@thor.operations.bluestone.com> Greetings: Please find the following information on Bluestone Software's new family of XML products: XwingML (freeware), XML-Server and Visual-XML. Bluestone's XML Product Family provides you with the tools to facilitate application-to-application and business-to-business communication, today. For more information, please visit our Website at http://www.bluestone.com/xml. XwingML, available for FREE, this innovative development platform for merging XML and Java code allows users to create (in common English) XML documents that generate Java Swing classes to create graphical user interfaces. Download a copy from http://www.bluestone.com/xml Bluestone XML-Server is the first generally available Dynamic XMLServer. 
Companies that want to take advantage of business-to-business, and application-to-application communication, will find this product to be a very cost-effective way to expand or replace their EDI implementations. Bluestone Visual-XML, a developer's toolkit (Beta available March, 1999) to help companies build XML-based applications. Users will be able to automatically generate document type definitions (DTDs), as well as XML documents, all through graphical drag and drop programming that generates pure Java and pure XML code. Visual-XML is an integrated development environment tool that allows developers to tie to any database or back-end data and business object. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Wed Feb 10 07:22:32 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:08:56 2004 Subject: What Clean Specs Achieve Message-ID: <87256713.0081242C.00@d53mta03h.boulder.ibm.com> >>> >>>But is anyone here trying to _implement_ Java? Lots of folks here are >>>indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink and XPointer, >>>Namespaces, XSL, etc. It's not like we're only trying to _use_ them, as is >>>the case with Java (or SQL, another example that's been bounced around.) >> >>Most of them seem to be succeeding. What should we conclude? -Tim > >Most people who don't succeed, don't announce. We can't conclude anything. > >Judging from the volume of questions (and controversy) on this and its >sibling lists (XSL-list, xlxp-dev), there's a lot of improvement that could >be made. 
As an above average developer, who just implemented the bulk of his first XML parser (C++) in a binge over the last month, I have to question whether any 'average' developer will ever implement a full featured parser. I found it very non-trivial to write an XML parser that was well decomposed and layered and pluggable, while retaining competitive performance. I found that XML itself was not very conducive to fast processing and reasonably simple architecture. As to the spec... I don't mean to hurt anyone's feelings, but I found the spec during that effort to be as confusing as enlightening. It describes the logical (sometimes illogical :-) design of XML. But it doesn't help so much when it comes to trying to apply that to some physical design. Of course that's not their job, but obviously there have been a good number of parsers written and some obvious issues in implementation could be discussed, to save implementers from doing the same things over and over again and then having to fix them. Of course now its all obvious :-) But I had to really struggle through it the first time. A 4 or 5 page prose document describing the most obviously implementation pitfalls (and possibly some obvious implementation strategies) could have saved me a week probably. Yes the spec is supposed to describe XML, but is its overall goal not to facilite the development of software that implements it? And I suspect that perhaps there are probably parsers out there, where the developers really cannot intellectually prove that they do the right thing. I would be willing to bet that some of them just fix problems until it runs the James Clark tests and digest the Bosak files? When a customer reports a problem, and sends in a sample file, then they look at the spec and try to see if that file seems to correspend to the spec and fix their code to handle if so. 
That is far easier than trying to prove that every method in your code meets the spec (though its obviously not the optimum thing to do.) Am I being too cynical here? Maybe so. But, I just don't think that an 'average' developer could write an XML processor that is complete, expandable, maintainable, and speedy, if all he/she had to work with was the raw XML spec (at least not in a time that would be acceptable in a commercial setting, which is what mostly counts I guess?) I think that it would more likely just be 'proven' to be correct through empirical testing, not through an ability to completely understand all the interactions expressed in the XML spec and implement them cleanly. Also, the interactions that just exist in XML (regardless of how well or badly they are expressed in the spec) means that the skill level required to do something that is *maintainable and expandable* (i.e. well decomposed despite all the interactions) is that higher still. Arguing whether or not someone could manage to read the spec and squeeze something out that (in whatever shape) was a fully compliant parser, isn't very meaningful to me. Oh well, that's my po' two cents worth. I think that yes you need a dry laying out of the facts *and* some guidance at a higher level, related as much to possible implementation issues as interpretation issues. I think that the current spec perhaps is somewhere in between the two and thus somewhat fails to fully please either master? xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Wed Feb 10 07:32:53 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:56 2004 Subject: RDF, Namespaces, and Versioning? References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <36BF35E7.D6FF0E5E@manhattanproject.com> <370CF891.21C0BDE9@prescod.net> Message-ID: <36C0CA35.89F83846@manhattanproject.com> Namespaces are used to name a contract (data interface) between organizations? Is this their practical application? If so, then, the biggest problem I see, is handling versions. >From my experience with any type of data format or exchange, the version of the format (or schema) is _very_ important. Has this been discussed? I didn't see the item addressed in the RDF specification (the only documented application of namespaces thus far). Best, Clark Evans xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From uche.ogbuji at fourthought.com Wed Feb 10 07:36:45 1999 From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com) Date: Mon Jun 7 17:08:57 2004 Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: Your message of "Mon, 08 Feb 1999 22:45:23 EST." 
<000b01be53de$9f4d21f0$d3228018@jabr.ne.mediaone.net> Message-ID: <199902100354.UAA05571@malatesta.local> > > > > I don't think it makes sense to build a business-object model > > on top of DOM, > > > > and then: > > > I have often ended up using DOM because I'm > > working in a > > distributed environment, using CORBA, and it makes it very easy > > and natural to > > just call DOM interfaces across the ORB I hope you're not pointing out the above as a contradiction. If so, context should have made it clear that when I do use DOM, it is never as a basis of the core business-object model, but as a UI or serialization interface. > I too work in a distributed environment, until the last year in DCOM but > increasingly using HTTP. > > In the multi-tier model, the client communicates with the middleware or > business-object tier using an RPC or distributed object protocol. For the > sake of discussion, lets assume that the DCOM wire protocol and CORBA IIOP > have roughly similar functionality as object RPC protocols. > > Good practices in such systems dictate that round-trips between the client > and business object layers be minimized (in fact round trips between tiers > in general should be minimized except that when objects exist in the same > process, or memory space, an object call as efficient as other in-process > calls. (e.g. in C++ this is a vtable call, the difference in Java is the > difference between a normal Java method invocation and an RMI invocation). > > Failure to mininize round trips is often the single biggest performance > drain on such systems otherwise well designed. This is a tenet of > distributed object systems design. > > While I have advocated judicious application of the DOM interfaces in > object systems, the DOM is best not employed as a business object itself > (i.e. 
the client ought not communicate directly with a distributed DOM), for > reasons outlined by Paul Prescod, as well as the fact that this use of the > DOM will *maximize* client-server roundtrips. For example a when a NodeList > is returned from a distributed DOM call, the client obtains a *proxy* to the > NodeList, and iteration through this proxy results in a server roundtrip for > each call. I am aware of DO network latency issues, but for the scale in which I have worked, a few basic optimizations minimize the ORB requests. And your post does make me mind the fact that I should be careful when discussing such matters here. Paul has spoken of needing to manage terabytes of data for his clients, and I don't know what size/structure DBs you usually work with, but I do not have experience deploying solutions invloving more than 20GB or so of data. It is possible that what to me is a minor and unnoticeable dilation would scale to unusability in a large-enterprise application. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Wed Feb 10 07:45:59 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:08:57 2004 Subject: Namespace Registry? And what about Architectures? References: <4EB4281B.662222A5@darmstadt.gmd.de> Message-ID: <36C0C92B.5520CD6F@manhattanproject.com> I just finished reading the RDF specifications on the web. 
It's nice that they are putting the classification scheme that they have come up with in their own namespace. This is good, and un-assuming. I approve. However, now I have two concerns:

a) I feel that the RDF specification should show how architectures would be used. This to me is very critical. I don't see RDF being used directly; I only see it as a documentation mechanism that I can include in my DTDs. Does this make sense? In this way, if someone wants to define a transformation from my documents to theirs, they have two choices: they can define the transformation directly from mine, or they can define a transformation in terms of an RDF syntax, and thus transitively handle my documents. Correct?

b) And, much less importantly, to me a namespace refers to a global way to say who has defined a name. I know the current namespace mechanism creates an "alias" for a URI, via the xmlns:alias="... URI ...". I would think that it might be prettier to use full-fledged-reverse-url-like-thingys similar to Java. Thus, the "rdf" alias becomes "org.w3.rdf". I reason that with a DTD enabled with 10+ architecture cross-references, it would quickly become a nightmare figuring out who is "rdf". Is there a way to specify a DTD-wide alias?

<!NAMESPACE org.w3.rdf "http://w3.org/TR/1999/PR-rdf-syntax-19990105#" >
<!NAMESPACE com.manhattanproject.pdb "http://manhattanproject.om/xml/pdb-rdf#" >

Thus, you can then do fun stuff like

<!ELEMENT x ... >
<!ATTLIST x
    org.w3.rdf : type #FIXED "object"
    com.manhattanproject.pdb.1 : person #FIXED "true" >

Anyway, I'm just playing... but when you want to name 50+ equivalences for your object (which may be common practice if you have about 100 suppliers, each with their own product definitions), I can't see how the plain old aliases will scale all that wonderfully. Of course, you could probably do something like this anyway; the namespace spec doesn't prevent a "." in an alias, does it?

Best,
Clark

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From kent at trl.ibm.co.jp Wed Feb 10 07:49:27 1999 From: kent at trl.ibm.co.jp (TAMURA Kent) Date: Mon Jun 7 17:08:57 2004 Subject: ANN: IBM XML Parser for Java (XML4J) v2.0.0 Message-ID: <199902100506.OAA39934@ns.trl.ibm.com> IBM XML4J v2.0.0 has been released. http://www.alphaworks.ibm.com/formula/xml XML4J v2 adds these exciting new features: o Configurable, Modular Architecture o High Performance o Revalidation o XCatalog Support # Send comments and questions about XML4J to xml4j@us.ibm.com, # not to kent@trl.ibm.co.jp. -- TAMURA, Kent @ Tokyo Research Laboratory, IBM Japan xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Wed Feb 10 08:15:28 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:08:57 2004 Subject: What Clean Specs Achieve References: <87256713.0081242C.00@d53mta03h.boulder.ibm.com> Message-ID: <36C13FD4.3FA53908@infinet.com> roddey@us.ibm.com wrote: > >>> > >>>But is anyone here trying to _implement_ Java? Lots of folks here are > >>>indeed trying to _implement_ XML 1.0 (parsers and SAX), XLink and > XPointer, > >>>Namespaces, XSL, etc. It's not like we're only trying to _use_ them, as > is > >>>the case with Java (or SQL, another example that's been bounced around.) 
> >> > >>Most of them seem to be succeeding. What should we conclude? -Tim > > > >Most people who don't succeed, don't announce. We can't conclude > anything. > > > >Judging from the volume of questions (and controversy) on this and its > >sibling lists (XSL-list, xlxp-dev), there's a lot of improvement that > could > >be made. > > As an above average developer, who just implemented the bulk of his first > XML parser (C++) in a binge over the last month, I have to question whether > any 'average' developer will ever implement a full featured parser. I found > it very non-trivial to write an XML parser that was well decomposed and > layered and pluggable, while retaining competitive performance. I found > that XML itself was not very conducive to fast processing and reasonably > simple architecture. Very true. Just to get something working that handled the data I was using took about two weeks of my time. One week reading the spec and asking questions and the other week writing the code. This was back in January of 1997 when there were no XML tutorials around (the spec was not even a recommendation then). > As to the spec... I don't mean to hurt anyone's feelings, but I found the > spec during that effort to be as confusing as enlightening. It describes > the logical (sometimes illogical :-) design of XML. But it doesn't help so > much when it comes to trying to apply that to some physical design. Of > course that's not their job, but obviously there have been a good number of > parsers written and some obvious issues in implementation could be > discussed, to save implementers from doing the same things over and over > again and then having to fix them. Of course now its all obvious :-) But I > had to really struggle through it the first time. A 4 or 5 page prose > document describing the most obviously implementation pitfalls (and > possibly some obvious implementation strategies) could have saved me a week > probably. 
Yes the spec is supposed to describe XML, but is its overall goal > not to facilite the development of software that implements it? I doubt that is the goal, but many people are hesitant to disclose their parsing "secrets" (-:. I think of XML parsers as pure commodities that you cannot make a penny off of unless you have some higher level tools built on top of a good parser framework. I have found that at least in Java, a lot of the things I learned while tuning performance were things which helped me out in a lot of areas of programming that have nothing to do with XML. I think Mr. Clark likes to refer to his generous works as reference implementations, however, XP is not something to easily learn from as it is very low-level and not very straightforward in terms of interfaces (not trying to disrespect Mr. Clark here as the XML parser I wrote may be fast but the code is practically unmaintainable as my extreme efforts at quality performance severely compromised good software engineering principles that I usually try and follow in my work). I think this can be said of just about all of the XML parser out there, they are all spaghetti except for perhaps Aelfred. > And I suspect that perhaps there are probably parsers out there, where the > developers really cannot intellectually prove that they do the right thing. > I would be willing to bet that some of them just fix problems until it runs > the James Clark tests and digest the Bosak files? When a customer reports a That is what I did for a long time. Debugging through the entire Clark test suite took a week or more and I still don't pass much more than 90% of the ones that test for not-well formed documents, but I suspect Mr. Clark spent a lot longer than a week doing the test suite (-: > problem, and sends in a sample file, then they look at the spec and try to > see if that file seems to correspend to the spec and fix their code to > handle if so. 
That is far easier than trying to prove that every method in > your code meets the spec (though its obviously not the optimum thing to > do.) Yah, generally if you control how your data is created, you can whip up a decent parser to meet your needs. Also, if you don't check for a lot of the obscure errors that may pop up you can save yourself a ton of time in processing overhead. Unfortunately, in my case the XML parser will be used in an end-user product where users may edit files manually (and screw things up in the process). But if you just want to have some basic XML capabilities for your organization and don't want to deal with using other people's codebases, XML is not too much of a beast (understanding the spec takes longer than writing the code at first). > Am I being too cynical here? Maybe so. But, I just don't think that an > 'average' developer could write an XML processor that is complete, > expandable, maintainable, and speedy, if all he/she had to work with was > the raw XML spec (at least not in a time that would be acceptable in a > commercial setting, which is what mostly counts I guess?) I think that it > would more likely just be 'proven' to be correct through empirical testing, > not through an ability to completely understand all the interactions > expressed in the XML spec and implement them cleanly. Very true. I fell into this trap when people on this list were talking about how an average university CS student could whip one up in a week. At first I said "geese this is easy" but when I started caring about performance and being able to detect some of the very obscure errors to be 100% compliant with the draft, I found myself going insane on doing a lot more work with XML than I originally intended. 
Then this XML stuff balloned into a bunch of XML related work for several clients with these tools and now I am here discussing XML with everyone else when all I intended at first was to just have basic XML support in the core application I was working on. > Also, the interactions that just exist in XML (regardless of how well or > badly they are expressed in the spec) means that the skill level required > to do something that is *maintainable and expandable* (i.e. well decomposed > despite all the interactions) is that higher still. Arguing whether or not > someone could manage to read the spec and squeeze something out that (in > whatever shape) was a fully compliant parser, isn't very meaningful to me. I could not agree more. > Oh well, that's my po' two cents worth. I think that yes you need a dry > laying out of the facts *and* some guidance at a higher level, related as > much to possible implementation issues as interpretation issues. I think > that the current spec perhaps is somewhere in between the two and thus > somewhat fails to fully please either master? You can thank the many people here who have provided open-source parsers to work from (I was never able to actually get mine out in open-source form as I originally intended for various business reasons), though I myself decided to waste a lot of time coming up with an XML architecture that works very differently from the event-based or tree-based parsers out there as it is more of a data-driven model than anything else (oh I forgot to mention Lark from Tim Bray which uses a DFA model that is unique to the current crop of XML parsers). I would say Aelfred is the best "reference" implementation out there if you could call it that and anyone who just wants to whip up a decent event-based XML parser should take a look at his source as it is pretty clean and straightofrward. Tyler xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From clark.evans at manhattanproject.com Wed Feb 10 08:33:05 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:57 2004
Subject: Fractal XML Index Notation
References: <Pine.LNX.3.96.990201200007.10159C-100000@eccnet.eccnet.com>
Message-ID: <36C0D2C9.8B55203B@manhattanproject.com>

Abstract:

By fixing the content of an XML file, a position-based index mechanism can be added to XML files, allowing fractal parsing.

Introduction:

In a thin-client/server environment, especially one implemented in an interpreted language like Java, it is important to minimise client-side processing by doing server-side pre-processing.

For example, suppose that an on-line shopping web site has a thin-client ordering Java applet. It could download quickly and start accepting customer information and other input. Simultaneously, it could be downloading a 250K+ file (or files) containing the package and product list, authorized shipping agents, tax calculation tables, etc. Advanced versions of the applet would "cache" a copy of the catalog locally, and only download deltas.
Several pre-processing items could occur, the most obvious being a translation of the normalized schema:

  PRODUCT_CATEGORY (CATEGORY_ID, CATEGORY_NAME)
  BUNDLE_OF_PRODUCTS (BUNDLE_ID, BUNDLE_NAME, BUNDLE_PRICE)
  VENDOR (VENDOR_ID, VENDOR_NAME)
  BUNDLE-PRODUCT (BUNDLE_ID, PRODUCT_ID)
  PRODUCT (PRODUCT_ID, PRODUCT_NAME, CATEGORY_ID, INVIDUAL_SALE_FLAG, PRICE_IF_SOLD_INDIVIDUALLY)
  PRODUCT-VENDOR (PRODUCT_ID, VENDOR_ID)
  BUNDLE-VENDOR (BUNDLE_ID, VENDOR_ID)

into a hierarchical drill-down that better meets the particular needs of the order-entry client:

  <catalog>
    <product-category>
      <product-bundle>
        <product>
          <vendor>
      <individual-product>
        <vendor>

In this example, several joins are interwoven into a single hierarchical "snapshot" to support the drill-down requirements in the order-entry client. Notice that product-bundles, products, and vendors *will* be duplicated with this scheme; this de-normalization is exactly what is required, since it makes the processing on the client simpler. Here XML complements the relational database by providing a de-normalized stream of data instead of a normalized repository.

For another example, suppose a roaming salesperson receives an update every morning in his e-mail with new products, discontinued products, changes in pricing, packaging, etc. Then, during the day, the salesperson goes "door-to-door" selling the products and taking orders. The orders are collected on his/her hard drive until the evening, when they are uploaded to the server for approval. I see XML as a great move forward in a standard transport layer for this form of communication. Each order could be a simple e-mail message, leveraging existing POP3/SMTP standards. The messages would be queued during the day, and sent after the salesperson is connected to the network. In a similar way, the updates to the product list could be sent via e-mail (xml-mail anyone?) as well.
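The server-side join described above can be sketched roughly as follows. This is only an illustration, not the code from the system described in this message: the sample rows, the Python variable names, and the simplified subset of tables (categories, individually sold products, vendors) are all assumptions made for the example, and only the standard-library xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample rows standing in for the normalized tables.
# The values are illustrative and do not come from the original message.
CATEGORIES = [(1, "Household")]                      # (CATEGORY_ID, CATEGORY_NAME)
PRODUCTS = [                                         # (PRODUCT_ID, NAME, CATEGORY_ID, PRICE)
    (10, "Hammer", 1, "13.95"),
    (11, "Driver, 1/4 inch", 1, "6.95"),
]
VENDORS = {100: "Acme Tools", 101: "Bolt Bros"}      # VENDOR_ID -> VENDOR_NAME
PRODUCT_VENDOR = [(10, 100), (10, 101), (11, 100)]   # (PRODUCT_ID, VENDOR_ID)


def build_catalog():
    """Join the tables once on the server and return a nested snapshot."""
    catalog = ET.Element("catalog")
    for category_id, category_name in CATEGORIES:
        category = ET.SubElement(catalog, "product-category", name=category_name)
        for product_id, product_name, product_category, price in PRODUCTS:
            if product_category != category_id:
                continue
            product = ET.SubElement(
                category, "individual-product", name=product_name, price=price
            )
            # Vendors are repeated under every product they supply; this
            # duplication is the de-normalization the text describes.
            for pv_product, pv_vendor in PRODUCT_VENDOR:
                if pv_product == product_id:
                    ET.SubElement(product, "vendor", name=VENDORS[pv_vendor])
    return catalog


xml_text = ET.tostring(build_catalog(), encoding="unicode")
```

The client then only walks the tree top-down and never performs a join itself, at the cost of a larger file in which shared vendors appear more than once.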
THUS, we have moved the join from the client to the server, but now we have *increased* the parsing requirements of the client. Also, with a _large_ catalog file (3+MB?), it is unreasonable to expect the result of the parsing to be a collection of objects held in memory.

THEREFORE, some form of storage/retrieval is necessary on the client. This can be in a local database, but that just increases the footprint and processing. Instead of making a client-side database and re-normalizing the information, I suggest that indexing the XML file may be a better alternative. A way to do this is to "fix" the XML file's binary representation, and build a physical index detailing the "exact" location of an element within the file.

Requirements for such an index:

a) It should be embeddable inside XML, and should follow XML if possible (perhaps it is a notation?)

b) It should allow indexing on arbitrary element attributes.

c) It should be created so that a change in one part of the file has minimal impact on the rest of the XML file. Thus, although a change to a child may require a re-adjustment of information about its parent, it shouldn't require re-adjustment of information about each sibling.

d) It should take advantage of the "hierarchy" built into the XML file, since the thin-client usage will directly correspond to the "hierarchy".

e) It should support typed entities and attributes ("Architectures"), so that different attribute names of sub-types can be indexed together.

f) Indexing an element based upon its child elements may not be required. If an index like this is needed, perhaps a re-write adds an attribute with the computed value and then this is indexed instead.

g) Working with linking is purely optional, and may not be important to support.

<opinion> If you are using linking with transaction-oriented documents, you should be using a relational database instead.
I see XML as bringing back the hierarchical database to *complement* relational technology, not to *replace* it.</opinion>

================================================

What I propose is a "fractal" index inter-woven into the XML data. First, here is the file to be indexed:

<catalog date="03-FEB-1999" company="Acme Tools" >
  <product-category name="Household" type="Domestic">
    <individual-product name="Hammer" price="13.95"/>
    <individual-product name="Driver, 1/4 inch" price="6.95"/>
    <individual-product name="Driver, 1/8 inch" price="7.95"/>
    <individual-product name="Allen-Wrench Set" price="11.55"/>
    <product-bundle name="Household-Starter" price = "23.99" >
      <bundled-product name="Hammer"/>
      <bundled-product name="Driver, 1/4 inch"/>
      <bundled-product name="Driver, 1/8 inch"/>
      ...
    </product-bundle>
    ...
  </product-category>
  <product-category type="Commercial" name="Light-Industry" >
    <individual-product name="Hammer" price="13.95"/>
    <individual-product name="Versa Screw(tm)" price="66.95"/>
    ...
  </product-category>
  ...
</catalog>

Here is the "indexed" example. I use line numbers for the demonstration since it is easier to show in e-mail form; however, I would see it being done by position instead. I also use <!-- to comment stuff. -->

0001 <!-- other-information-before-the-catalog -->
...
0009 <catalog date="03-FEB-1999" company="Acme Tools" >
0010   <product-category name="Household" type="Domestic">
0011     <individual-product name="Hammer" price="13.95"/>
0012     <individual-product name="Driver, 1/4 inch" price="6.95"/>
0013     <individual-product name="Driver, 1/8 inch" price="7.95"/>
0014     <individual-product name="Allen-Wrench Set" price="1.55"/>
0015     <product-bundle name="Household-Starter" price = "23.99" >
0016       <bundled-product name="Hammer"/>
0017       <bundled-product name="Driver, 1/4 inch"/>
0018       <bundled-product name="Driver, 1/8 inch"/>
...
0033     </product-bundle>
...
0533     <index                <!-- an index for "Household" category -->
0534       name="Price"        <!-- the listing is ascending by price -->
0535       index-start=525     <!-- (535-10), relative beginning of index -->
0536       delimiter="|"       <!-- Hmm, possibly for readability -->
0536       position-width=4    <!-- Length for each position, lpad="0" -->
0537       length=100          <!-- Length of index -->
0538     >
0539       <index-column name="name" width=30 align="left" rpad=" ">
0540         <index-element element="individual-product" attribute="price" />
0541         <index-element element="product-bundle" attribute="price" />
0542       </index-column>
0543 0004|Allen-Wrench Set              |   <!-- First item... -->
...
05?? 0005|Household-Starter             |
...
05?? 0008|Allen-Wrench Set              |
...
0632     </index>
0633     <index
0634       name="Price"        <!-- the index is ascending by price -->
0635       index-start=625     <!-- (635-10), relative beginning of index -->
0636       delimiter="|"
0636       position-width=4
0637       length=100
0638     >
0639       <index-column name="price" width=5 align="right" lpad="0">
0640         <index-element element="individual-product" attribute="price" />
0641         <index-element element="product-bundle" attribute="price" />
0642       </index-column>
0643 0433|01.23   <!-- Cheapest item... -->
...
06?? 0002|06.95   <!-- Refers to line 10+2=12 -->
...
06?? 0005|23.99   <!-- Refers to line 10+5=15 -->
...
0732     </index>
....
????   </product-category>
????   <product-category type="Commercial" name="Light-Industry" >
????     <individual-product name="Hammer" price="13.95"/>
????     <individual-product name="Versa Screw(tm)" price="66.95"/>
...
????     <index name="Price" ....
????     <index name="" ....
????   </product-category>
....
????   I'm getting tired here, but now you would have the next level of
       indexes, here indexing the product categories instead of the
       products in the product category.
???? </catalog>

-------------------------------------------

Anyway, I wrote this up about 2 months ago, and I implemented something *very* similar to it successfully in a client/server order entry system about 6 years ago using flat files and a proprietary file format. I did this because the client parsing times were getting to be large and the frequent updates were starting to cause problems.

Anyway, the speedup obtained by a multi-attribute index was huge and well worth the added complexity. Since the index is fractal at the element level, a change in the file means re-computing the indexes for all parents transitively. Re-computation of the sibling indexes is avoided! As an added advantage, carriage returns were used so that "upgrades" of the files could be distributed and merged with "diff".

Just thinking out loud here....

Clark Evans

From Michael.Kay at icl.com Wed Feb 10 10:31:57 1999
From: Michael.Kay at icl.com (Michael.Kay@icl.com)
Date: Mon Jun 7 17:08:57 2004
Subject: "Namespaces in XML" idea?
Message-ID: <93CB64052F94D211BC5D0010A80013310EB2E3@WWMESS3>

> A pragmatic reason for passing on the xmlns:xyz information is that some
> applications might *want* to go beyond what 'Namespaces in XML' itself
> provides and use similarly abbreviated values _inside_
> data attributes...

And some applications, such as an XSL Stylesheet processor, might *need* to do so.
(I haven't yet worked out how namespace prefixes inside XSL patterns
actually work, but I'm sure the stylesheet processor needs to be aware
of them); and the same is true of any other application that refers to
element names within its data.

Mike Kay

From hpyle at agora.co.uk Wed Feb 10 11:18:43 1999
From: hpyle at agora.co.uk (hpyle@agora.co.uk)
Date: Mon Jun 7 17:08:57 2004
Subject: Roll-Your-Own Parsers (was: Re: What Clean Specs Achieve)
Message-ID: <80256714.003E056F.00@mailhost.agora.co.uk>

Tyler Baker <tyler@infinet.com> wrote,
> > (roddey@us.ibm.com)
> > see if that file seems to correspond to the spec and fix their code to
> > handle if so. That is far easier than trying to prove that every method in
> > your code meets the spec (though it's obviously not the optimum thing to
> > do.)
> Yah, generally if you control how your data is created, you can whip up a decent parser to
> meet your needs. Also, if you don't check for a lot of the obscure errors that may pop up you
> can save yourself a ton of time in processing overhead. ...
> But if you just want to have some basic XML capabilities for your
> organization and don't want to deal with using other people's codebases, XML is not too much
> of a beast (understanding the spec takes longer than writing the code at first).
>
> > Am I being too cynical here? Maybe so. But, I just don't think that an
> > 'average' developer could write an XML processor that is complete,
> > expandable, maintainable, and speedy, if all he/she had to work with was
> > the raw XML spec...
> Very true.
> I fell into this trap when people on this list were talking about how an average
> university CS student could whip one up in a week. At first I said "geez, this is easy" but
> when I started caring about performance and being able to detect some of the very obscure
> errors to be 100% compliant with the draft, I found myself going insane on doing a lot more

Totally agree. There will always be a tradeoff between code size,
performance and conformance to the spec. We have taken the same
approach: for XML which might go outside our environment or come in
from outside, we use a heavyweight parser with full validation. But
where it's "behind the covers" we use a homegrown (tiny, nonconformant)
parser and just check the structures a few times during design, with a
validating parser.

> You can thank the many people here who have provided open-source parsers to work from

Ditto.

-Hugh
hpyle@agora.co.uk

From david at megginson.com Wed Feb 10 11:59:13 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:57 2004
Subject: RDF, Namespaces, and Versioning?
In-Reply-To: <36C0CA35.89F83846@manhattanproject.com>
References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <36BF35E7.D6FF0E5E@manhattanproject.com> <370CF891.21C0BDE9@prescod.net> <36C0CA35.89F83846@manhattanproject.com>
Message-ID: <14017.29209.142742.434795@localhost.localdomain>

Clark Evans writes:

> Namespaces are used to name a contract (data interface) between
> organizations? Is this their practical application?

Namespaces create globally-unique names.
Once you have globally-unique names, you can hang your own baggage on
those names.

I'd agree that two implications grow out of the use of URIs for
namespaces:

1. [ownership] the namespace URI will be defined by the URI owner (or
   with the owner's permission); and

2. [uniqueness] the URI owner will ensure that the same unique name
   (URI + local part) is not used for two contradictory purposes.

In other words, I cannot define the namespace URI
"http://www.microsoft.com/ns/", and I cannot use
"{http://www.megginson.com/ns}result" in two different specs for two
completely different purposes.

> If so, then, the biggest problem I see, is handling versions.
> From my experience with any type of data format or exchange,
> the version of the format (or schema) is _very_ important.

As far as Namespaces is concerned, the namespace URI is a black box --
it doesn't point to anything, and it doesn't mean anything. The
creator, however, is free to add internal structure:

  http://www.megginson.com/ns/business/1999-01-29/
  http://www.megginson.com/ns/business/1999-02-09/

etc.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From Matthew.Sergeant at eml.ericsson.se Wed Feb 10 12:19:21 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:08:57 2004
Subject: XML Parser with DOM in C
Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136B15@eukbant101.ericsson.se>

There's always XML::DOM - a perl module, that you could use as one option
should you not find any others.
There are multiple options for calling perl code from C.

There are probably other scripting, or even Java, options that you
could call (e.g. you could use COM). It's probably not the interface
you are looking for though. I don't expect DOM to map onto C very well
though - I can't quite picture how you would do it, but then my C
experience hasn't been brushed up on for a while.

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

> -----Original Message-----
> From: David E. Cleary [SMTP:davec@progress.com]
> Sent: Monday, February 08, 1999 9:26 PM
> To: xml-dev@ic.ac.uk
> Subject: XML Parser with DOM in C
>
> I'm looking for a C based XML parser with the DOM API to license. Doesn't
> have to be free or open source. Or is anybody working on putting the DOM
> on
> top of expat besides Mozilla?
>
> David Cleary

From rja at arpsolutions.demon.co.uk Wed Feb 10 12:24:45 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun 7 17:08:57 2004
Subject: XML Parser with DOM in C
Message-ID: <000601be54f0$63cfce80$c5010180@p197>

Are you looking for a specific platform?

We've got an XML C++ parser with DOM/SAX support that should be ready
for beta in a week or so. (We've had long delays, I know.)

It's currently built using Visual C++ versions 5 and 6, but it should
work on other platforms as it just uses C++ and the STL.

Regards,

Richard.

***********************************************
* E-Mail mailto:RJA@arpsolutions.demon.co.uk  *
* WEB http://www.arpsolutions.demon.co.uk     *
***********************************************

-----Original Message-----
From: Matthew Sergeant (EML) <Matthew.Sergeant@eml.ericsson.se>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: 10 February 1999 12:20
Subject: RE: XML Parser with DOM in C

>There's always XML::DOM - a perl module, that you could use as one option
>should you not find any others. There are multiple options for calling perl
>code from C.
>
>There's probably other scripting, or even Java options that you could call
>(e.g. you could use COM). It's probably not the interface you are looking
>for though. I don't expect DOM to map onto C very well though - I can't
>quite picture how you would do it, but then my C experience hasn't been
>brushed up on for a while.
>
>Matt.
>--
>http://come.to/fastnet
>Perl on Win32, PerlScript, ASP, Database, XML
>GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
>!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++
>
>> -----Original Message-----
>> From: David E. Cleary [SMTP:davec@progress.com]
>> Sent: Monday, February 08, 1999 9:26 PM
>> To: xml-dev@ic.ac.uk
>> Subject: XML Parser with DOM in C
>>
>> I'm looking for a C based XML parser with the DOM API to license. Doesn't
>> have to be free or open source. Or is anybody working on putting the DOM
>> on
>> top of expat besides Mozilla?
>>
>> David Cleary

From clark.evans at manhattanproject.com Wed Feb 10 16:11:31 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:08:57 2004
Subject: RDF, Namespaces, and Versioning?
References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <36BF35E7.D6FF0E5E@manhattanproject.com> <370CF891.21C0BDE9@prescod.net> <36C0CA35.89F83846@manhattanproject.com> <14017.29209.142742.434795@localhost.localdomain>
Message-ID: <36C1AE8C.270AAB95@manhattanproject.com>

David Megginson wrote:
>
> As far as Namespaces is concerned, the namespace URI is a black box --
> it doesn't point to anything, and it doesn't mean anything. The
> creator, however, is free to add internal structure:
>
> http://www.megginson.com/ns/business/1999-01-29/
> http://www.megginson.com/ns/business/1999-02-09/
>

First, thank you for your reply. It was helpful. However, I do know
the spec doesn't say _how_ to use namespaces.

When you design something, it helps to have several "use cases" that
describe _how_ the software/standard/etc. will/could be used to solve
a real world problem. This is what I'm after. How does one do
versioning?

Let's say that one of your clients, BigTools, was using your
architecture http://www.megginson.com/ns/business/1999-01-29/ to
describe their products.

Now let's say that I'm the author of a simple tool vendor search
engine. Suppose this engine only needs information defined by a small
subset of your architecture. To make things interesting, let's say I
do this for 500 vendors....

Now.
One year later, you make a few "backward" compatible changes to the
architecture, and convert all of your clients, including BigTools, to
this new version.

Shouldn't there be a way to mark the namespace so that it is knowable
(from the declaration?) that

  http://www.megginson.com/ns/business/1999-01-29/

is a proper subset of the newer:

  http://www.megginson.com/ns/business/1999-02-09/

_or_ a way to know that they are not backward compatible?

Hmm. I suppose that your new architecture could define, in its
definition, the mapping to the old architecture.

Clark

From ricko at allette.com.au Wed Feb 10 16:50:24 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:08:58 2004
Subject: XML Parser with DOM in C
Message-ID: <005b01be5515$3c339440$26f96d8c@NT.JELLIFFE.COM.AU>

From: David E. Cleary <davec@progress.com>

>I'm looking for a C based XML parser with the DOM API to license. Doesn't
>have to be free or open source. Or is anybody working on putting the DOM on
>top of expat besides Mozilla?

I have been working on something like this. Please contact me for
information. This is *not* a product announcement.

DOM is a very simple and generic interface system, aimed at access to
already-built trees (e.g., GROVES). It is a little contradictory that
DOM uses IDL, and hence is useful for networked access to tree
objects, since that is what XML was for too! I guess XML is for
loosely-coupled systems...

The kinds of decisions you need to make with implementing DOM in C
include:

 * is it part of a distributed object system (e.g. CORBA) (mine
   isn't);
 * do you really want to use wide chars rather than UTF-8 (for C,
   there are many good reasons to keep with UTF-8 as much as
   possible);
 * what is your implementation of an object reference (if there is no
   networking, it can be a pointer);
 * what is your implementation of the underlying data structures
   (which implies an idea of the probable uses of your system); this
   is in fact the biggest part;
 * do you really want to pass the environment back (if you are not in
   multi-threaded or CORBA, why not just pass errors back in a global,
   instead of wasting a function argument?)

Why would anyone be interested in a C non-CORBA DOM? Well, the reasons
I would give are:

 * to have consistent (but highly generic) access to structures
   regardless of the DTD, and regardless of the markup notation used
   (XML, CGM); there are generic operations (e.g., asking the type of
   an element node) for which a generic interface is well suited;
 * to be able to upgrade to a distributed object implementation in the
   future; if you are like me, maybe you don't care or need to learn
   CORBA at the moment, but you would like your program to use the
   basic DOM operations;
 * you have to use C and you do not need CORBA.

Rick Jelliffe

From jtauber at jtauber.com Wed Feb 10 17:25:38 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:08:58 2004
Subject: RDF, Namespaces, and Versioning?
Message-ID: <00ca01be551a$72239dc0$0300000a@othniel.cygnus.uwa.edu.au>

>When you design something, it helps to have several
>"use cases" that describe _how_ the software/standard/etc.
>will/could be used to solve a real world problem. This
>is what I'm after. How does one do versioning?

Don't try to make namespaces do too much. Namespaces are for universal
names. Anything else, whether it be validating mixed vocabulary
documents or versioning schemata, is going to have to rely on a
mechanism other than namespaces (although namespaces might be part of
the solution).

XSL is an excellent example of namespaces in action. XSL has its own
template vocabulary and this vocabulary gets mixed with whatever
vocabulary is used for the result tree. To avoid possible name clashes
between names in the template vocabulary and those in whatever
vocabulary is used for the result tree, you can use namespaces.

James

From Paul.V.Biron at kp.ORG Wed Feb 10 19:04:57 1999
From: Paul.V.Biron at kp.ORG (Biron,Paul V)
Date: Mon Jun 7 17:08:58 2004
Subject: Word and XML (was: XML standards coherency and so forth)
Message-ID: <0F7119373E69D2118F6F00805FE68639BEAD67@gren-exch-rcvy.kpscal.org>

> From: "Rick Jelliffe" <ricko@allette.com.au>
> Date: Sun, 24 Jan 1999 16:15:36 +1100
> Subject: Re: Word and XML (was: XML standards coherency and so forth)
>
> From: Biron,Paul V <Paul.V.Biron@kp.ORG>
>
> >Word 97 also produced several well-formedness violations when doing
> anything
> >more than simple nested lists.
>
> Dave Raggett's program "tidy" is excellent for fixing up badly formed
> HTML and making it valid (it figures out which HTML DTD the document is
> valid according to, and generates the appropriate DOCTYPE for it).
> It also is great for converting to HTML-in-XML (e.g. our website
> www.ascc.net/xml/ uses it).
>
> The program is available at
> http://www.w3.org/People/Raggett/tidy/
>
> I think website developers should consider making tidy a standard part
> of website maintenance. Each HTML editing program can do strange things
> to markup; using tidy on the maintenance fileset and then updating the
> website fileset is a good way to keep a WF site, without forcing you to
> give up non-WF tools.
>
> Rick
>

Wow! I've been so busy lately that I haven't been able to keep up with
XML-DEV and had no idea my "innocent" post on Word and HTML/XML had
been so long lived!

On this matter, tidy was one of the first "fix-it" approaches we
tried. Unfortunately, tidy doesn't happen to fix this particular
problem. Tidy does many, many VERY important things! Fixing this
problem is not one of them.

The HTML produced by Word '97 from my example is:

  <P>This is <B>a test <I>of the</B> emergency</I> broadcast system</P>

The output produced by tidy (22jan99 version) is:

  <P>This is <B>a test <I>of the</I> emergency</B> broadcast system</P>

While this is "well-formed" HTML (it does not contain improper
nesting), it is NOT the output that is wanted. The problem is that in
the original, the BOLD stops after "the" (where it should stop); in
the tidy version it continues until after "emergency". The output that
Word should have produced originally is:

  <P>This is <B>a test <I>of the</I></B> <I>emergency</I> broadcast system</P>

That is, the fix is to insert a </I> when the </B> is seen and then to
reopen <I> after the </B>. Tidy just replaces the </B> with </I> and
then replaces the original </I> with </B>.

The only tool I've found so far that fixes this problem correctly is
FrontPage v1.1 (about 4 years old, funny they had it working back
then:-).

In truth, we've spent a great deal of time writing tools (a big daisy
chain of FrontPage v1.1 -> hand-rolled perl script 1 -> hand-rolled
perl script 2 -> etc.)
just to fix the HTML output from Word '97. What has made this all the
more frustrating for us is that the HTML is not really what we want in
the end. We just want a "clean" HTML version so that the
transformation to the XML DTD that we're interested in is "easier".
The BOLD and ITALIC that our authors see actually represent more
"semantic" XML elements, e.g., <allergy> and <medication>. Such is
life.

Paul V. Biron
SGML Business Analyst
Kaiser Permanente, So Cal.

From david at megginson.com Wed Feb 10 19:12:34 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:58 2004
Subject: Architectural Forms and Namespaces (Was: Re: SAX, Java, and Namespaces )
In-Reply-To: <36BB2E4F.6524A575@manhattanproject.com>
References: <8025670F.002EF9AD.00@mailhost.agora.co.uk> <36BAF974.6F87E679@prescod.net> <14010.65394.42870.480866@localhost.localdomain> <36BB2E4F.6524A575@manhattanproject.com>
Message-ID: <14017.55602.169287.972307@localhost.localdomain>

Clark Evans writes:

> > Of course, I know that I could do all of this with
> > architectural forms as well.
>
> You can do _all_ of it with both? I had pictured
> a combination punch to solve the problem. I see
> namespaces and architectural forms as yet another
> complementary system within XML.

Architectural forms can do everything that namespaces do, but not in
such a web-friendly way. Namespaces can handle only a small subset of
what architectural forms can do.

> I don't see one or the other used. I see them
> being used in combination. What am I missing?
The AF equivalent of a qualified name is the URL or public ID of the
meta-DTD file combined with the architectural form. It hasn't really
taken off, though, and I agree that self-identifying AFs (i.e.
qualified names) show some promise.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From James.Anderson at mecomnet.de Wed Feb 10 19:21:30 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:08:58 2004
Subject: what is "completely different" [Re: RDF, Namespaces, and Versioning?]
References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <36BF35E7.D6FF0E5E@manhattanproject.com> <370CF891.21C0BDE9@prescod.net> <36C0CA35.89F83846@manhattanproject.com> <14017.29209.142742.434795@localhost.localdomain>
Message-ID: <36C1DD6C.7F4E7648@mecomnet.de>

What is the consensus (?!) here on how one should interpret
"completely different" in the passage below?

David Megginson wrote:
>
> In other words, ... I cannot use
> "{http://www.megginson.com/ns}result" in two different specs for two
> completely different purposes.
>

I understand that the intended reference was to declarations in two
different documents. Although not noted in the original note, the case
of conflicting use within a single document is clear without
discussion. Which means that "two different specs" could well be a
resource with the same URI which is included into two distinct
documents.

If the name is an element name, would all definition instances be
required to have identical content models? Or maybe "equivalent"
models?
(For example, under equivalence classes ANY / EMPTY / element content
/ pc content / mixed content.) Equivalent models for all constituents
which are in the same namespace as the element name? Equivalent models
for all constituents which appear in namespaces declared in some
"canonical" DTD?

If the name is an attribute name, would all definitions be required to
be identical? All definitions for elements with identical names? Or
maybe all definitions must specify the same attribute type? Or class
of attribute type (whereby all enumerated types are "equivalent")?

The question is, how much is an application permitted to cache? If it
has seen all the names before, does it need to fetch the "definitions"
again, or can it reuse the ones it has? Is the answer to this question
different for attribute and element declarations than it is for
entities? Maybe there is a distinction to be made between structural
and non-structural entities?

My experience with SGML is limited to reading, but it indicates that
such expectations would be disappointed in that world. Will the
presence of universal names perhaps permit different expectations for
distributed compound XML documents?

From david at megginson.com Wed Feb 10 19:35:42 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:58 2004
Subject: what is "completely different" [Re: RDF, Namespaces, and Versioning?]
In-Reply-To: <36C1DD6C.7F4E7648@mecomnet.de>
References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <36BF35E7.D6FF0E5E@manhattanproject.com> <370CF891.21C0BDE9@prescod.net> <36C0CA35.89F83846@manhattanproject.com> <14017.29209.142742.434795@localhost.localdomain> <36C1DD6C.7F4E7648@mecomnet.de>
Message-ID: <14017.56498.749101.633745@localhost.localdomain>

james anderson writes:

> What is the consensus (?!) here on how one should interpret
> "completely different" in the passage below?
>
> David Megginson wrote:
> >
> > In other words, ... I cannot use
> > "{http://www.megginson.com/ns}result" in two different specs for two
> > completely different purposes.

By 'use', I actually meant 'define', but James's question is still
worth tackling.

James asks about specific points like content models, attribute types,
etc. -- what is the proper use of a name defined in a namespace? I'd
suggest that that's all dependent on the degree of specification. As
an example, consider the specification for elements named
{http://www.megginson.com/ns/foobar/}place. Here's one possible
example:

  An element named {http://www.megginson.com/ns/foobar/}place shall
  always contain the name of a geographical location.
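
To make the notion of a universal name concrete, here is a minimal
sketch using Python's standard xml.etree.ElementTree, which happens to
write expanded names in the same {URI}local form. The sample documents
and their prefixes are made up for illustration; the only point is
that the prefix chosen in a document is irrelevant once the name is
expanded.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the definition above; the sample documents
# below are hypothetical and exist only for this illustration.
NS = "http://www.megginson.com/ns/foobar/"

samples = [
    '<megg:place xmlns:megg="%s">Ottawa</megg:place>' % NS,
    '<p:place xmlns:p="%s">Ottawa, Ontario, Canada</p:place>' % NS,
    '<place xmlns="%s">The city of Ottawa</place>' % NS,
]


def universal_names(documents):
    """Return the expanded {URI}local name of each document's root element."""
    return [ET.fromstring(doc).tag for doc in documents]


names = universal_names(samples)
# Three different prefixes (including none) expand to one universal name.
assert len(set(names)) == 1
```

Comparing expanded names rather than prefixed ones is what lets two
independently written documents agree on which element they mean.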
Since I've used the verb "contain", I think that I cannot put the name
in an attribute value (since attributes are not contained in an
element):

  [WRONG] <megg:place name="Ottawa"/>

However, I've said nothing about what the name should look like or how
it should be structured, so all of the following should be conformant:

  [OK] <megg:place>Ottawa</megg:place>

  [OK] <megg:place>Ottawa, Ontario, Canada</megg:place>

  [OK] <megg:place>The city of Ottawa</megg:place>

  [OK] <megg:place>
         <city>Ottawa</city>
         <province>Ontario</province>
         <country>Canada</country>
       </megg:place>

  [OK] <megg:place>
         Ottawa<br/>
         Region of Ottawa-Carleton<br/>
         Ontario<br/>
         Canada
       </megg:place>

  [OK] <megg:place>
         <item>Ottawa</item>
         <item>Ontario</item>
         <item>Canada</item>
       </megg:place>

If I want something more specific, I have to give it in the definition
(perhaps by supplying BNF or a content model, or even by specifying
the allowed structure of the character data).

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From simonstl at simonstl.com Wed Feb 10 20:51:42 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:08:58 2004
Subject: Roll-Your-Own Parsers (was: Re: What Clean Specs Achieve)
In-Reply-To: <80256714.003E056F.00@mailhost.agora.co.uk>
Message-ID: <199902102051.PAA20880@hesketh.net>

Hugh wrote:
>There will always be a tradeoff between code size,
>performance and conformance to the spec.
>We have taken the same approach:
>for XML which might go outside our environment or come in from outside, we
>use a heavyweight parser with full validation. But where it's "behind the
>covers" we use a homegrown (tiny, nonconformant) parser and just check the
>structures a few times during design, with a validating parser.

If we could work with parser layers rather than parsers, this might
become a lot easier to manage. We could just turn on the parts we need
and turn off the ones we don't.

I'm hoping to build an open and extensible parser based on the
approach I'm outlining in that "Layered Model for XML Processing"
document (http://www.simonstl.com/articles/layering/layered.htm) over
the summer. Open source, open architecture, open model. It'll be
event-based for now, maybe with a tree-builder at the end.

Tilting at windmills, a glorious hobby.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From rick at activated.com Wed Feb 10 21:59:05 1999
From: rick at activated.com (Rick Ross)
Date: Mon Jun 7 17:08:58 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <36C2011B.D188E44B@activated.com>

Dear XML/XSL Community,

I need your ear, and I sincerely do not want to cause trouble. We are
in a conundrum at Activated Intelligence because we earnestly wish to
support open standards well in our XML products, but the XSL working
draft specification requires namespace support that apparently cannot
be implemented effectively if the primary input source is a
dynamically built DOM tree.
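
For readers who want to see where the cost comes from, here is a
minimal sketch, assuming a DOM tree that exposes only qualified names
and ordinary attributes, so that a prefix has to be resolved by walking
up the ancestors and looking for xmlns declarations on every lookup.
Python's xml.dom.minidom is used purely as a stand-in tree; the helper
names resolve_prefix and universal_name are hypothetical, and this is
not a description of how LotusXSL, XT, or any other processor named in
this thread works.

```python
from xml.dom import Node, minidom


def resolve_prefix(element, prefix):
    """Walk up from element until a declaration for prefix is found.

    Returns the namespace URI, or None when nothing is in scope.
    An empty prefix looks for the default namespace (plain xmlns).
    """
    attr = "xmlns:" + prefix if prefix else "xmlns"
    node = element
    while node is not None and node.nodeType == Node.ELEMENT_NODE:
        if node.hasAttribute(attr):
            return node.getAttribute(attr)
        node = node.parentNode  # one step per ancestor, on every lookup
    return None


def universal_name(element):
    """Split the qualified tag name and resolve its prefix."""
    prefix, _, local = element.tagName.rpartition(":")
    return resolve_prefix(element, prefix), local


doc = minidom.parseString(
    '<a xmlns:p="urn:one"><p:b><p:c xmlns:p="urn:two"/></p:b></a>'
)
b = doc.getElementsByTagName("p:b")[0]
c = doc.getElementsByTagName("p:c")[0]
print(universal_name(b))
print(universal_name(c))
```

Each lookup costs time proportional to the depth of the node, which is
one reason a processor might want the tree to carry resolved names
instead of recomputing them.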
It is impossible to overstate the business value and importance of DOM trees as a form of input. We have spent many long hours and many, many dollars evaluating the business needs for XML/XSL - and our best insights suggest that DOM tree representations will be the most natural and frequently occurring form of in-memory representation of XML data.

Our XSL processor is designed to work with any XML parser that implements DOM and SAX support - a fantastic benefit of reliance on open standards. Unfortunately, if the DOM api is not rich enough to support namespaces in XML effectively, then DOM becomes a second-rate interface for XML/XSL application solutions.

There is no other object representation of XML data that even approximates a standard as DOM does - and it is completely natural for programs to project data that originates from databases and other sources through a DOM tree. More importantly, if business logic must repeatedly access and manipulate data sets that change only in minor ways over time, then it makes great sense to act directly upon the in-memory DOM tree rather than reparsing XML input over and over again.

LotusXSL is a rich and full implementation of the XSL spec, but it appears to pay a dear price in performance for supporting both namespaces and DOM - using a brute force solution that will not suffice for those who need to perform hundreds of thousands or millions of XSL processing operations per day. And what else could they have done? Perhaps gone the route of XT? Doesn't that mean there is then no DOM support at all - an even more extreme price - one which closes the door on supporting any standard object representation of input data in XML parsers?

What application can afford to completely reparse input data every single time XSL processing is desired? Application business logic will be faster and more effective if it is reliably able to operate on a DOM tree representation of the data it is manipulating.
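
As a rough illustration of what a "dynamically built DOM tree" might look like in practice, here is a minimal sketch in Python using the standard library's xml.dom.minidom. The element names (orders, order, customer), the row layout, and the helper names rows_to_dom and set_customer are all hypothetical and appear nowhere in this thread. The only point is that the tree is built and later modified in memory, with no XML text ever emitted or reparsed.

```python
from xml.dom import minidom


def rows_to_dom(rows):
    """Build a DOM tree directly from row dictionaries (hypothetical layout)."""
    impl = minidom.getDOMImplementation()
    doc = impl.createDocument(None, "orders", None)
    root = doc.documentElement
    for row in rows:
        order = doc.createElement("order")
        order.setAttribute("id", str(row["id"]))
        customer = doc.createElement("customer")
        customer.appendChild(doc.createTextNode(row["customer"]))
        order.appendChild(customer)
        root.appendChild(order)
    return doc


def set_customer(doc, order_id, name):
    """Apply a small change to the live tree instead of reparsing anything."""
    for order in doc.getElementsByTagName("order"):
        if order.getAttribute("id") == str(order_id):
            order.getElementsByTagName("customer")[0].firstChild.data = name
            return True
    return False


doc = rows_to_dom([{"id": 1, "customer": "Acme"},
                   {"id": 2, "customer": "Initech"}])
changed = set_customer(doc, 2, "Globex")
```

A processor handed this doc object sees the updated customer name without any serialization step in between, which is the usage pattern the message argues for.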
If not DOM, then we NEED some other standard for this type of representation - one which supports namespaces in XML the way they are handled and specified in XSL. Or else the way namespaces are handled in XSL needs to change...

It is not reasonable to discard the powerful value of DOM tree representations of input data because the namespaces in XML recommendation forces an either/or choice between DOM and namespaces.

This is not an academic problem, or even a computing science problem - it is straight business - and the people who have drafted the spec have made an oversight that simply demands correction. We need a change that permits namespaces and DOM to co-exist in XSL without requiring a huge performance penalty.

I hope others will share their comments, and that the XSL working group will pay attention to whatever discussion ensues. Simply dismissing DOM is not an acceptable option - since there is no alternative. Saddling XSL with a huge performance hit due to unnecessarily required reparsing is also unacceptable. Surely there must be effective middle ground? I hope our dialog can reveal it.

Rick Ross
Activated Intelligence, LLC

From tbray at textuality.com Wed Feb 10 22:06:45 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:58 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <3.0.32.19990210140521.00bde160@pop.intergate.bc.ca>

At 04:58 PM 2/10/99 -0500, Rick Ross wrote:
>We are in a conundrum at Activated Intelligence because we earnestly wish to
>support open standards well in our XML products, but the XSL working draft
>specification requires namespace support that apparently cannot be
>implemented effectively if the primary input source is a dynamically built
>DOM tree.
>It is impossible to overstate the business value and importance of
>DOM trees as a form of input.
....
>Surely there must be effective middle ground? I hope our dialog can reveal
>it.

The problem is that the DOM doesn't have namespace support yet. They couldn't possibly have had it - the DOM went to recommendation before namespaces were done. The DOM people know this is a problem and will have namespace support in their level 2. It would be nice if this were here today, but it isn't.

-Tim

From david at megginson.com Wed Feb 10 22:13:54 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:58 2004
Subject: The Peace Process: DOM and namespaces...
In-Reply-To: <36C2011B.D188E44B@activated.com>
References: <36C2011B.D188E44B@activated.com>
Message-ID: <14018.952.766576.431128@localhost.localdomain>

Rick Ross writes:

 > Our XSL processor is designed to work with any XML parser that
 > implements DOM and SAX support - a fantastic benefit of reliance on
 > open standards. Unfortunately, if the DOM api is not rich enough
 > to support namespaces in XML effectively, then DOM becomes a
 > second-rate interface for XML/XSL application solutions.

Although people would like more elegant solutions, there is no reason not to do namespace processing on the document before offering it through a DOM interface -- the names will not be XML names, but everything should still work.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From andrewl at microsoft.com Wed Feb 10 22:26:34 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:08:58 2004
Subject: what is "completely different" [Re: RDF, Namespaces, and Versioning?]
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAEF7D@RED-MSG-08>

I concur with David Megginson's mail of Feb. 10. Namespaces serve to globally disambiguate names. They do not define the syntax or semantics of the named things: that is the business of various kinds of defining documents, including schemas.

From rick at activated.com Wed Feb 10 22:33:02 1999
From: rick at activated.com (Rick Ross)
Date: Mon Jun 7 17:08:58 2004
Subject: The Peace Process: DOM and namespaces...
References: <3.0.32.19990210140521.00bde160@pop.intergate.bc.ca>
Message-ID: <36C20910.332BF6FE@activated.com>

Tim,

This is definitely the problem, but the solution might then lie in not defining an intrinsic mismatch between the first official XSL spec and the existing XML 1.0 and DOM Level 1?

As was mentioned on the list the other day, XML 1.0 doesn't support namespaces either - and it is abundantly clear that there is huge controversy within the community about the namespaces in XML recommendation.
Perhaps it is simply too early to require namespace support in XSL at all? Perhaps, instead, the initial XSL spec should deliberately be designed to mesh well with the existing and prevailing implementations of XML and DOM.

DOM Level 2 is a Pandora's box, and to support it will require a massive amount of effort by comparison to DOM Level 1. Why saddle XSL with this burden now? It probably just isn't the right time. If it is a good idea, then it will remain a good idea when these other key specs have matured enough to support it. When XML and DOM have advanced it will be much more appropriate to revisit this namespaces in XSL issue. To discard DOM now is bad for the emerging technology and bad for the business applications that will be the drivers of its economic viability.

I really hope this is not contentious - I want to see the best result emerge - as I believe the majority of list readers do, whatever that result may be.

Regards,

Rick

Tim Bray wrote:
>
> ....
> >Surely there must be effective middle ground? I hope our dialog can reveal
> >it.
>
> The problem is that the DOM doesn't have namespace support yet. They
> couldn't possibly have had it - the DOM went to recommendation before
> namespaces were done. The DOM people know this is a problem and will
> have namespace support in their level 2. It would be nice if this
> were here today, but it isn't.
>
> -Tim

From rick at activated.com Wed Feb 10 22:45:17 1999
From: rick at activated.com (Rick Ross)
Date: Mon Jun 7 17:08:58 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain>
Message-ID: <36C20BEC.FEDBA22D@activated.com>

Doesn't that presume that a "document" exists?
My principal problem is that we will very often deal with DOM tree representations that ARE the document (for all practical purposes). The business applications we envision will not be spending time committing data to XML files or streams and then reparsing it - rather they will operate most frequently on dynamically constructed DOM trees, produced programmatically from databases and other sources.

Object models are critical for high-end performance. There should not be a requirement for the data to get emitted as XML, just so it can be reparsed - the XML-implied DOM representation would suffice beautifully if not for this namespace problem.

Regards,

Rick

David Megginson wrote:
>
> Rick Ross writes:
>
> > Our XSL processor is designed to work with any XML parser that
> > implements DOM and SAX support - a fantastic benefit of reliance on
> > open standards. Unfortunately, if the DOM api is not rich enough
> > to support namespaces in XML effectively, then DOM becomes a
> > second-rate interface for XML/XSL application solutions.
>
> Although people would like more elegant solutions, there is no reason
> not to do namespace processing on the document before offering it
> through a DOM interface -- the names will not be XML names, but
> everything should still work.
>
> All the best,
>
> David
>
> --
> David Megginson david@megginson.com
> http://www.megginson.com/

From tyler at infinet.com Wed Feb 10 22:50:25 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:58 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain>
Message-ID: <36C20CD1.9A206D5D@infinet.com>

David Megginson wrote:

> Rick Ross writes:
>
> > Our XSL processor is designed to work with any XML parser that
> > implements DOM and SAX support - a fantastic benefit of reliance on
> > open standards. Unfortunately, if the DOM api is not rich enough
> > to support namespaces in XML effectively, then DOM becomes a
> > second-rate interface for XML/XSL application solutions.
>
> Although people would like more elegant solutions, there is no reason
> not to do namespace processing on the document before offering it
> through a DOM interface -- the names will not be XML names, but
> everything should still work.

Then the document is illegal. Namespaces can essentially be any set of characters you want. When you replace the prefix with a namespace, you are creating an illegal XML Name as you already stated. Should the DOM reflect a legal XML document or should the DOM allow anything you want to serve as element and attribute names.

Last but not least, in XSL if you use this approach you suggested for expanding QNames, then within ElementExpressions and AttributeExpressions, you will in essence run into problems with parsing these expressions as characters like '@' and '/' and '|' are significant and will screw up the entire PathExpression you are parsing.

Tyler

From david at megginson.com Wed Feb 10 23:00:30 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
In-Reply-To: <36C20CD1.9A206D5D@infinet.com>
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20CD1.9A206D5D@infinet.com>
Message-ID: <14018.3364.167089.666427@localhost.localdomain>

Tyler Baker writes:

 > Then the document is illegal.

How? The DOM view of the document does not affect the document itself.

 > Namespaces can essentially be any set of characters you want. When
 > you replace the prefix with a namespace, you are creating an
 > illegal XML Name as you already stated. Should the DOM reflect a
 > legal XML document or should the DOM allow anything you want to
 > serve as element and attribute names.

The physical representation of an XML document (as defined by XML 1.0) is not allowed to have characters like '/' and '@' in element and attribute name, but the DOM is not a physical representation; it is an API providing access to one view of a document's information set, and as such, it is not governed by the Name production in XML 1.0.

There is currently *no* complete specification governing an XML document's information set: it would be quite conformant (though silly) for the DOM to swap uppercase and lower-case in names, to precede every name with "go away or I will taunt you a second time:", to randomly rename elements to "Bob", or just about anything else.
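
To make the "view" idea concrete, here is a minimal sketch in Python (standard library xml.dom.minidom) of a preprocessing pass that reports expanded names in place of prefixed ones. It is an illustration under stated assumptions, not anything DOM Level 1 or this thread prescribes: the {URI}local spelling, the function names expand and expanded_view, and the example.org URIs are all hypothetical. The resulting names contain '/' and ':' characters, so they are not XML Names, yet code consuming the view can still compare them reliably.

```python
from xml.dom import minidom

XML_NS = "http://www.w3.org/XML/1998/namespace"  # the reserved "xml" prefix


def expand(name, bindings, use_default):
    """Turn 'prefix:local' into '{URI}local' (an illustrative spelling only)."""
    prefix, sep, local = name.partition(":")
    if sep:
        # Prefixed name: look up the URI bound to the prefix in scope.
        return "{%s}%s" % (bindings[prefix], local)
    if use_default and bindings.get(""):
        # Unprefixed element name inside a default namespace.
        return "{%s}%s" % (bindings[""], name)
    # Unprefixed attribute, or no default namespace declared.
    return name


def expanded_view(element, inherited=None):
    """Return (name, attributes, child views) with every name expanded."""
    bindings = dict(inherited or {"xml": XML_NS})
    ordinary = []
    attrs = element.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        if attr.name == "xmlns":
            bindings[""] = attr.value             # default namespace
        elif attr.name.startswith("xmlns:"):
            bindings[attr.name[6:]] = attr.value  # prefix declaration
        else:
            ordinary.append(attr)
    name = expand(element.tagName, bindings, use_default=True)
    attributes = {expand(a.name, bindings, False): a.value for a in ordinary}
    children = [expanded_view(child, bindings)
                for child in element.childNodes
                if child.nodeType == child.ELEMENT_NODE]
    return name, attributes, children


doc = minidom.parseString(
    '<megg:place xmlns:megg="http://example.org/megg" name="Ottawa">'
    '<city xmlns="http://example.org/geo">Ottawa</city></megg:place>')
view = expanded_view(doc.documentElement)
```

Because prefix bindings are scoped by containment, the sketch copies the inherited bindings for each element before applying that element's own declarations. Writing such a view back out as XML would need a separate step that reassigns prefixes, which is the concern raised elsewhere in the thread.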
The XML 1.0 spec does not even require processors to report element names, so in terms of conformance, anything goes kids.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Wed Feb 10 23:03:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
In-Reply-To: <36C20BEC.FEDBA22D@activated.com>
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20BEC.FEDBA22D@activated.com>
Message-ID: <14018.3916.726579.503762@localhost.localdomain>

Rick Ross writes:

[about preprocessing namespaces for the DOM]

 > Doesn't that presume that a "document" exists?

Preprocess your information, whatever its source.

 > There should not be a requirement for the data to get emitted as
 > XML, just so it can be reparsed - the XML-implied DOM
 > representation would suffice beautifully if not for this namespace
 > problem.

I cannot understand why you would have to do this.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From tbray at textuality.com Wed Feb 10 23:04:56 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <3.0.32.19990210150228.00be03c0@pop.intergate.bc.ca>

At 05:32 PM 2/10/99 -0500, Rick Ross wrote:

>This is definitely the problem, but the solution might then lie in not
>defining an intrinsic mismatch between the first official XSL spec and the
>existing XML 1.0 and DOM Level 1?

The problem won't go away. The W3C specs are built in parallel process by a bunch of different but overlapping groups of people. A bit of thought shows that if everything was put on hold until everything else was finished, you get dead-lock, there are circular dependencies. As a consequence, things lurch forward without (so far) getting too far out of sync. There will be a stable DOM spec that does namespaces some number of months after namespaces was frozen. This gap is unfortunate but not IMHO avoidable.

Now that I think about it, there is a good chance that there will be a stable DOM level 2 *before* there's a stable XSL 1.0! It's hardly consistent to complain that unfinished-standard-A can't be used because of the unfinished state of standard-B, on which it depends.

>As was mentioned on the list the other day, XML 1.0 doesn't support
>namespaces either - and it is abundantly clear that there is huge
>controversy within the community about the namespaces in XML recommendation.

I don't buy that. There is tons of controversy on this mailing list.
All the leading implementers - I repeat, *all* the leading implementers - in the world of XML have either already completed their namespace implementation or will very soon. (Hint: it's not hard.)

>Perhaps it is simply too early to require namespace support in XSL at all?
>Perhaps, instead, the initial XSL spec should deliberately be designed to
>mesh well with the existing and prevailing implementations of XML and DOM.

That's a suggestion that is perfectly sane on the face of it; why don't you make it formally to the XSL committee?

>DOM Level 2 is a Pandora's box

Really? I didn't know that.

-Tim

From keshlam at us.ibm.com Wed Feb 10 23:31:52 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:08:59 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
Message-ID: <85256714.008116FD.00@D51MTA03.pok.ibm.com>

<Delurk>

I think folks are reading much more into the DOM than they should. Can we step back from religion to programming practice for a moment?

The DOM is an API for random access to structured documents. It is only an API. It may be wrapped around any back-end storage representations you consider appropriate. If you have a random-access model of your document, putting a DOM interface on it gives folks a standard way of accessing it.

Note that the DOM is defined in terms of interfaces rather than classes; there doesn't have to be a 1:1 mapping between the two, as long as when folks ask for an Element (for example), it behaves like an Element.
The fact that it also behaves like a Document, and/or a Swing TreeNode, and/or whatever other behavior your implementation cares to add to it, doesn't matter to the DOM.

There are off-the-shelf data models that implement the DOM; my own (which XML4J is moving toward) is one instance thereof. These are offered as a convenience, just as the default models behind Sun's MVC-based Swing widgets are offered as a convenience. If you don't already have a data model with a DOM API, and one of these suits your needs, you can plug it in and run.

If an off-the-shelf DOM _doesn't_ do what you need, it may allow you to subclass and extend it. Or you may want to plug in someone else's. Or your own. Using DOM as a standard API around the model gives you the freedom to swap that component without changing your other code. (That theory is hung up on some places where the DOM Level 1 spec is incomplete, but Level 2 should close the gaps.)

If your application doesn't have a random-access model of the document, the DOM isn't relevant. You _can_ use it, but you can also use SAX or other solutions. Pick the approach that suits your needs.

A good parser should be able to yield both DOM and SAX, equally smoothly. IBM's XML4J is moving in that direction, though early versions were very DOM-centric.

As others have said: DOM performance depends on what kind of model the DOM API is wrapped around. There will not be any single "best" DOM implementation, since different applications have different needs. Some DOMs will specialize in performance, perhaps tuned for particular tasks. Others will specialize in minimal codesize (perhaps for fast download in an applet), or minimal storage use for the document model (for handling large documents in constrained machines). Still others will be wrapped around existing models (databases and so on), provided for compatibility with DOM-based application code even if that isn't the best possible way to access this particular model.
You really can't make any statements about DOM performance without saying precisely which implementation you're talking about... and as with other software components, you'll pick the one that suits the task you want to solve.

The Document Object Model is just a tool -- as is XML, for that matter. Decisions to use or not use it should be made precisely the same way decisions to use or not use XML are made. If it fits your problem, using it gives you a place to plug in other off-the-shelf solutions. If it doesn't, use something else.

There's nothing wrong with SAX (though it too needs another turn of the evolutionary crank, in my opinion), but SAX is a stream rather than a model. The two really aren't in competition with each other any more than sed is in competition with vi -- they're each good in their own target domain, and there are even times when using one to generate the other is the right answer.

Reality is fractal. Absolutes are almost always false.

</Delurk>

______________________________________
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.

From tyler at infinet.com Wed Feb 10 23:39:57 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20CD1.9A206D5D@infinet.com> <14018.3364.167089.666427@localhost.localdomain>
Message-ID: <36C2187E.455D0329@infinet.com>

David Megginson wrote:

> Tyler Baker writes:
>
> > Then the document is illegal.
>
> How? The DOM view of the document does not affect the document itself.

The DOM has an unstated implication that it reflects a valid XML document. If you make a call to getNodeName() on an Element node, it is expected to return a valid XML name.

> > Namespaces can essentially be any set of characters you want. When
> > you replace the prefix with a namespace, you are creating an
> > illegal XML Name as you already stated. Should the DOM reflect a
> > legal XML document or should the DOM allow anything you want to
> > serve as element and attribute names.
>
> The physical representation of an XML document (as defined by XML 1.0)
> is not allowed to have characters like '/' and '@' in element and
> attribute name, but the DOM is not a physical representation; it is an
> API providing access to one view of a document's information set, and
> as such, it is not governed by the Name production in XML 1.0.

This is one way of looking at it. But this is not clear and there is no mechanism defined to tell an application whether the DOM is using these illegal names or not. If you write the DOM Document back out to XML, you are writing out illegal names because you don't know if you are writing out prefixes + local part or namespace + local part.

> There is currently *no* complete specification governing an XML
> document's information set: it would be quite conformant (though
> silly) for the DOM to swap uppercase and lower-case in names, to
> precede every name with "go away or I will taunt you a second time:",
> to randomly rename elements to "Bob", or just about anything else.
>
> The XML 1.0 spec does not even require processors to report element
> names, so in terms of conformance, anything goes kids.

How is anyone supposed to reliably build any sort of architecture on XML if everything is this ambiguous?

Tyler

From tyler at infinet.com Wed Feb 10 23:47:43 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20BEC.FEDBA22D@activated.com> <14018.3916.726579.503762@localhost.localdomain>
Message-ID: <36C21A27.69968FDC@infinet.com>

David Megginson wrote:

> Rick Ross writes:
>
> [about preprocessing namespaces for the DOM]
>
> > Doesn't that presume that a "document" exists?
>
> Preprocess your information, whatever its source.

From an entire database. Pass over the entire document tree and preprocess everything before actually presenting it to the application. This is not practical. In my limited experience on these matters I have seen this tried before and with horrendous results. Nevertheless, it does not take a computer scientist to see the real world problem with this approach.

> > There should not be a requirement for the data to get emitted as
> > XML, just so it can be reparsed - the XML-implied DOM
> > representation would suffice beautifully if not for this namespace
> > problem.
>
> I cannot understand why you would have to do this.
There is a big difference between "having to do something" and doing something because it makes perfect sense from a real world perspective. Preprocessing the entire source tree just to handle namespaces hardly makes a case for supporting namespaces at all in a product.

You can have the most elaborate features in the world for an application but if they don't work in the real-world problem domains of real people, then they are pretty useless no matter what their original intentions were. "Namespaces in XML" and the hacked up manner in which you need to deal with them in XML is a prime example of a specification not realistically trying to solve the problems of its target audience.

Tyler

From tyler at infinet.com Wed Feb 10 23:55:01 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
References: <3.0.32.19990210150228.00be03c0@pop.intergate.bc.ca>
Message-ID: <36C21BF4.8F81DC9D@infinet.com>

Tim Bray wrote:

> At 05:32 PM 2/10/99 -0500, Rick Ross wrote:
>
> >This is definitely the problem, but the solution might then lie in not
> >defining an intrinsic mismatch between the first official XSL spec and the
> >existing XML 1.0 and DOM Level 1?
>
> The problem won't go away. The W3C specs are built in parallel
> process by a bunch of different but overlapping groups of people. A
> bit of thought shows that if everything was put on hold until
> everything else was finished, you get dead-lock, there are
> circular dependencies.
> As a consequence, things lurch forward
> without (so far) getting too far out of sync. There will be a stable
> DOM spec that does namespaces some number of months after namespaces
> was frozen. This gap is unfortunate but not IMHO avoidable.
>
> Now that I think about it, there is a good chance that there will
> be a stable DOM level 2 *before* there's a stable XSL 1.0! It's
> hardly consistent to complain that unfinished-standard-A can't
> be used because of the unfinished state of standard-B, on which
> it depends.
>
> >As was mentioned on the list the other day, XML 1.0 doesn't support
> >namespaces either - and it is abundantly clear that there is huge
> >controversy within the community about the namespaces in XML recommendation.
>
> I don't buy that. There is tons of controversy on this mailing list.
> All the leading implementers - I repeat, *all* the leading implementers -
> in the world of XML have either already completed their namespace
> implementation or will very soon. (Hint: it's not hard.)

It is not hard to support namespaces in an XML parser, but it is far harder to deal with them at the application level. Just because all the tools support namespaces does not mean that anyone is using them outside of a few niche applications. As another list member recently said, "where is the content"?

> >Perhaps it is simply too early to require namespace support in XSL at all?
> >Perhaps, instead, the initial XSL spec should deliberately be designed to
> >mesh well with the existing and prevailing implementations of XML and DOM.
>
> That's a suggestion that is perfectly sane on the face of it; why
> don't you make it formally to the XSL committee?
>
> >DOM Level 2 is a Pandora's box
>
> Really? I didn't know that.

I think he means that it is huge and still a working draft. I have been told that a lot of the stuff that makes it huge is optional features, so in that sense it may indeed be a "Pandora's Box".
OpenDoc largely failed because it was so huge that no one could justify the development dollars needed to support it. When you create specs that are huge and take tons of time to implement, you are indirectly costing a lot of people a lot of money. After all, someone has to foot the bill for commercial software, unless you expect XML to thrive solely on freeware in the future.

Tyler

From Marc.McDonald at Design-Intelligence.com Thu Feb 11 00:06:47 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun 7 17:08:59 2004
Subject: Possible enhancements to Namespaces to allow valiadation
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990211000549Z-3862@master.design-intelligence.com>

I've been following this latest conversation on namespaces and validation. My understanding of namespaces is as follows:

1. An element can define a namespace prefix, via the xmlns:prefix="URI" attribute, which can be used to distinguish element and attribute names. The prefixes are containment-scoped.
2. An element can define a containment-scoped default namespace using the xmlns="URI" attribute.
3. A URI is just a globally unique identifier for the namespace. It need not correspond to any DTD.
4. An element or attribute name without a prefix is interpreted as belonging to the default namespace, or to no namespace if a default has not been declared.
5. An element or attribute name can have a prefix (prefix:localname), which indicates its namespace.
6. The prefixed form, a qualified name, is just a name to an XML parser (':' is a legal name character).
7. Qualified names are identical if they refer to the same URI and localname.
8. The method of processing namespaces is not defined by the recommendation.

A document using namespaces can be conformance parsed using any existing XML parser, since any qualified names would appear to be normal names and attribute declarations aren't validated, so there are no problems with any xmlns[:prefix] attributes.

However, validity parsing implies a number of changes:

1. Qualified names must be converted into unique names involving the URIs associated with the namespaces. Different prefixes can refer to the same URI.
2. Any DTDs containing elements referred to by a qualified name must be modified to use qualified names themselves.
3. A DTD must be constructed for the document; this can be difficult.

A solution discussed in the thread involves creating a new document and associated DTDs with these changes made and then doing normal XML validation. The DTD against which the document is validated can be easy to construct if it simply matches the current document, in which case the only elements being validated are the elements from namespaces. On the other hand, construction can be quite complex when trying to synthesize the more general combination of namespaces from a single example document.

I would argue that any process that involves the modification of existing documents and DTDs is flawed. As has been mentioned in another thread, modification of 50 megabyte documents and megabyte DTDs can involve significant space and CPU time.

I would suggest the following changes to the namespace recommendation to address some of these problems:

1. Allow a DTD to declare the namespaces and prefixes it uses.
2. Allow a DTD to declare its default namespace (itself). This removes the need to prefix elements and attributes in the DTD.
3. Allow a namespace declaration to optionally include a URL for a DTD, just as documents do.
4. Enhance XML validation. When an element is not in the current namespace, look at the namespace declaration. If it does not have an associated DTD, only do well-formed parsing of the element. If it does have an associated DTD, validate the element against the associated DTD with the new namespace as the default. This would nest on each use of a different namespace.

Note: Change 2 is actually optional in view of change 3, since the latter would set the default namespace to the namespace of the DTD. It would be used to declare the standard URI and prefix for a namespace in its associated DTD, and part of validity checking could be to match URIs.

I would punt the problem of constructing the DTD for the document. Only a human would be able to decide which elements from which namespaces can be where. If elements can be anywhere, the value of validation is vastly reduced. The only elements validated would be those of different namespaces. This is akin to using ANY in every element declaration: so much is made legal that little is invalid.

Marc McDonald
Principal Software Scientist
Design Intelligence Inc.
www.design-intelligence.com

From donpark at quake.net Thu Feb 11 00:14:17 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <00c101be5553$43373290$2ee044c6@arcot-main>

Rick,

I am in complete agreement with you regarding the use of DOM documents as input to XSL processors.
However, I think we should revisit what the conflicts are specifically. Could you point out what aspect of the XSL or the namespace spec forces DOM tree to be rebuilt over and over?

As far as expanded DOM names being illegal XML names, one possible option is to establish a business practice requiring all URI used as namespace name to not contain any illegal XML name characters. After all, their meanings are undefined and they are only required to be universally unique.

BTW, I have removed the XSL mailing list from this thread because this issue affects more than XSL.

Best,

Don Park
Docuverse

From rschoening at unforgettable.com Thu Feb 11 00:20:58 1999
From: rschoening at unforgettable.com (Rob Schoening)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
In-Reply-To: <36C21BF4.8F81DC9D@infinet.com>
References: <36C21BF4.8F81DC9D@infinet.com>
Message-ID: <0003438b96feca7f_mailit@mail.iname.com>

>Tim Bray wrote:
>It is not hard to support namespaces in an XML parser, but it is far harder
>to deal with them at the application level. Just because all the tools
>support namespaces, does not mean that anyone is using them outside of a
>few niche applications. Like another list member recently said "where is
>the content"?

Exactly. The real problem here is that it is difficult to implement these emerging specs in a consistent manner. The consequence is that the application-level apps developed in the meantime are going to jerry-rig support for namespaces.
By the time there is parity between DOM and the parsers, application developers are going to have to back up and retool. In certain cases (I'm thinking of Microsoft in particular) that may never happen. Thus we end up with a set of badly fractured de facto standards.

This is what I meant when I suggested that XML risks becoming just another file format. If the application-level integration does not evolve with the parser-level integration, the overall utility of the XML universe will be greatly diminished. We'll have an oh-so-elegant file format with non-standard tools to use it in applications. The irony is that that is the hallmark of any proprietary file format.

The only way I can see to avoid this is to put some careful thought into staging the lifecycle of the XML collective.

Rob

From andrewl at microsoft.com Thu Feb 11 00:23:15 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAEF84@RED-MSG-08>

Don Park suggests that we could "establish a business practice requiring all URI used as namespace name to not contain any illegal XML name characters."

That rules out "/", thereby limiting namespaces to second-tier domain names and similar unsegmented names. That goes against the desire for unlimited scalability and the goal to allow _anyone_ to create namespace names as they need to.
From donpark at quake.net Thu Feb 11 01:00:47 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <00f001be5559$c00c8990$2ee044c6@arcot-main>

Andrew,

>Don Park suggests that we could "establish a business practice requiring all
>URI used as namespace name to not contain any illegal XML name characters."
>That rules out "/", thereby limiting namespaces to second-tier domain names
>and similar unsegmented names. That goes against the desire for unlimited
>scalability and the goal to allow _anyone_ to create namespace names as they
>need to.

I guess what I am proposing is the creation of a new URI scheme for XML Namespace that is more friendly to namespace-ignorant applications. The URI production rules are defined as:

    uri ::= scheme : path [ ? search ]

We can use 'xmlns' as the URI scheme name, outlaw the optional search segment, and then use Java package naming scheme for the path. Here is an example:

    <dh:foo xmlns:dh="xmlns:com.docuverse.dom.html." dh:bar="foobar"/>

which expands to:

    <xmlns:com.docuverse.dom.html.foo xmlns:com.docuverse.dom.html.bar="foobar"/>

I don't think this proposal goes against the desire and goals you mentioned. Please remember that I am just shooting from the hip right now and have not yet formed any opinion about whether this solution will be good or not.

Comments?

Don Park
Docuverse
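The expansion in Don Park's example can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not code from the thread: the class name QNameExpander and the method expand are hypothetical, and the prefix bindings are passed in as a plain map rather than read from a document. The namespace name is simply concatenated with the local part, which is why the proposal requires namespace names made only of legal XML name characters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, named for illustration only.
public class QNameExpander {
    /**
     * Expands "prefix:local" by replacing the prefix with the namespace name
     * bound to it. Unprefixed names, and names whose prefix has no binding,
     * are returned unchanged.
     */
    static String expand(String qname, Map<String, String> bindings) {
        int colon = qname.indexOf(':');
        if (colon < 0) {
            return qname;
        }
        String namespaceName = bindings.get(qname.substring(0, colon));
        return namespaceName == null ? qname : namespaceName + qname.substring(colon + 1);
    }

    public static void main(String[] args) {
        Map<String, String> bindings = new HashMap<>();
        // Binding taken from the example in the message above.
        bindings.put("dh", "xmlns:com.docuverse.dom.html.");
        System.out.println(expand("dh:foo", bindings)); // xmlns:com.docuverse.dom.html.foo
        System.out.println(expand("dh:bar", bindings)); // xmlns:com.docuverse.dom.html.bar
    }
}
```

Because the expanded form is still a legal XML 1.0 name, a namespace-ignorant parser can treat it as an ordinary element or attribute name.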
From b.laforge at jxml.com Thu Feb 11 04:58:22 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:08:59 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <004401be557a$829dbfe0$c9a8a8c0@thing2>

From: Don Park <donpark@quake.net>
>We can use 'xmlns' as the URI scheme name, outlaw the optional search
>segment, and then use Java package naming scheme for the path.

Don, I think there is an easier answer:

Convert URLs to package names.

For example,
    www.jxml.com/mdsax
becomes
    com.jxml.www.mdsax

We can still allow considerable freedom in the format of the URI. And yes, there is a risk of introducing non-uniqueness. But a reasonable choice of URI eliminates this risk.

Bill

From b.laforge at jxml.com Thu Feb 11 05:09:36 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:00 2004
Subject: Roll-Your-Own Parsers (was: Re: What Clean Specs Achieve)
Message-ID: <004b01be557c$15d2e140$c9a8a8c0@thing2>

From: Simon St.Laurent <simonstl@simonstl.com>
>If we could work with parser layers rather than parsers, this might become
>a lot easier to manage. We could just turn on the parts we need and turn
>off the ones we don't.
We've been continuing our effort to add filters to MDSAX and in the process ran into a small problem--the runtime was loading all the filters specified in the context boot document. The next release fixes this, so that only the filters actually being used get loaded.

Anyway, the point here is that a reasonable approach to dynamic configuration should give you a very small footprint when only selected features are used. This is in contrast to a does-everything monolithic parser. And as the specs mature, it seems likely that the larger companies represented at the W3C have no reason at all to keep the specs small and lightweight--cumbersome specs nicely eliminate much of the competition!

So it seems the choice is clear... expect your parsers to grow till they have more features than MS Word, or take a configurable approach which allows the developer to select the capabilities necessary for the job at hand.

Bill

From jjc at jclark.com Thu Feb 11 05:12:57 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:09:00 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com>
Message-ID: <36C26100.8C18CB20@jclark.com>

Rick Ross wrote:

> the XSL working draft
> specification requires namespace support that apparently cannot be
> implemented effectively if the primary input source is a dynamically built
> DOM tree.

I can't see this.

Why can't you put a layer on top of the DOM that provides namespace processing?
For example, you could have an NSNode object that points to the DOM Node and a set of prefix bindings (and probably a parent NSNode). The NSNode objects will be temporary. You wouldn't have to reparse the document, and you don't have to keep two trees in memory. You can also provide other things in this layer that help XSL performance such as document order comparison.

James

From tyler at infinet.com Thu Feb 11 05:17:35 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:00 2004
Subject: The Peace Process: DOM and namespaces...
References: <00c101be5553$43373290$2ee044c6@arcot-main>
Message-ID: <36C267A0.3B7F0FD3@infinet.com>

Don Park wrote:

> Rick,
>
> I am in complete agreement with you regarding the use of DOM documents as
> input to XSL processors.
>
> However, I think we should revisit what the conflicts are specifically.
> Could you point out what aspect of the XSL or the namespace spec forces DOM
> tree to be rebuilt over and over?

As far as the source tree is concerned, since there is no native namespace support, any mutation of the DOM tree in effect invalidates the rest of the tree in terms of managing namespace scoping. This would not be a problem if the DOM did all of this internally, as is the case with Oracle's DOM namespace-specific types that extend the Element and Attribute interfaces (NSElement and NSAttribute). Each node manages its "namespace" internally. If the tree is mutated, then the necessary internal changes to managing prefixes, namespaces, etc. are automatic.
However, if you rely on the standard DOM, you never know what the scoping rules are (since they can always change if the tree is mutated only slightly). This forces you either to iterate over the entire source tree to build up the necessary Node -> namespace mappings, or else to lazily evaluate namespaces for each node being processed or evaluated in a select or match pattern. LotusXSL does this and it is horribly inefficient. XT, on the other hand, does not use the DOM at all, so all of the namespaces are resolved as the source tree is being built.

There are lots of other issues, but basically you are left with either using the DOM, which is the only standard object model, and having performance issues handling namespaces, or else not supporting any standard object model at all and making everyone use your proprietary source tree interface. If you get rid of namespaces as they are currently defined in XSL, you can have your cake and eat it too.

Tyler

From tyler at infinet.com Thu Feb 11 05:21:15 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:00 2004
Subject: The Peace Process: DOM and namespaces...
References: <5BF896CAFE8DD111812400805F1991F708AAEF84@RED-MSG-08>
Message-ID: <36C26833.8ADBFEF7@infinet.com>

Andrew Layman wrote:

> Don Park suggests that we could "establish a business practice requiring all
> URI used as namespace name to
> not contain any illegal XML name characters." That rules out "/", thereby
> limiting namespaces to second-tier domain names and similar unsegmented
> names. That goes against the desire for unlimited scalability and the goal
> to allow _anyone_ to create namespace names as they need to.

Interesting that the word scalability is used here. I think the added complexity that namespaces will bring to IS infrastructure will cause scalability problems in terms of managing all of this data in a clean way.

Tyler

From tyler at infinet.com Thu Feb 11 05:27:04 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:00 2004
Subject: The Peace Process: DOM and namespaces...
References: <00f001be5559$c00c8990$2ee044c6@arcot-main>
Message-ID: <36C269DB.F846A438@infinet.com>

Don Park wrote:

> Andrew,
>
> >Don Park suggests that we could "establish a business practice requiring all
> >URI used as namespace name to not contain any illegal XML name characters."
> >That rules out "/", thereby limiting namespaces to second-tier domain names
> >and similar unsegmented names. That goes against the desire for unlimited
> >scalability and the goal to allow _anyone_ to create namespace names as they
> >need to.
>
> I guess what I am proposing is the creation of a new URI scheme for XML
> Namespace that is more friendly to namespace-ignorant applications. The URI
> production rules are defined as:
>
> uri ::= scheme : path [ ? search ]
>
> We can use 'xmlns' as the URI scheme name, outlaw the optional search
> segment, and then use Java package naming scheme for the path. Here is an
> example:
>
> <dh:foo xmlns:dh="xmlns:com.docuverse.dom.html." dh:bar="foobar"/>
>
> which expands to:
>
> <xmlns:com.docuverse.dom.html.foo
> xmlns:com.docuverse.dom.html.bar="foobar"/>

I like this idea, except that the chances of the W3C adopting it are slim to none. The same goes for any idea of splitting XML 1.0 and "Namespaces in XML" into two different types of documents, so that applications are free to choose which method they want to use without having to worry about handling the ambiguities that arise from combining XML 1.0 data and "Namespaces in XML" data all at once.

A clean break between the two specifications would, I think, be good for XML. It would end the namespaces issue: people who want and need to use "Namespaces in XML" will use "Namespaces in XML", and those who don't want to be bogged down with the complexities of "Namespaces in XML", or would prefer to use a different method of managing unique names, could choose to do so. I just feel like some people have the attitude that I will be using "Namespaces in XML" whether I like it or not. Please, let's formally split these two obviously different document formats for the good of XML and end this namespaces issue right now.

Tyler

From tyler at infinet.com Thu Feb 11 05:37:11 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:00 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <36C26100.8C18CB20@jclark.com>
Message-ID: <36C26C30.6436AB4B@infinet.com>

James Clark wrote:

> Rick Ross wrote:
>
> > the XSL working draft
> > specification requires namespace support that apparently cannot be
> > implemented effectively if the primary input source is a dynamically built
> > DOM tree.
>
> I can't see this.
>
> Why can't you put a layer on top of the DOM that provides namespace
> processing? For example, you could have an NSNode object that points to
> the DOM Node and a set of prefix bindings (and probably a parent
> NSNode). The NSNode objects will be temporary. You wouldn't have to
> reparse the document, and you don't have to keep two trees in memory.
> You can also provide other things in this layer that help XSL
> performance such as document order comparison.

This is a familiar design model when you want to add functionality to a business object without formally extending it. I have looked into this myself, but it still poses problems. Namely, you need to build this entire map by iterating over the entire DOM tree and writing code like this for every node in the document:

    NamedNodeMap attributes = node.getAttributes();
    if (attributes != null) {
        int length = attributes.getLength();
        Node attribute;
        String nodeName;
        for (int i = 0; i < length; i++) {
            attribute = attributes.item(i);
            nodeName = attribute.getNodeName();
            if (nodeName.equals("xmlns")) {
                // Do namespace default processing
            }
            else if (nodeName.startsWith("xmlns:")) {
                // Do namespace processing
            }
        }
    }

In other words, been there done that.

In terms of O notation (I hate to be academic here) that is N^2 plus the underlying cost of string comparisons. I cannot believe people do not see the obvious flaws with this approach. It is significant. And you still need to reparse the entire document each time to safely create these bindings as the source tree can mutate at any time.
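For comparison, here is a minimal sketch of the lazy alternative raised in this thread. It is not code from any of the posters. Rather than building a map for the whole tree, it resolves a prefix on demand by walking from a node up through its ancestors and checking their xmlns attributes, using only DOM Level 1 calls (getAttributes, getNamedItem, getParentNode). The class name PrefixLookup, its method names, and the urn:x-example: URIs are made up for illustration, and the JAXP parser is used here only to build a test tree.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, named for illustration only.
public class PrefixLookup {
    /**
     * Returns the namespace name bound to the given prefix at the given node,
     * or null if no enclosing element declares it. Pass "" for the default
     * namespace.
     */
    static String lookup(Node node, String prefix) {
        String attrName = prefix.isEmpty() ? "xmlns" : "xmlns:" + prefix;
        // The nearest declaration wins, so start at the node and walk upward.
        for (Node n = node; n != null; n = n.getParentNode()) {
            NamedNodeMap attrs = n.getAttributes(); // null for non-element nodes
            if (attrs == null) {
                continue;
            }
            Node decl = attrs.getNamedItem(attrName);
            if (decl != null) {
                return decl.getNodeValue();
            }
        }
        return null;
    }

    /** Parses a string into a DOM tree with the default (non-namespace-aware) settings. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<a xmlns='urn:x-example:outer' xmlns:p='urn:x-example:p'>"
            + "<b xmlns='urn:x-example:inner'><p:c/></b></a>");
        Node c = doc.getDocumentElement().getFirstChild().getFirstChild();
        System.out.println(lookup(c, "p")); // urn:x-example:p
        System.out.println(lookup(c, ""));  // urn:x-example:inner
    }
}
```

Each lookup costs time proportional to the depth of the node, and nothing is cached, so a mutated tree cannot leave stale bindings behind. Caching the result per node would make repeated lookups cheaper but would bring back the invalidation problem described above, which is the trade-off being argued in this thread.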
Tyler

From tyler at infinet.com Thu Feb 11 05:42:07 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:00 2004
Subject: The Peace Process: DOM and namespaces...
References: <004401be557a$829dbfe0$c9a8a8c0@thing2>
Message-ID: <36C26D60.B53028A6@infinet.com>

Bill la Forge wrote:

> From: Don Park <donpark@quake.net>
> >We can use 'xmlns' as the URI scheme name, outlaw the optional search
> >segment, and then use Java package naming scheme for the path.
>
> Don, I think there is an easier answer:
>
> Convert URLs to package names.
>
> For example,
> www.jxml.com/mdsax
> becomes
> com.jxml.www.mdsax
>
> We can still allow considerable freedom in the format of the URI.
> And yes, there is a risk of introducing non-uniqueness. But a
> reasonable choice of URI eliminates this risk.

But then again, why would you need namespaces at all for this, since package names are intended to be unique? All you would need to do is follow a convention that your XML names would be "package" + "type".

Of course that is a framework I wrote about a year ago that works nicely and is far simpler and more maintainable than "Namespaces in XML" and that idea took about 10 minutes and not an entire year.

Of course you have been doing a lot of exciting things with Coins and MDSAX that deal with the "unique names issue" and the W3C to my knowledge does not even know you exist (-:

Tyler
From tyler at infinet.com Thu Feb 11 05:45:30 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:00 2004
Subject: Roll-Your-Own Parsers (was: Re: What Clean Specs Achieve)
References: <004b01be557c$15d2e140$c9a8a8c0@thing2>
Message-ID: <36C26DF2.C6018141@infinet.com>

Bill la Forge wrote:

> From: Simon St.Laurent <simonstl@simonstl.com>
> >If we could work with parser layers rather than parsers, this might become
> >a lot easier to manage. We could just turn on the parts we need and turn
> >off the ones we don't.
>
> We've been continuing our effort to add filters to MDSAX and in the process
> ran into a small problem--the runtime was loading all the filters specified in the context
> boot document. The next release fixes this, so that only the filters actually being
> used get loaded.
>
> Anyway, the point here is that a reasonable approach to dynamic configuration
> should give you a very small footprint when only selected features are used.
> This is in contrast to a does-everything monolithic parser. And as the specs
> mature, it seems likely that the larger companies represented at the W3C
> have no reason at all to keep the specs small and lightweight--cumbersome
> specs nicely eliminate much of the competition!
>
> So it seems the choice is clear... expect your parsers to grow till they have more
> features than MS Word, or take a configurable approach which allows the developer
> to select the capabilities necessary for the job at hand.
Sad but true, but this is not the parser writer's fault, as we are almost completely obligated to support everything in a W3C recommendation to call ourselves compliant. I am wondering if you will be able to call an XML framework compliant if it elects to omit namespaces support or any future technical headache that is thrown at us developers.

Tyler

From tbray at textuality.com Thu Feb 11 05:58:13 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:00 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <3.0.32.19990210215632.00c77ac0@pop.intergate.bc.ca>

At 12:40 AM 2/11/99 -0500, Tyler Baker wrote:
>Of course that is a framework I wrote about a year ago that works nicely and is far simpler
>and more maintainable than "Namespaces in XML" and that idea took about 10 minutes and not an
>entire year.

We should all feel privileged to be favored with the presence of such greatness here on xml-dev. We're unworthy. -T.
From jayadeva at lgsi.co.in Thu Feb 11 06:45:50 1999
From: jayadeva at lgsi.co.in (Jayadeva Babu Gali)
Date: Mon Jun 7 17:09:00 2004
Subject: dynamically XML
Message-ID: <36C27CEB.66B9D8AB@lgsi.co.in>

Hi,

Can anybody help me generate XML files dynamically from the data in the existing database? If you know of any URLs related to this, please send them to me.

regds...jayadev

From tyler at infinet.com Thu Feb 11 06:50:42 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:00 2004
Subject: The Peace Process: DOM and namespaces...
References: <3.0.32.19990210215632.00c77ac0@pop.intergate.bc.ca>
Message-ID: <36C27D72.A1244189@infinet.com>

Tim Bray wrote:

> At 12:40 AM 2/11/99 -0500, Tyler Baker wrote:
> >Of course that is a framework I wrote about a year ago that works nicely and is far simpler
> >and more maintainable than "Namespaces in XML" and that idea took about 10 minutes and not an
> >entire year.
>
> We should all feel privileged to be favored with the presence of such
> greatness here on xml-dev. We're unworthy. -T.

The point here is that dealing with namespaces seems to have all of the efficiency of a political process, and not that of one based upon sound technical judgements.
The fact that it took me a short while to come up with a solution is because I did not go out of my way to come up with some weird complex solution that only I and a few others could understand.

Tim, you seem to act as if my comments about "Namespaces in XML" are directed at you. This is not the case. I have very little idea of how decisions actually get made in the W3C (as do most people) and am much more upset at the W3C's apparent unwillingness to become a more open institution than at anything specific to the "Namespaces in XML" issue. "Namespaces in XML" is a failure IMHO because of a broken process, not because there are not enough smart people at the W3C (there are plenty of those if you look at the profiles of the people who work there). All I know is that you are one of the editors. For all I know, your real opinions on how "Namespaces in XML" should have been crafted may never really be known, as an editor is in effect supposed to be a neutral party in any standards process.

I don't think developers should have to fork over $5,000 a year, and $50,000 if you want a real voice, just to participate in web standards. Yes, someone has to foot the bill, but I think the W3C would collect a lot more money if it had a more reasonable membership cost for individuals. Even then it might not matter, as I have heard that when it comes down to it the $50,000 a year members have a lot more weight than the $5,000 members. Of course these are only rumors, as the W3C is pretty effective at keeping discussions as secret as the Star Chamber.

To date I have heard all sorts of cons for "Namespaces in XML" but none of the pros, other than a few hypothetical arguments where using the current "Namespaces in XML" recommendation might be useful (the keyword is "might", not "is").
Perhaps there is little content to back up "Namespaces in XML" on its technical merits, or else those who think it is a triumph for the W3C don't care enough about it to actually present to us a good whitepaper on when it is appropriate to use "Namespaces in XML". The only exception I can think of is that Jim and David (whom I consider leaders in this community in addition to yourself) have done everyone a great favor by taking the time to explain in as real-world a way as possible what "Namespaces in XML" is. Nevertheless, I have not heard any clear argument for what "Namespaces in XML" should be used for. Why should we developers be forced to swallow an internet standard we don't agree with just to be interoperable with an internet standard we do agree with? On another note, your recent ad hominem attacks (including this one) only hurt yourself here. All you are doing is attacking someone who despite your vitriol still has enthusiastic respect for your work on the XML 1.0 recommendation. As with your previous sarcasm directed at me personally, I will continue to ignore it, as lowering myself to your current standard of technical debate does not do anyone any good. I must stress that I still have great respect for you as a developer right now, but my respect for you as a person, I hate to admit, is waning. I doubt you have any respect for me as a developer or as a person, judging from your comments, which I consider unfortunate, but it is something I can live with. Last but not least, it is very difficult for me personally to justify to myself volunteering any more of my time to these discussions if the so-called leaders take technical arguments personally and then react by using the "ad hominem" approach to detract from defending the arguments in debate.
This suggests to me that you know in your heart that "Namespaces in XML" is flawed and the only way you can convince people it is not is to trash the critics of the "Namespaces in XML" recommendation (this includes me as well as Murray Maloney and perhaps others). Regards, Tyler From SMUENCH at us.oracle.com Thu Feb 11 07:18:22 1999 From: SMUENCH at us.oracle.com (Steve Muench) Date: Mon Jun 7 17:09:01 2004 Subject: dynamically XML Message-ID: <199902110718.XAA24297@mailsun3> If the existing database is Oracle, you can check out some free utilities and demos for doing just this at: http://www.oracle.com/xml/plsxml/index.html ____________________________________________ Steve Muench, Consulting PM & XML Evangelist Java Business Objects Dev't Team http://www.oracle.com/xml From jjc at jclark.com Thu Feb 11 07:53:24 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:09:01 2004 Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <36C26100.8C18CB20@jclark.com> <36C26C30.6436AB4B@infinet.com> Message-ID: <36C28949.D23A8262@jclark.com> Tyler Baker wrote: > > James Clark wrote: > > > Rick Ross wrote: > > > > > the XSL working draft > > > specification requires namespace support that apparently cannot be > > > implemented effectively if the primary input source is a dynamically built > > > DOM tree. > > > > I can't see this. > > > > Why can't you put a layer on top of the DOM that provides namespace > > processing? For example, you could have an NSNode object that points to > > the DOM Node and a set of prefix bindings (and probably a parent > > NSNode). The NSNode objects will be temporary. You wouldn't have to > > reparse the document, and you don't have to keep two trees in memory. > > You can also provide other things in this layer that help XSL > > performance such as document order comparison. > > This is a familiar design model when you want to add functionality to a business object > without formally extending it. I have looked into this myself, but it still poses problems, > namely you need to either build this entire map from iterating over the entire DOM tree and > writing code like this for every node in the document: ... > In terms of O notation (I hate to be academic here) that is N^2 plus the underlying cost of > string comparisons. Huh? The time for each element is linear in the number of attributes that the element has. > And you still need to reparse the entire document each time to safely create these bindings as > the source tree can mutate at any time. You need to do the namespace processing each time. You don't have to do the parsing each time. James
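As a rough illustration of the layer James Clark describes, here is a minimal Python sketch that uses the standard library's xml.dom.minidom as a stand-in for any DOM Level 1 implementation. The NSNode name comes from the message; its methods (lookup, expanded_name, children) are assumptions made for this example and are not taken from any specification. Each wrapper scans only its own element's attributes, so the per-element cost is linear in the number of attributes, and wrappers are built on demand and can be thrown away.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


class NSNode:
    """Temporary wrapper: a DOM element, its parent NSNode, and local prefix bindings."""

    def __init__(self, node, parent=None):
        self.node = node
        self.parent = parent
        self.bindings = {}
        attrs = node.attributes
        # One pass over this element's own attributes only.
        for i in range(attrs.length if attrs is not None else 0):
            attr = attrs.item(i)
            if attr.name == "xmlns":
                self.bindings[""] = attr.value          # default namespace
            elif attr.name.startswith("xmlns:"):
                self.bindings[attr.name[6:]] = attr.value

    def lookup(self, prefix):
        """Resolve a prefix by walking up the chain of NSNode wrappers."""
        scope = self
        while scope is not None:
            if prefix in scope.bindings:
                return scope.bindings[prefix] or None
            scope = scope.parent
        return None

    def expanded_name(self):
        """Return (namespace URI or None, local part) for the wrapped element."""
        prefix, _, local = self.node.tagName.rpartition(":")
        return (self.lookup(prefix), local)

    def children(self):
        """Wrap element children lazily; nothing is cached or copied."""
        for child in self.node.childNodes:
            if child.nodeType == Node.ELEMENT_NODE:
                yield NSNode(child, self)


doc = parseString('<r xmlns="urn:d" xmlns:x="urn:x"><x:a><b/></x:a></r>')
root = NSNode(doc.documentElement)
for child in root.children():
    print(child.expanded_name())
```

Because the bindings are recomputed whenever a wrapper is created, a mutated source tree is picked up on the next traversal without reparsing the document, which is the distinction drawn in the reply above.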
From tony.mcdonald at ncl.ac.uk Thu Feb 11 09:38:13 1999 From: tony.mcdonald at ncl.ac.uk (Tony McDonald) Date: Mon Jun 7 17:09:01 2004 Subject: Word and XML (was: XML standards coherency and so forth) In-Reply-To: <0F7119373E69D2118F6F00805FE68639BEAD67@gren-exch-rcvy.kpscal.org> Message-ID: <v04103d0fb2e8477fad98@[128.240.198.13]> >> From: "Rick Jelliffe" <ricko@allette.com.au> >> Date: Sun, 24 Jan 1999 16:15:36 +1100 >> Subject: Re: Word and XML (was: XML standards coherency and so forth) >> >> From: Biron,Paul V <Paul.V.Biron@kp.ORG> > > [snip] > Wow! I've been so busy lately that I haven't been able to keep up with > XML-DEV and had no idea my "innocent" post on Word and HTML/XML had been so > long lived! > > [snip] > > In truth, we've spent a great deal of time writing tools (a big daisy chain > of FrontPage v1.1 -> hand-rolled perl script 1 -> hand-rolled perl script 2 -> > etc.) just to get HTML output from Word '97. What has made this all the more > frustrating for us is that the HTML is not really what we want in the end. > We just want a "clean" HTML version so that the transformation to the XML > DTD that we're interested in is "easier". The BOLD and ITALIC that our > authors see actually represent more "semantic" XML elements, e.g., <allergy> > and <medication>. Such is life. I don't know how far down this route you've gone Paul, but can I suggest using rtf2xml (http://www.sesha.com/omlette/rtf2xml/) - it uses the limited version of Omnimark http://www.omnimark.com as an engine and does a very good job of RTF -> XML conversion.
It uses Word paragraph and character styles to convert the RTF into well-formed and valid XML, e.g. <p stylename="List Bullet" color="1"><pntext>·&tab;</pntext><string color="1">Almanack & Administration Information </string><string charstyname="URL" fontsize="20" italic="on" color="1">http://nme.ncl.ac.uk/almanack/</string><string color="1"> </string></p> (you can see that additional formatting information that was in the original Word document is provided too). I then pass this through another omnimark program to get to (be aware that it's perfectly possible to create invalid and badly-formed XML at this stage!!); ... <subsubsection> <titleinfo class='subsubsection' level='3'> <title class='subsubsection'>On-line Resources Organisation of Tissues Student Support and Tutoring (Computer Mediated Communication) Tools: ... Almanack & Administration Information http://nme.ncl.ac.uk/almanack/ ... From this XML, the conversion to another HTML (or RTF etc.) format is (relatively) easy. I tried using the 'HTML' that Word 'emits' and had to have a lie down...this scheme of using RTF and well marked up original documents seems to be helping us along in our up-conversion process (whoever chose that term knew what they were talking about - it's like climbing, rather inching up, a vertical cliff face going backwards with no ropes...great fun) hth tone ------ Dr Tony McDonald, FMCC, Networked Learning Environments Project The Medical School, Newcastle University Tel: +44 191 222 5888 Fingerprint: 3450 876D FA41 B926 D3DD F8C3 F2D0 C3B9 8B38 18A2 xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Thu Feb 11 10:08:27 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:01 2004 Subject: Possible enhancements to Namespaces to allow valiadation References: Message-ID: <36C2AD5C.1CF9A26E@mecomnet.de> Marc.McDonald@Design-Intelligence.com wrote: > > ... > 4. An element or attribute name without a prefix is interpreted as > belonging to the default or to no namespace if a default has not been > declared. this one is imprecise as worded. i'll leave it to others to refine it... :) > > A solution discussed in the thread involves creating a new document > and associated DTDs with these changes made and then doing normal XML > validation. > ... > I would suggest the following changes to the namespace recommendation > to address some of these problems: > 1. Allow a DTD to declare the namespaces and prefixes it uses. a method to do this with pi's has been demonstrated. it should also be possible to do this with attributes' default declarations. in that case, similar scoping rules could be inferred from the content models. so long as there were no contradictions implied, this would work for your 50M files. > 2. Allow a DTD to declare its default namespace (itself). This removes > the need to prefix elements and attributes in the DTD. see above. given a default declaration on a "root" element, the remainder could be inferred. > 3. Allow a namespace declaration to optionally include a URL for a > DTD, just as documents do. no comment. > 4. Enhance XML validation. When an element is not in the current > namespace, look at the namespace declaration. 
If it does not have an > associated DTD, only do well-formed parsing of the element. If it does > have an associated DTD, validate the element against the associated > DTD with the new namespace as the default. i don't see why this is necessary. wouldn't your 50M case already have DTDs? From david at megginson.com Thu Feb 11 11:22:59 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:01 2004 Subject: RDF, Namespaces, and Versioning? In-Reply-To: <36C1AE8C.270AAB95@manhattanproject.com> References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <36BF35E7.D6FF0E5E@manhattanproject.com> <370CF891.21C0BDE9@prescod.net> <36C0CA35.89F83846@manhattanproject.com> <14017.29209.142742.434795@localhost.localdomain> <36C1AE8C.270AAB95@manhattanproject.com> Message-ID: <14018.551.118488.700382@localhost.localdomain> Clark Evans writes: > First, thank you for your reply. It was helpful. However, > I do know, the spec doesn't say _how_ to use namespaces. I think that this was an intelligent choice -- the idea was to provide a simple global naming service and then see how the market decides to use it (actual use will be an important input to the schema process). Canned use cases might just have confused things even further, since people would have assumed that they were in some way normative. After all, would early use cases for IP have included the Web? All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Feb 11 11:35:25 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:01 2004 Subject: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?) In-Reply-To: <85256714.008116FD.00@D51MTA03.pok.ibm.com> References: <85256714.008116FD.00@D51MTA03.pok.ibm.com> Message-ID: <14018.48845.540265.75532@localhost.localdomain> keshlam@us.ibm.com writes: > There's nothing wrong with SAX (though it too needs another turn of > the evolutionary crank, in my opinion), but SAX is a stream rather > than a model. The two really aren't in competition with each other > any more than sed is in competition with vi -- they're each good in > their own target domain, and there are even times when using one to > generate the other is the right answer. Wow! I hadn't been following this thread, and had no idea that there was a DOM vs. SAX flame war going on. Very cool. While I believe that some flame wars are justified -- Emacs really is better than vi, Java really is better than C++, Linux really is better than Windows, and my Border Collie really is better than anyone's Jack Russell Terrier, all on objective and clearly verifiable grounds -- in this case I agree with both of Joe's points: 1. SAX and DOM are complementary 2. SAX and DOM both need a little more work All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Feb 11 11:42:59 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:01 2004 Subject: The Peace Process: DOM and namespaces... In-Reply-To: <36C2187E.455D0329@infinet.com> References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20CD1.9A206D5D@infinet.com> <14018.3364.167089.666427@localhost.localdomain> <36C2187E.455D0329@infinet.com> Message-ID: <14018.49192.388408.608309@localhost.localdomain> Tyler Baker writes: > The DOM has an unstated implication that it reflects a valid XML > document. If you make a call to getNodeName() on an Element node, > it is expected to return a valid XML name. Perhaps, but a violation of an 'unstated implication' can hardly make something illegal (Tyler's original claim) -- what Tyler actually seems to be suggesting is that expanding QNames in the DOM goes against the original spirit of the API, not against the letter. > > The physical representation of an XML document (as defined by XML 1.0) > > is not allowed to have characters like '/' and '@' in element and > > attribute name, but the DOM is not a physical representation; it is an > > API providing access to one view of a document's information set, and > > as such, it is not governed by the Name production in XML 1.0. > This is one way of looking at it. But this is not clear and there > is no mechanism defined to tell an application whether the DOM is > using these illegal names or not. 
If you write the DOM Document > back out to XML, you are writing out illegal names because you > don't know if you are writing out prefixes + local part or > namespace + local part. You gotta check anyway -- what if someone's HTML DOM implementation were allowing names with illegal letters? Presumably, however, you have turned on namespace munging somewhere in your DOMBuilder (however that works), so you know what you're getting. Namespace munging should *never* take place by default for vanilla XML 1.0 processing (in Expat, for example, it is a user-configurable option, and for SAX 1.0, it is handled by third-party filters [which are surprisingly easy to write]). > > The XML 1.0 spec does not even require processors to report element > > names, so in terms of conformance, anything goes kids. > > How is anyone supposed to reliably build any sort of architecture > on XML if everything is this ambiguous. We're working on it, but you'd be surprised by what you can do even with partial specs. XML 1.0 defines the physical representation of a document as a string of characters; the DOM defines an API into structured information, such as XML and HTML documents. There is a WG right now working on the XML Information Set, which will provide some glue between the two -- I'll keep everyone posted. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Thu Feb 11 11:45:25 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:01 2004 Subject: The Peace Process: DOM and namespaces... 
In-Reply-To: <36C21A27.69968FDC@infinet.com> References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20BEC.FEDBA22D@activated.com> <14018.3916.726579.503762@localhost.localdomain> <36C21A27.69968FDC@infinet.com> Message-ID: <14018.49624.990887.76240@localhost.localdomain> Tyler Baker writes: > > Preprocess your information, whatever its source. > > From an entire database. Pass over the entire document tree and > preprocess everything before actually presenting it to the > application. This is not practical. In my limited experience on > these matters I have seen this tried before and with horrendous > results. Nevertheless, it does not take a computer scientist to > see the real world problem with this approach. That would be silly -- lazy evaluation works fine for this kind of thing. I hate to sound stupid, but I still fail to see how Namespaces causes any problems at all for someone dynamically generating a document from a database -- if you want to use names with a URI part, use them; if not, don't. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From david at megginson.com Thu Feb 11 11:48:28 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:01 2004 Subject: The Peace Process: DOM and namespaces...
In-Reply-To: <36C21BF4.8F81DC9D@infinet.com> References: <3.0.32.19990210150228.00be03c0@pop.intergate.bc.ca> <36C21BF4.8F81DC9D@infinet.com> Message-ID: <14018.49829.500457.679827@localhost.localdomain> Tyler Baker writes: > It is not hard to support namespaces in an XML parser, but it is > far harder to deal with them at the application level. Just > because all the tools support namespaces, does not mean that anyone > is using them outside of a few niche applications. Like another > list member recently said "where is the content"? Every RDF document out there (a small but growing number), every XSL stylesheet, every HTML Voyager document, most schema implementations, etc., etc. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From rick at activated.com Thu Feb 11 11:58:21 1999 From: rick at activated.com (Rick Ross) Date: Mon Jun 7 17:09:01 2004 Subject: The Peace Process: DOM and namespaces... References: <36C2011B.D188E44B@activated.com> <36C26100.8C18CB20@jclark.com> Message-ID: <36C2C566.FE04B536@activated.com> Thanks, James - this may be an idea which can solve the problem - we will explore this approach today. A solution within contexts of all present (finished and unfinished) standards would be great. Rick James Clark wrote: > > Rick Ross wrote: > > > the XSL working draft > > specification requires namespace support that apparently cannot be > > implemented effectively if the primary input source is a dynamically built > > DOM tree. > > I can't see this.
> > Why can't you put a layer on top of the DOM that provides namespace > processing? For example, you could have an NSNode object that points to > the DOM Node and a set of prefix bindings (and probably a parent > NSNode). The NSNode objects will be temporary. You wouldn't have to > reparse the document, and you don't have to keep two trees in memory. > You can also provide other things in this layer that help XSL > performance such as document order comparison. > > James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rick at activated.com Thu Feb 11 12:01:07 1999 From: rick at activated.com (Rick Ross) Date: Mon Jun 7 17:09:01 2004 Subject: The Peace Process: DOM and namespaces... References: <36C2011B.D188E44B@activated.com> <36C26100.8C18CB20@jclark.com> Message-ID: <36C2C650.2CE0F690@activated.com> One of the benefits we are trying to deliver to customers, however, is the ability to use any parser that offers a standard DOM and SAX implementations. Could this approach be implemented solely within our processor-side logic, and rely only on standard implementations of DOM Level 1 that exist in most of today's XML parsers? Rick James Clark wrote: > > Rick Ross wrote: > > > the XSL working draft > > specification requires namespace support that apparently cannot be > > implemented effectively if the primary input source is a dynamically built > > DOM tree. > > I can't see this. > > Why can't you put a layer on top of the DOM that provides namespace > processing? For example, you could have an NSNode object that points to > the DOM Node and a set of prefix bindings (and probably a parent > NSNode). 
The NSNode objects will be temporary. You wouldn't have to > reparse the document, and you don't have to keep two trees in memory. > You can also provide other things in this layer that help XSL > performance such as document order comparison. > > James From rja at arpsolutions.demon.co.uk Thu Feb 11 12:04:51 1999 From: rja at arpsolutions.demon.co.uk (Richard Anderson) Date: Mon Jun 7 17:09:01 2004 Subject: Quite Standards Questions Message-ID: <001e01be55b6$a7005be0$c5010180@p197> Hi, In terms of standards, am I right or wrong in thinking that DCD replaced XML-Data? Also, is DCD being replaced by something else? I remember reading or hearing something but I can't remember what. Regards, Richard. *********************************************** * E-Mail mailto:RJA@arpsolutions.demon.co.uk * * WEB http://www.arpsolutions.demon.co.uk * *********************************************** -----Original Message----- From: Buss, Jason A To: 'xml-dev@ic.ac.uk' Date: 8 February 1999 22:38 Subject: RE: "Clean Specs" > >>Perhaps in this modern world, some of the rather large fees charged >>by W3C for membership could go towards hiring some technical writers >>to address this issue. IMNSHO, the amount of time that we've all >>spent thrashing about with namespaces is an example of intelligence, >>time and energy that could have been avoided by a standard that >>addressed some of the issues better. > >>If standards are the way we'll do business (and I'm all for that!) >>then why not invest in the best possible standards up front?
Just >>because IETF and other traditions made do without, doesn't mean that >>we should be penny wise and pound foolish now. Clarity is a net gain >>for W3C members, and for the larger community, as the cost of >>incompatible implementations is significant. > >Avi > >At 5:40 PM -0600 2/7/99, W. Eliot Kimber wrote: >> The XML WG was an all-volunteer project, as are most standards efforts. >> Those of us who participated did so primarily as a personal commitment, >not >> as something our employers (those of us who have them) pay us to do. >> >> Standards development is not a commercial process--there is no budget from >> which technical writers might be hired. The W3C only administers, it does >> not fund. Same for ISO. Some national bodies do fund some standards >> development (BSI, the British Standards Institute), but that funding will >> tend to be used to support the technologists developing the standard and >> not writers crafting the words. >> >> So while it's true that most, if not all, specifications could benefit >from > > professional writers, it usually isn't an option for standards >developers > >> > Well, has anyone considered employing real, professional technical >> > authors to write the specifications? >> > >As chair of the DOM WG, I (and I think the editors of the specs) > >would be overjoyed were someone to volunteer the services of a > >real, professional technical author who could help in the process >of > >getting good specs out the door. However, as has been pointed out > >by others on this list, this support is difficult to find, as W3C > >seldom has these resources available. > > Maybe it is time some of us who have been "put off" by the way the >Namespaces recommendation to offer our services, under the auspices of the >WG for XML and XML related standards, to go through and annotate the drafts >and recommendations, as they come up for the vote. > > I didn't have trouble with the XML recommendation or the XSL working >draft. 
The DOM took me a couple of reads, and I have read the namespaces >recommendation 3 times and still have some questions, but I am looking here >and other places to find the answers before I climb up in here and get all >surly with the spec writers. > > I know there are a number of people who have read the spec and are >upset with the concept of namespaces. I am still trying to grasp parts of >it myself. But I think a lot of this is because I am a technical writer by >trade. I prepare documents for the end-user. I am conditioned to write >things from the perspective of the person actually utilizing the documents; >I still wince at typos. If I hadn't had the background in SGML that I have, >I would have been lucky to get past the XML spec itself. > > IMHO, if the working groups would like to see the services of >technical writers utilized, they should probably just come forward and ask. >I imagine through the W3C site or something. I think I have seen postings >from Paul saying he had been working on annotated versions of the >recommendations. If tech writers would like to see this, and it appears >that the WG's would appreciate the help, I don't see why efforts could be >made towards this. I know I would probably take up the opportunity to do >such work, even if it is on a voluntary basis. Even if some don't have the >time, surely someone would even a small amount of time to analyze and make >some notes, so if someone becomes available, they could come in with >something to start from. Even if it took a series of writers throughout the >development process, the outcome would likely justify the effort. > > Any suggestions? comments? > >Jason A. Buss >Single Engine Technical Publications >Cessna Aircraft Co. >jabuss@cessna.textron.com >"I don't have your solution, but I do admire your problem..." > > > > >xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Thu Feb 11 12:22:46 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:09:01 2004 Subject: Fw: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?) Message-ID: <027e01be55b8$78db1640$5402a8c0@oren.capella.co.il> David Megginson wrote: >Wow! I hadn't been following this thread, and had no idea that there >was a DOM vs. SAX flame war going on. Very cool. > >While I believe that some flame wars are justified -- Emacs really is >better than vi, Watch it! :-) >Java really is better than C++, Linux really is better >than Windows, and my Border Collie really is better than anyone's Jack >Russell Terrier, all on objective and clearly verifiable grounds -- in >this case I agree with both of Joe's points: > >1. SAX and DOM are complementary IMVHO SAX should be defined not as a "parser interface" but as a "DOM tree visitor interface". It should still be available as a separate API, but the DOM specs should provide a standard way to apply a SAX visitor to a DOM (sub)tree. A parser would be just a special case of an application which has an internal "virtual DOM tree" and doesn't provide random access to it. 
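One way to picture the proposal above: a helper walks a DOM (sub)tree and replays it as SAX-style events, so the same handler could be driven by a parser or by an in-memory tree. The sketch below uses Python's xml.dom.minidom and xml.sax.handler.ContentHandler; visit and NameCollector are hypothetical names invented for this illustration and are not part of the DOM or SAX specifications.

```python
from xml.dom import Node
from xml.dom.minidom import parseString
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def visit(node, handler):
    """Replay a DOM (sub)tree as SAX events on `handler` (hypothetical helper)."""
    if node.nodeType == Node.DOCUMENT_NODE:
        handler.startDocument()
        for child in node.childNodes:
            visit(child, handler)
        handler.endDocument()
    elif node.nodeType == Node.ELEMENT_NODE:
        # Copy the element's attributes into the plain mapping SAX expects.
        attrs = {}
        for i in range(node.attributes.length):
            attr = node.attributes.item(i)
            attrs[attr.name] = attr.value
        handler.startElement(node.tagName, AttributesImpl(attrs))
        for child in node.childNodes:
            visit(child, handler)
        handler.endElement(node.tagName)
    elif node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        handler.characters(node.data)


class NameCollector(ContentHandler):
    """An ordinary SAX handler; it cannot tell a parser from a tree walk."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


doc = parseString("<a><b x='1'>hi</b><c/></a>")
collector = NameCollector()
visit(doc.documentElement.firstChild, collector)  # visit only the <b> subtree
print(collector.names)
```

Starting the walk at an element rather than at the document node is what gives the "(sub)tree" behaviour: only the document-level branch emits startDocument and endDocument.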
Once viewed this way, much of the motivation for the SAX vs. DOM wars would disappear. Have fun, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Thu Feb 11 12:30:37 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:01 2004 Subject: Quite Standards Questions Message-ID: <01BE55C1.CC640A70@grappa.ito.tu-darmstadt.de> Richard Anderson wrote: > In terms of standards, am I right or wrong in thinking that DCD replaced > XML_Data ? I said as much in my comparison of the various XML schema languages (see http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/bourret.htm). This was based on the statement in the DCD abstract that "The DCD proposal incorporates a subset of the XML-Data Submission...", the fact that DCD and XML-Data are both from Microsoft (as well as others), and David Megginson's statement to that effect (http://www.lists.ic.ac.uk/hypermail/xml-dev/9809/0183.html). > Also, is DCD being replaced by something else ? I remember reading or > hearing something but I cant remember what. Eventually, all of the XML schema languages will (hopefully) be replaced by the schema language currently being developed by the W3C. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
From rick at activated.com Thu Feb 11 12:35:18 1999
From: rick at activated.com (Rick Ross)
Date: Mon Jun 7 17:09:01 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20CD1.9A206D5D@infinet.com> <14018.3364.167089.666427@localhost.localdomain> <36C2187E.455D0329@infinet.com> <14018.49192.388408.608309@localhost.localdomain>
Message-ID: <36C2CD8D.6905EDBC@activated.com>

David,

It's true that there may be ambiguities in the specs, and there may even be special extensions within some XML and DOM implementations that extend the specs to add the capabilities we seem to need to provide truly effective and efficient support for namespaces in XML.

The issue is, however, that we're trying to stick to the standards as rigorously as possible. We want to implement compliance with the namespaces in XML recommendation, and we want to do it effectively using plain-vanilla implementations of the DOM Level 1. To create or rely on non-standard extensions would be a tough pill to swallow - it would be so much better if it were possible to use only the existing (finished and unfinished) standards.

The specs in the XML space are evolving fast, and this progress is a credit to the hard work of many well-intentioned and brilliant people - I have no intention of impugning it. It would be good for business, however, if there were planned "synch points" where groups of related and interdependent specs reached a targeted level of overall functionality together - in a compatible way.
Tim Bray noted that perhaps the best solution here is to urge that support for namespaces in XSL be deferred until the next phase - so that initial XSL applications will not face the complexity and performance hit imposed by existing specs that do not deliver enough of what each other needs yet.

Regards,

Rick

The specs related to XML are

David Megginson wrote:
>
> Tyler Baker writes:
>
> > The DOM has an unstated implication that it reflects a valid XML
> > document. If you make a call to getNodeName() on an Element node,
> > it is expected to return a valid XML name.
>
> Perhaps, but a violation of an 'unstated implication' can hardly make
> something illegal (Tyler's original claim) -- what Tyler actually
> seems to be suggesting is that expanding QNames in the DOM goes
> against the original spirit of the API, not against the letter.
>
> > > The physical representation of an XML document (as defined by XML 1.0)
> > > is not allowed to have characters like '/' and '@' in element and
> > > attribute name, but the DOM is not a physical representation; it is an
> > > API providing access to one view of a document's information set, and
> > > as such, it is not governed by the Name production in XML 1.0.
>
> > This is one way of looking at it. But this is not clear and there
> > is no mechanism defined to tell an application whether the DOM is
> > using these illegal names or not. If you write the DOM Document
> > back out to XML, you are writing out illegal names because you
> > don't know if you are writing out prefixes + local part or
> > namespace + local part.
>
> You gotta check anyway -- what if someone's HTML DOM implementation
> were allowing names with illegal letters? Presumably, however, you
> have turned on namespace munging somewhere in your DOMBuilder (however
> that works), so you know what you're getting.
> Namespace munging
> should *never* take place by default for vanilla XML 1.0 processing
> (in Expat, for example, it is a user-configurable option, and for SAX
> 1.0, it is handled by third-party filters [which are surprisingly easy
> to write]).
>
> > > The XML 1.0 spec does not even require processors to report element
> > > names, so in terms of conformance, anything goes kids.
> >
> > How is anyone supposed to reliably build any sort of architecture
> > on XML if everything is this ambiguous.
>
> We're working on it, but you'd be surprised by what you can do even
> with partial specs. XML 1.0 defines the physical representation of a
> document as a string of characters; the DOM defines an API into
> structured information, such as XML and HTML documents. There is a WG
> right now working on the XML Information Set, which will provide some
> glue between the two -- I'll keep everyone posted.
>
> All the best,
>
> David

From tyler at infinet.com Thu Feb 11 12:43:07 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <36C26100.8C18CB20@jclark.com> <36C26C30.6436AB4B@infinet.com> <36C28949.D23A8262@jclark.com>
Message-ID: <36C2CE29.AB9B2133@infinet.com>

James Clark wrote:

> Tyler Baker wrote:
> > In terms of O notation (I hate to be academic here) that is N^2 plus the underlying cost of
> > string comparisons.
>
> Huh? The time for each element is linear in the number of attributes
> that the element has.
I guess it all depends on how you look at things. If you consider an XML document as a set of nodes that you must iterate over and you consider all attributes to be sets of nodes within each element node you get N^2 time. If you consider attributes to be nodes of the XML document then you get N time. I guess it depends on how you look at things here. I chose to look at the entire document as a set of elements, with each element containing a set of zero or more attributes.

> > And you still need to reparse the entire document each time to safely create these bindings as
> > the source tree can mutate at any time.
>
> You need to do the namespace processing each time. You don't have to do
> the parsing each time.

This is what I meant (sorry for the confusion). I meant that you need to in effect reparse the document whether it be a stream, sax events, or a DOM tree and rebuild your entire namespace map each time. Nevertheless, this is not cheap.

Tyler

From tyler at infinet.com Thu Feb 11 12:49:17 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20CD1.9A206D5D@infinet.com> <14018.3364.167089.666427@localhost.localdomain> <36C2187E.455D0329@infinet.com> <14018.49192.388408.608309@localhost.localdomain>
Message-ID: <36C2D049.A302F374@infinet.com>

David Megginson wrote:

> Tyler Baker writes:
>
> > The DOM has an unstated implication that it reflects a valid XML
> > document.
> > If you make a call to getNodeName() on an Element node,
> > it is expected to return a valid XML name.
>
> Perhaps, but a violation of an 'unstated implication' can hardly make
> something illegal (Tyler's original claim) -- what Tyler actually
> seems to be suggesting is that expanding QNames in the DOM goes
> against the original spirit of the API, not against the letter.

True. A clarification of how much leeway the application developer has on these matters might be in order for the DOM Level 1 Errata.

> > > The physical representation of an XML document (as defined by XML 1.0)
> > > is not allowed to have characters like '/' and '@' in element and
> > > attribute name, but the DOM is not a physical representation; it is an
> > > API providing access to one view of a document's information set, and
> > > as such, it is not governed by the Name production in XML 1.0.
>
> > This is one way of looking at it. But this is not clear and there
> > is no mechanism defined to tell an application whether the DOM is
> > using these illegal names or not. If you write the DOM Document
> > back out to XML, you are writing out illegal names because you
> > don't know if you are writing out prefixes + local part or
> > namespace + local part.
>
> You gotta check anyway -- what if someone's HTML DOM implementation
> were allowing names with illegal letters? Presumably, however, you
> have turned on namespace munging somewhere in your DOMBuilder (however
> that works), so you know what you're getting. Namespace munging
> should *never* take place by default for vanilla XML 1.0 processing
> (in Expat, for example, it is a user-configurable option, and for SAX
> 1.0, it is handled by third-party filters [which are surprisingly easy
> to write]).

Yah this is another issue which makes me think that splitting XML 1.0 and XML 1.0 with "Namespaces in XML" into two separate data type entities would be a good thing for the XML community.
> > > The XML 1.0 spec does not even require processors to report element
> > > names, so in terms of conformance, anything goes kids.
> >
> > How is anyone supposed to reliably build any sort of architecture
> > on XML if everything is this ambiguous.
>
> We're working on it, but you'd be surprised by what you can do even
> with partial specs. XML 1.0 defines the physical representation of a
> document as a string of characters; the DOM defines an API into
> structured information, such as XML and HTML documents. There is a WG
> right now working on the XML Information Set, which will provide some
> glue between the two -- I'll keep everyone posted.

This looks nice. I agree keep things as abstract as you can as long as you don't leave any gaping holes that cause major side effects elsewhere.

Tyler

From tyler at infinet.com Thu Feb 11 12:54:14 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20BEC.FEDBA22D@activated.com> <14018.3916.726579.503762@localhost.localdomain> <36C21A27.69968FDC@infinet.com> <14018.49624.990887.76240@localhost.localdomain>
Message-ID: <36C2D232.487283E@infinet.com>

David Megginson wrote:

> Tyler Baker writes:
>
> > > Preprocess your information, whatever its source.
> >
> > From an entire database. Pass over the entire document tree and
> > preprocess everything before actually presenting it to the
> > application. This is not practical.
> > In my limited experience on
> > these matters I have seen this tried before and with horrendous
> > results. Nevertheless, it does not take a computer scientist to
> > see the real world problem with this approach.
>
> That would be silly -- lazy evaluation works fine for this kind of
> thing. I hate to sound stupid, but I still fail to see how Namespaces
> causes any problems at all for someone dynamically generating a
> document from a database -- if you want to use names with a URI part,
> use them; if not, don't.

Well you have the choice of doing an entire pass over the document and building a map directly or else try and lazily evaluate things and then cache the results of the namespace processing. Then for each time you find a node, you need to look it up in a hashtable somehow. Hashtables are cheap, but not that cheap if you need to do a table lookup for every node you process. So iterating over the entire source tree and building an indexed table may in some circumstances be more efficient than the lazy approach.

But you are still faced with the problem of illegal XML Names. If you write the DOM Document out to an XML file or stream you will be emitting illegal XML Names unless you have some URI -> prefix hack to get things back to legal XML.

Tyler

From tyler at infinet.com Thu Feb 11 12:57:10 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
References: <3.0.32.19990210150228.00be03c0@pop.intergate.bc.ca> <36C21BF4.8F81DC9D@infinet.com> <14018.49829.500457.679827@localhost.localdomain>
Message-ID: <36C2D325.6B2D47E1@infinet.com>

David Megginson wrote:

> Tyler Baker writes:
>
> > It is not hard to support namespaces in an XML parser, but it is
> > far harder to deal with them at the application level. Just
> > because all the tools support namespaces, does not mean that anyone
> > is using them outside of a few niche applications. Like another
> > list member recently said "where is the content"?
>
> Every RDF document out there (a small but growing number), every XSL
> stylesheet, every HTML Voyager document, most schema implementations,
> etc., etc.

These are W3C applications of "Namespaces" and are not from independent sources. I am speaking of content outside of the scope and control of the W3C. I am only dealing with namespaces in XSL because it is in effect being forced on me. I am not using it by choice.

Proof of an original application which chooses to use "Namespaces in XML" would help invalidate my doubts as to "Namespaces in XML" having any use outside of the W3C specs.

Tyler

From b.laforge at jxml.com Thu Feb 11 13:04:56 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <006101be55be$5e549020$c9a8a8c0@thing2>

One of the problems with DOM/namespace is the difficulty of manipulating the tree while tracking namespace scoping. A partially lazy approach seems to work:

1. Do NOT make any namespace transformations while building the tree. Instead, propagate the xmlns attributes to all children. (Use the inheritance filter.)

2. Manipulate the tree as desired.

3. Layer additional methods onto the DOM (you can either subclass things or use static methods) for accessing the unique name based on the xmlns attributes attached to the Element being accessed.

4. When converting from the DOM to SAX events or when creating a document from the DOM, drop xmlns attributes which are assigned the same value in the parent element.

OK, it means the application will likely need to be namespace aware to work with this.

Overhead may be a bit of an issue, but then it always is when using a DOM.

Bill

From heretic at ihug.co.nz Thu Feb 11 13:10:48 1999
From: heretic at ihug.co.nz (David Mohring)
Date: Mon Jun 7 17:09:02 2004
Subject: an initial idea for an XML überdocument shell ( was Re: CORBA's not boring yet. / XML in an OS?)
References: <008801be538b$3326e860$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <36C2D69C.9E5328EB@ihug.co.nz>

This could also be called a Compound Document. See http://www.lists.ic.ac.uk/hypermail/xml-dev/9902/0101.html
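The four steps in Bill la Forge's message above (propagate xmlns attributes, manipulate, resolve names through layered helpers, drop redundant declarations on output) can be sketched roughly as follows. This is an illustration only, assuming Python's standard xml.dom.minidom as the DOM; the helper names (propagate_xmlns, unique_name, drop_redundant_xmlns) and the urn:example:p namespace are hypothetical, and the "inheritance filter" he mentions is not reproduced here.

```python
import xml.dom.minidom


def is_xmlns(name):
    return name == "xmlns" or name.startswith("xmlns:")


def propagate_xmlns(element, inherited=None):
    """Step 1: copy every in-scope xmlns attribute onto each element."""
    scope = dict(inherited or {})
    for name, value in element.attributes.items():
        if is_xmlns(name):
            scope[name] = value          # local declarations override inherited ones
    for name, value in scope.items():
        element.setAttribute(name, value)
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            propagate_xmlns(child, scope)


def unique_name(element):
    """Step 3: (namespace URI, local part), read only from the element itself."""
    prefix, _, local = element.tagName.rpartition(":")
    attr = "xmlns:" + prefix if prefix else "xmlns"
    return (element.getAttribute(attr) or None, local)


def drop_redundant_xmlns(element, parent_scope=None):
    """Step 4: remove xmlns attributes that merely repeat the parent's value."""
    parent_scope = parent_scope or {}
    scope = dict(parent_scope)
    for name, value in list(element.attributes.items()):
        if is_xmlns(name):
            if parent_scope.get(name) == value:
                element.removeAttribute(name)
            scope[name] = value
    for child in element.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            drop_redundant_xmlns(child, scope)


doc = xml.dom.minidom.parseString(
    '<root xmlns:p="urn:example:p"><p:item/><other/></root>')
root = doc.documentElement

propagate_xmlns(root)                                   # step 1
item = root.getElementsByTagName("p:item")[0]
other = root.getElementsByTagName("other")[0]
other.appendChild(item)                                 # step 2: move a node
name_after_move = unique_name(item)                     # step 3
drop_redundant_xmlns(root)                              # step 4
serialized = root.toxml()
```

Because each element carries its own copies of the declarations, the moved node still resolves its name without walking up the tree; the cost is the extra attributes held in memory, which is the overhead the message refers to.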
From rick at activated.com Thu Feb 11 13:29:29 1999
From: rick at activated.com (Rick Ross)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
References: <006101be55be$5e549020$c9a8a8c0@thing2>
Message-ID: <36C2DB03.193FEC7B@activated.com>

It is item 3 that seems to be the killer - I cannot layer anything onto your standard DOM implementations. If my design goal is to get the job done by relying only on standard interfaces, then success seems unlikely.

Gosh, I would LOVE to learn I am missing the point, I would LOVE to have someone clarify the standards-compliant approach that resolves this predicament without exacting too high a performance penalty. I really thank people for trying to come up with these ideas, and I remain hopeful that the solution is still lurking out there.

Regards,

Rick

Bill la Forge wrote:
>
> One of the problems with DOM/namespace is the difficulty of manipulating
> the tree while tracking namespace scoping. A partially lazy approach seems to work:
>
> 1. Do NOT make any namespace transformations while building the tree.
> Instead, propagate the xmlns attributes to all children. (Use the inheritance filter.)
>
> 2. Manipulate the tree as desired.
>
> 3. Layer additional methods onto the DOM (you can either subclass things or
> use static methods) for accessing the unique name based on the xmlns attributes
> attached to the Element being accessed.
>
> 4. When converting from the DOM to SAX events or when creating a document from the
> DOM, drop xmlns attributes which are assigned the same value in the parent element.
>
> OK, it means the application will likely need to be namespace aware to work with
> this.
>
> Overhead may be a bit of an issue, but then it always is when using a DOM.
>
> Bill

From david at megginson.com Thu Feb 11 13:33:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:02 2004
Subject: Misattribution (was Re: The Peace Process: DOM and namespaces...)
In-Reply-To: <0003438b96feca7f_mailit@mail.iname.com>
References: <36C21BF4.8F81DC9D@infinet.com> <0003438b96feca7f_mailit@mail.iname.com>
Message-ID: <14018.55968.901687.855408@localhost.localdomain>

Rob Schoening writes:

> >Tim Bray wrote:
> >It is not hard to support namespaces in an XML parser, but it is
> >far harder to deal with them at the application level. Just
> >because all the tools support namespaces, does not mean that
> >anyone is using them outside of a few niche applications. Like
> >another list member recently said "where is the content"?

I think that Tyler wrote this, not Tim.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
From david at megginson.com Thu Feb 11 13:36:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
In-Reply-To: <00f001be5559$c00c8990$2ee044c6@arcot-main>
References: <00f001be5559$c00c8990$2ee044c6@arcot-main>
Message-ID: <14018.56053.926976.427251@localhost.localdomain>

Don Park writes:

> I guess what I am proposing is a creation of a new URI scheme for XML
> Namespace that is more friendly to namespace-ignorant applications. The URI
> production rules are defined as:
>
> uri ::= scheme : path [ ? search ]
>
> We can use 'xmlns' as the URI scheme name, outlaw the optional search
> segment, and then use Java package naming scheme for the path.
> [...]

That seems like a very elaborate workaround to the very simple (and temporary) problem that the interfaces are lagging a little behind the specs.

Name munging for SAX and the DOM works fine, as long as the application knows to expect it (i.e. the application should specifically request it; munged names should not come down by surprise). As a matter of fact, I am pretty much convinced to keep name munging for SAX 1.1, but with an explicit mechanism for enabling or disabling namespace processing (as we've been discussing here). DOM, being more elegant, will probably develop a Name interface for 2.0.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
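As an illustration of "an explicit mechanism for enabling or disabling namespace processing", the sketch below uses the feature flag in Python's standard xml.sax module. That module implements the later SAX2 design, so this is an assumption about how the idea can look in practice, not the SAX 1.1 interface under discussion here; NameCollector, element_names, and the urn:example:p namespace are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Records element names exactly as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called when namespace processing is off: `name` is the raw tag name.
        self.names.append(name)

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on: `name` is (uri, localname).
        self.names.append(name)


def element_names(xml_bytes, namespaces):
    parser = xml.sax.make_parser()
    # The application asks for namespace processing explicitly;
    # expanded names never arrive by surprise.
    parser.setFeature(feature_namespaces, namespaces)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names


DOC = b'<p:a xmlns:p="urn:example:p"><b/></p:a>'

plain = element_names(DOC, namespaces=False)
expanded = element_names(DOC, namespaces=True)
```

With the feature off the handler sees prefixed names as written in the document; with it on, it sees (URI, local part) pairs and un-namespaced elements carry None as the URI.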
From simonstl at simonstl.com Thu Feb 11 13:52:06 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:09:02 2004
Subject: Roll-Your-Own Parsers (was: Re: What Clean Specs Achieve)
In-Reply-To: <004b01be557c$15d2e140$c9a8a8c0@thing2>
Message-ID: <199902111350.IAA02839@hesketh.net>

At 12:05 AM 2/11/99 -0500, Bill la Forge wrote:
>Anyway, the point here is that a reasonable approach to dynamic configuration
>should give you a very small footprint when only selected features are used.
>This in contrast to a does-everything monolithic parser. And as the specs
>mature, it seems likely that the larger companies represented at the W3C
>have no reason at all to keep the specs small and lightweight--cumbersome
>specs nicely eliminates much of the competition!
>
>So it seems the choice is clear... expect your parsers to grow till they have more
>features than MS Word, or take a configurable approach which allows the developer
>to select the capabilities necessary for the job at hand.

It looks like someone else may be stepping in with possible disruption for the roll-your-own process: Sun. From the Infoworld cover story (http://www.infoworld.com/cgi-bin/displayStory.pl?99028.ehsunxml.htm):

>"We had to let the dust settle and determine what makes sense for XML so
>we have a clear path," said Nancy Lee, product manager for XML at Java
>Software, a Sun division in Mountain View, Calif.
>
>The company's next step is to provide a standard Java API to support XML,
>Lee said. The API will go through the Java Community Process, a multivendor
>standards process for Java technologies.
Not to pick on Sun, but it's still pretty unclear how open their process is really going to be. Do we really need another 'multivendor standards process' to tell us all what to do with our information? How about building on what we've already got, and keeping it at least as open (very for SAX, not as open for DOM) as it's been so far?

With any luck, we'd be able to keep the bloat to a minimum, at least.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From rick at activated.com Thu Feb 11 14:03:37 1999
From: rick at activated.com (Rick Ross)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
References: <00f001be5559$c00c8990$2ee044c6@arcot-main> <14018.56053.926976.427251@localhost.localdomain>
Message-ID: <36C2E310.76634182@activated.com>

David, this is minimizing the real business problems created by the lag. A lag of 6 months in the Internet can be a "make or break" difference to businesses - how many "Internet years" is that now?

Rick

David Megginson wrote:
>
> That seems like a very elaborate workaround to the very simple (and
> temporary) problem that the interfaces are lagging a little behind the
> specs.
>
From david at megginson.com Thu Feb 11 14:06:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:02 2004
Subject: Schema Muddle (was Quite Standards Questions)
In-Reply-To: <001e01be55b6$a7005be0$c5010180@p197>
References: <001e01be55b6$a7005be0$c5010180@p197>
Message-ID: <14018.57594.368035.666592@localhost.localdomain>

Richard Anderson writes:

> In terms of standards, am I right or wrong in thinking that DCD
> replaced XML_Data ?

It depends on who you talk to at Microsoft. When I was on stage at a major conference last Fall, I asked an MS person (whom I will not name) about the issue in front of the whole room and received public confirmation that XML-Data was dead, long live DCD!

The problem is that MS hasn't really gotten around to writing any software for DCD (and probably won't), while some of their older stuff still provides spotty support for random bits of XML-Data, so you will sometimes hear MS people pitching XML-Data (and sometimes not). I suspect that there are different units within MS that don't always get around to communicating with each other.

> Also, is DCD being replaced by something else ? I remember reading
> or hearing something but I can't remember what.

Yeah, DCD is just one input to the W3C's XML Schema WG (to which I do not belong). As an outsider, I'd guess that it is very unlikely that either DCD or XML-Data (or SOX, or DDML née XSchema née something else that I can't remember) will come out unmodified as the official W3C schema format, but the ultimate solution will probably include bits from all of them.
Right now, use whatever you feel like, but don't let yourself be duped by any vendors (big or small) into thinking that their pet format will have a shelf life of more than a year or so. In other words, feel free to experiment, but don't build a $10M system around anything right now.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From david at megginson.com Thu Feb 11 14:10:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
In-Reply-To: <36C2CD8D.6905EDBC@activated.com>
References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20CD1.9A206D5D@infinet.com> <14018.3364.167089.666427@localhost.localdomain> <36C2187E.455D0329@infinet.com> <14018.49192.388408.608309@localhost.localdomain> <36C2CD8D.6905EDBC@activated.com>
Message-ID: <14018.58253.278953.162091@localhost.localdomain>

Rick Ross writes:

> Tim Bray noted that perhaps the best solution here is to urge that
> support for namespaces in XSL be deferred until the next phase - so
> that initial XSL applications will not face the complexity and
> performance hit imposed by existing specs that do not deliver
> enough of what each other needs yet.

Did Tim say this? I'll have to let him clarify whether he agrees with it or not, but as I recall, someone else made the suggestion, and Tim simply said that it was a worthwhile point to make and should be brought to the XSL WG.
All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From keshlam at us.ibm.com Thu Feb 11 15:23:12 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <85256715.005454D1.00@D51MTA03.pok.ibm.com>

DOM Level 2 is discussing namespace support now. If you have opinions on how it ought to operate, you might want to send them to the public DOM mailing list (which hasn't gotten a lot of traffic since Level 1 came out).

The problem with living on the bleeding edge is that we're all playing catch-up with each other, and there's too much happening for everyone to stay perfectly in synch...

(Personally, I'm of the community that's underimpressed with namespaces as they wound up being defined; I think they fall short of what was needed. But they're what we've got, and we have to either live with them or ignore them.)

______________________________________
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.
From tbray at textuality.com Thu Feb 11 16:06:07 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:02 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <3.0.32.19990211080015.00bb18e0@pop.intergate.bc.ca>

At 07:55 AM 2/11/99 -0500, Tyler Baker wrote:
>Proof of an original application which chooses to use "Namespaces in XML" would help
>invalidate my doubts as to "Namespaces in XML" having any use outside of the W3C specs.

Office 2000. Only last time I looked, it wasn't remotely XML-conformant, sigh. -Tim

From tbray at textuality.com Thu Feb 11 16:07:36 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:03 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <3.0.32.19990211080514.00baf680@pop.intergate.bc.ca>

At 09:06 AM 2/11/99 -0500, David Megginson wrote:
>Rick Ross writes:
>
> > Tim Bray noted that perhaps the best solution here is to urge that
> > support for namespaces in XSL be deferred until the next phase
>
>Did Tim say this? I'll have to let him clarify whether he agrees with
>it or not

I said it was a consistent course of action. I don't agree with it.
Rick's problem is that he wants to implement XSL (not stable yet) with DOM 1.0, which doesn't have everything you need to do it. My contention is that since XSL 1.0 and DOM level 2 will arrive at more or less the same time, it's reasonable for XSL to depend on things in DOM level 2. Rick, it seems, would like XSL to throw out everything that can't be supported with XML 1.0 and DOM level 1, because he wants a conformant application of an unfinished spec and he wants it now. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Thu Feb 11 16:19:31 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:09:03 2004 Subject: The Peace Process: DOM and namespaces... Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136B2E@eukbant101.ericsson.se> [removed XSL list from distribution list - it didn't seem relevant] > -----Original Message----- > From: Tyler Baker [SMTP:tyler@infinet.com] > > Proof of an original application which chooses to use "Namespaces in XML" > would help > invalidate my doubts as to "Namespaces in XML" having any use outside of > the W3C specs. > Someone I know is working on being able to combine XML and HTML to be able to edit fragments of XML in a web browser, sort of like this: Value It uses the namespace processing of expat. Matt. xml-dev: A list for W3C XML Developers. 
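The inline example in Matthew's message did not survive archiving, so the following is only a minimal sketch of what "the namespace processing of expat" looks like through Python's standard `xml.parsers.expat` binding. The mixed HTML/XML document, the `rec` prefix and the `urn:example:record` URI are made up for illustration; they are not from the original post.

```python
import xml.parsers.expat


def qualified_names(xml_text):
    """Return the element names expat reports with namespace processing on."""
    names = []
    # Passing namespace_separator switches on expat's namespace processing:
    # element names then arrive as "<namespace URI><separator><local name>".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


# Hypothetical mixed document: HTML markup wrapping an XML fragment.
doc = (
    '<html:div xmlns:html="http://www.w3.org/TR/REC-html40" '
    'xmlns:rec="urn:example:record">'
    "<rec:field>Value</rec:field>"
    "</html:div>"
)

for name in qualified_names(doc):
    print(name)
```

The application never sees the prefixes, only the URI plus local name, which is what lets an editor tell the HTML elements apart from the embedded XML ones.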
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rick at activated.com Thu Feb 11 16:21:15 1999 From: rick at activated.com (Rick Ross) Date: Mon Jun 7 17:09:03 2004 Subject: The Peace Process: DOM and namespaces... References: <3.0.32.19990211080514.00baf680@pop.intergate.bc.ca> Message-ID: <36C302F7.2AD04DF@activated.com> In truth, I do not want to engage contention or be argumentative - I simply want a solution that doesn't involve postponing the development of significant business by as much as a half-year or more. One thing that happens all too often in listserv discussions is that people "take positions" - I am not doing this - I am looking for a solution to a real, present-tense business problem. If everyone who wanted to use C++ had been told to wait for a solution until the standard was finalized, then that would have led to years and years of delay. There are real, valid arguments for making "tuning" decisions along the way to make important business benefits possible. I would hope that the working groups, the entities that sponsor them, and the community of developers who desire to leverage this technology can see the powerful value of small compromise now to make productive business possible today. "Wait until it is final" is very, very difficult to swallow - especially since there could be lots of pitfalls and speedbumps in the road ahead. A simple change makes more, valuable business possible now. Again, I am not looking to confront, but rather to plead that this is important. The "deferral" approach favors the big companies who can always play the long-term game, they have the resources to do that. 
Innovation, however, may often come from the newcomer or the little guy, and we all benefit from innovation becoming real sooner...

Regards,

Rick

Tim Bray wrote:
>
> At 09:06 AM 2/11/99 -0500, David Megginson wrote:
> >Rick Ross writes:
> >
> > > Tim Bray noted that perhaps the best solution here is to urge that
> > > support for namespaces in XSL be deferred until the next phase
> >
> >Did Tim say this? I'll have to let him clarify whether he agrees with
> >it or not
>
> I said it was a consistent course of action. I don't agree with it.
> Rick's problem is that he wants to implement XSL (not stable yet)
> with DOM 1.0, which doesn't have everything you need to do it. My
> contention is that since XSL 1.0 and DOM level 2 will arrive at more
> or less the same time, it's reasonable for XSL to depend on things
> in DOM level 2. Rick, it seems, would like XSL to throw out
> everything that can't be supported with XML 1.0 and DOM level 1,
> because he wants a conformant application of an unfinished spec
> and he wants it now. -Tim

From Michael.Kay at icl.com Thu Feb 11 16:33:02 1999
From: Michael.Kay at icl.com (Michael.Kay@icl.com)
Date: Mon Jun 7 17:09:03 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <93CB64052F94D211BC5D0010A80013310EB2E6@wwmessd3.bra01.icl.co.uk>

Don Park:
> We can use 'xmlns' as the URI scheme name, outlaw the optional search
> segment, and then use Java package naming scheme for the
> path. Here is an example:
>
>
>

Great suggestion.
One of the biggest sources of confusion about namespaces is that many of the examples use URI's beginning with "http://", which leads people (and too-clever-by-half software) to think that the URI is referring to a resource accessible via the HTTP protocol. Mike Kay xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Thu Feb 11 16:45:59 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:03 2004 Subject: The Peace Process: DOM and namespaces... In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB2E6@wwmessd3.bra01.icl. co.uk> Message-ID: <199902111645.LAA06150@hesketh.net> At 03:45 PM 2/11/99 +0000, Michael.Kay@icl.com wrote: >Don Park: >> We can use 'xmlns' as the URI scheme name, outlaw the optional search >> segment, and then use Java package naming scheme for the >> path. Here is an example: >> >> >> >Great suggestion. One of the biggest sources of confusion about namespaces >is that many of the examples use URI's beginning with "http://", which leads >people (and too-clever-by-half software) to think that the URI is referring >to a resource accessible via the HTTP protocol. Agreed - except that some spec writers (_if_ I remember right, I think including RDF) want that URL to actually point to something that _can_ be used by the processor. The namespaces spec itself avoids doing this, but other specs can take that approach on their own recognizance. (I don't believe it's prohibited.) Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
From wunder at infoseek.com Thu Feb 11 17:39:16 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:09:03 2004
Subject: Fw: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?)
In-Reply-To: <027e01be55b8$78db1640$5402a8c0@oren.capella.co.il>
Message-ID: <3.0.5.32.19990211093001.00b57360@corp>

At 02:17 PM 2/11/99 +0200, Oren Ben-Kiki wrote:
>David Megginson wrote:
>>
>>1. SAX and DOM are complementary
>
>
>IMVHO SAX should be defined not as a "parser interface" but as a "DOM tree
>visitor interface".

We use a fair amount of XML inside Infoseek, and were just having this DOM vs. SAX discussion on Monday. There are applications that really are interested in the document, and the DOM interface is a tremendous help for those. For some other applications, the DOM is a total waste of time -- they need to turn the contents of the document into application data (maybe objects, maybe not), and creating DOM objects for everything is an unnecessary step that slows things down and bloats code.

An example of the latter is the XML text extractor in the Ultraseek Server search engine. It needs to convert the incoming XML document to fieldname/textbuffer pairs so they can be further analyzed and inserted into the search index. The expat handlers are about 80 lines of Python. Works great.

Other applications use XML in an RPC-like manner. Those parsers need to behave like an RPC marshalling parser, oriented towards translating into user structures/objects, not RPC- or XML-centered objects.

We are using both SAX and DOM interfaces here. And C++ and Java and Python.
But always editing the code with Emacs. wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From keshlam at us.ibm.com Thu Feb 11 17:55:18 1999 From: keshlam at us.ibm.com (keshlam@us.ibm.com) Date: Mon Jun 7 17:09:03 2004 Subject: XML Parser with DOM in C Message-ID: <85256715.006246F3.00@D51MTA03.pok.ibm.com> >It is a little contradictory that >DOM uses IDL, and hence is useful for networked access to tree objects, The DOM uses IDL strictly as a language-independent way of describing the spec. That should not be taken as a promise that all, or even most DOM API implementations will be network-accessible. ______________________________________ Joe Kesselman / IBM Research Unless stated otherwise, all opinions are solely those of the author. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Thu Feb 11 18:11:31 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:09:03 2004 Subject: Fw: DOM vs. SAX??? Nah. (was RE: Storing Lots of FiddlyBits (was Re: What is XML for?) 
References: <3.0.5.32.19990211093001.00b57360@corp> Message-ID: <36C31C56.D77E3F66@manhattanproject.com> Walter Underwood wrote: > But always editing the code with Emacs. I'd switch to THE, an XEDIT clone. It's far superior. *evil grin* :) Clark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Thu Feb 11 18:15:40 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:03 2004 Subject: Fw: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?) References: <027e01be55b8$78db1640$5402a8c0@oren.capella.co.il> Message-ID: <36C31E2E.341762A@locke.ccil.org> Oren Ben-Kiki wrote: > [T]he > DOM specs should provide a standard way to apply a SAX visitor to a DOM > (sub)tree. A parser would be just a special case of an application which has > an internal "virtual DOM tree" and doesn't provide random access to it. See http://www.ccil.org/~cowan/XML/DOMParser.java for code that does just this: it walks a org.w3c.dom.Document and fires SAX events based on its contents. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
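The DOMParser.java code John points to is not reproduced in this archive. As a rough illustration of the same idea (walking a DOM tree and firing SAX events from its contents), here is a minimal sketch using Python's standard `xml.dom.minidom` and `xml.sax` modules. The names `fire_sax_events` and `EventLog` are made up for this sketch, and it handles only elements, attributes and text; it is not a port of John's code.

```python
from xml.dom import minidom, Node
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def fire_sax_events(node, handler):
    """Walk a DOM (sub)tree depth-first and replay it as SAX events."""
    if node.nodeType == Node.DOCUMENT_NODE:
        handler.startDocument()
        for child in node.childNodes:
            fire_sax_events(child, handler)
        handler.endDocument()
    elif node.nodeType == Node.ELEMENT_NODE:
        attrs = AttributesImpl(dict(node.attributes.items()))
        handler.startElement(node.tagName, attrs)
        for child in node.childNodes:
            fire_sax_events(child, handler)
        handler.endElement(node.tagName)
    elif node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        handler.characters(node.data)
    # Comments, processing instructions, etc. are skipped in this sketch.


class EventLog(ContentHandler):
    """Hypothetical SAX handler that just records what it is told."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))


log = EventLog()
fire_sax_events(minidom.parseString('<a x="1">hi<b/></a>'), log)
for event in log.events:
    print(event)
```

Because the walker accepts any node, it can be pointed at a subtree as well as a whole document, which is the "SAX visitor over a DOM (sub)tree" that Oren asked for.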
From clark.evans at manhattanproject.com Thu Feb 11 18:30:13 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:09:03 2004
Subject: Fw: DOM vs. SAX??? Nah. (was RE: Storing Lots of FiddlyBits (was Re: What is XML for?)
Message-ID: <36C320CB.67415DD1@manhattanproject.com>

Walter Underwood wrote:
> At 02:17 PM 2/11/99 +0200, Oren Ben-Kiki wrote:
> >David Megginson wrote:
> >>1. SAX and DOM are complementary
> >IMVHO SAX should be defined not as a "parser interface" but as a "DOM tree
> >visitor interface".
>
> We use a fair amount of XML inside Infoseek, and were just having
> this DOM vs. SAX discussion on Monday. There are applications that
> really are interested in the document, and the DOM interface is a
> tremendous help for those. For some other applications, the DOM is
> a total waste of time -- they need to turn the contents of the
> document into application data (maybe objects, maybe not), and
> creating DOM objects for everything is an unnecessary step that slows
> things down and bloats code.
>
> An example of the latter is the XML text extractor in the Ultraseek
> Server search engine. It needs to convert the incoming XML document
> to fieldname/textbuffer pairs so they can be further analyzed and
> inserted into the search index. The expat handlers are about 80 lines
> of Python. Works great.
>
> Other applications use XML in an RPC-like manner. Those parsers
> need to behave like an RPC marshalling parser, oriented towards
> translating into user structures/objects, not RPC- or XML-centered
> objects.
>

Neet-O. I heard this quote

> SAX is just a way to populate DOM, it's at lower level.

a while back and it gave me convulsions. Glad to hear the real-world experience. *smile*

Clark Evans

P.S. Older posts on the subject that may be of use.... I spend too much time e-mailing.

-------- Original Message --------
Subject: Re: Forking the DOM (was Re: Storing Lots of Fiddly Bits)
Date: Wed, 03 Feb 1999 17:29:15 +0000
From: Clark Evans
To: XML Developers' List
References: <000e01be4f36$c7b29370$d3228018@jabr.ne.mediaone.net> <199902031447.JAA20034@hesketh.net>

"Simon St.Laurent" wrote:
>
> Given the fairly strong comments excerpted below (and Paul's not the only
> one muttering like this), is it time to contemplate a very different API?

I think that (at least) one other (very different) API already exists, it's called SAX.

Perhaps the contemplating to do is in pattern study. Identifying the various ways to view information and creating a taxonomy (or pattern language) of *existing* tools and methods which can best handle those views may be the ticket. I feel that having a generative grammar of tools and techniques is better than one big sledge hammer. Although, I must admit, the sledge hammer is much more impressive and seductive.

For large amounts of "simple" information with complex interaction I have found through experience that modeling the Relations (and implementing with an RDBMS) works wonderfully. Also, for large complicated inter-related processing units, I have found modeling with Objects to be very useful (and implementing in an OOPL).

However, every time I have tried modeling time-oriented streams in a Relational database or with an Object language, the result has been UGLY. It just doesn't work well, I call it a "mis-fit".

Once I saw XML, it clicked. And it clicked hard. You use XML to implement stream-oriented messaging systems. XML is very loosely typed, and with Architectural forms it is actually the reverse of object-orientation!
Explanation: A given data stream has multiple classifications depending upon an observer's context. In object-orientation, an object is a single class. Where objects are encapsulated so that a single behavior is tied to a given data structure, a stream is exposed so that multiple behaviors can be activated from a single event. By trying to shoe-horn Messages into objects, you must assign a single class and a single behavior, and thus you lose the very thing which makes XML powerful.

Of course, you can use object-orientation to model your message-processing software, just as you can use message-orientation to model your object-processing software. SAX is a good example of the former, while Scenarios is a good example of the latter. I don't doubt the power of recursive application of the pattern.

Here is a related post I made to the news server last week. My thoughts have changed some, but the primary argument stands: treating something as an Object and as a Stream are very different perspectives, and the DOM and SAX interfaces reflect this reality.

> Subject: DOM and SAX: complementary aspects of XML
> Date: Thu, 28 Jan 1999 19:50:22 +0000
> From: Clark Evans
> Newsgroups: comp.text.xml, comp.text.sgml
>
>
> I saw some debating a while back, stating "Which
> should I use DOM or SAX". Then a few people were
> stating that the W3C had standardized on DOM and
> not on SAX. This puzzled me. Anyway, I spent
> a small amount of time reviewing each and it kinda
> struck me what was going on. So, I figured I'd take
> a crack at the explanation...
>
> DOM and SAX are "complementary" ways to look at an XML document.
>
> SAX - XML AS STREAM, A PUSH MODEL
> ~~~
>
> SAX views the document as a stream, sending events
> as the document passes through its view:
>
> SOURCE ==XML==> DESTINATION (Bit Bucket?)
>           ^
>           |
>          SAX
>           |
>           E (Event Notification)
>           |
> Your      |
> App  <--<-+
> Prog
>
> In this diagram, I picture an information
> stream where XML documents move from the
> SOURCE to the DESTINATION. I picture SAX
> as an "Observer" (see design pattern book),
> picking off the events of interest and
> passing along notifications to your application
> program.
>
> Big Advantage: You don't have to store the stream.
> Big DisAdvantage: You can't go back in time.
>

I really missed the big point of Architectural Forms here, where the Stream can be Observed by more than one Object, each with a different context. Thus the Stream has multiple Classifications depending upon the Object which is doing the Observing.

> DOM - XML AS OBJECT, A PULL MODEL
> ~~~
>
> +-------+
> |  XML  |
> | STORE |   <==>    DOM
> +-------+           | |
>                     | |
> Your >-(Request)-->-+ |
> App                   |
> Prog <-(Response)-<---+
>
> In this diagram, I picture a storage
> facility, be it memory, disk, database,
> etc., which holds the XML document for
> random access. I picture DOM as a Broker?
> answering requests from your application
> program about the structure and content
> of the XML document.
>
> Big Advantage: You have random access to the
> document object.
> Big DisAdvantage: You must provide storage for the
> document object.
>

As the complement, here the Stream is treated as an Object, so that a single Classification mechanism is applied. For XML, I feel that this is much less useful than SAX. If you want to go this far with objects, perhaps it's better to "translate" the Stream (using SAX) into an Object Framework that better reflects your problem domain. By trying to avoid the "impedance" mismatch, you undermine the relative strengths of both object-oriented systems and message-oriented middleware, and end up with a crippled compromise. I guess if the information in the XML document has a single Observer, then DOM will work well, but then the question becomes, why XML? Just use object serialization.
If you absolutely must use XML for buzz-word compliance, have your serialization library use XML. Then you don't need DOM at all.

> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> Conclusion:
>
> I would not say that one is better than the other.
> A true XML programmer should learn both. For
> sequential access, SAX would be preferred since
> it requires less memory. For random access, you
> should use DOM. Any "complete" XML parsing tool
> should support both attitudes of reasoning to
> best adapt to the user's needs.
>
> It is tempting to pick one and only use that one,
> however, additional coding to store previous
> state will be the penalty for those who always
> use SAX, and extra memory/cycles will be the
> penalty for those who always use DOM.
>
> As for the W3C, if both are not equally treated
> as standards, I respectfully submit that they
> should both be approved since they are both
> complementary aspects of the same process.
>
> A unified standard, DOMSAX should be the result.
> Hope this helps!

Here I was not suggesting that they are "unified" into "one" language, but rather "interwoven" so that a programmer can switch back and forth depending upon his context.

>
> Question: For SGML, do NSGMLS and GROVE
> share the same complementary pattern?
>
> Best wishes,
>
> Clark Evans

Here is another followup post.

> -------- Original Message --------
> Subject: Re: DOM and SAX: complementary aspects of XML
> Date: Thu, 28 Jan 1999 21:18:55 +0000
> From: Clark Evans
> Organization: Posted via RemarQ, http://www.remarQ.com - Discussions start here!
> CC: "Joseph Kesselman (yclept Keshlam)" ,dan@capecod.net
> Newsgroups: comp.text.xml,comp.text.sgml
> References: <36B0BF7E.4E0E09F@manhattanproject.com> <36B0C97E.87535497@alum.mit.edu>
>
> "Joseph Kesselman (yclept Keshlam)" wrote:
> >
> > you _can_ use both DOM and SAX as part of the
> > processing stream for a single document.
>
> Glad to hear that some people doing XML don't
> think of it as an exclusive choice. This is
> great. Would you comment on the following?
>
> Would it be safe to put XML processing on a Spectrum? Say..
>
>
> STREAM <--------------?---------------> OBJECT
>
> Where a particular use of XML would fit somewhere in-between?
>
> "Mostly" Stream Examples:
> ~~~~~~~~~~~~~~~~~~~~~~~~~
> * Production information from a piece of equipment on
> the plant floor, say the torque measurement on a drill.
> * The NASDAQ ticker tape during the middle of the day.
> * A video camera feed from a live site.
> * etc.
>
> In these cases SAX might be the better choice, since it
> is the _event_ that is of interest ... waiting until the
> entire stream is processed before acting would be a bad
> idea (in the case of NASDAQ, perhaps financially devastating?)
>
> "Mostly" Object Examples:
> ~~~~~~~~~~~~~~~~~~~~~~~~~
> * An order placed on a web site.
> * The number of holes drilled at the end of the day
> by the plant floor production machine.
> * The NASDAQ closing prices.
> * etc.
>
> In these cases, DOM might be the better choice, since it
> is the entire object that is of interest. In these
> cases it's not the parts of the stream that matter, but
> the entire stream taken as a whole. Having the buyer
> without what he/she purchased doesn't do any good.
>
> In Between Examples:
> ~~~~~~~~~~~~~~~~~~~
> * A drawing.
> * NASDAQ 15 min snapshot
>
> In these cases I was thinking it would be "ideal" to
> have an integrated DOM/SAX tool, where the programmer
> could have the incoming stream spark SAX events, but
> would result in a queryable DOM object. For instance,
> the drawing could be drawn as the stream is being
> read, but any editing would have to wait until
> the whole drawing has been received. For the NASDAQ
> example, having the current ticker would be nice,
> but having a "snapshot" in time accessible via
> DOM might be very useful as well.
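The "in between" case described in the post above (stream events by default, a DOM object only on request) can be sketched with Python's standard `xml.dom.pulldom` module, which postdates this thread and is offered only as one illustration of the idea. The `snapshots` helper and the ticker document are hypothetical.

```python
from xml.dom import pulldom


def snapshots(xml_text, tag):
    """Stream through xml_text; yield a small DOM subtree for each <tag>."""
    events = pulldom.parseString(xml_text)
    for event, node in events:
        if event == pulldom.START_ELEMENT and node.tagName == tag:
            # Only now is a DOM subtree built, and only for this element;
            # everything else in the stream is discarded as it goes by.
            events.expandNode(node)
            yield node


# Hypothetical ticker feed.
ticker = (
    "<ticker>"
    "<quote sym='A'>1.5</quote>"
    "<note>ignored</note>"
    "<quote sym='B'>2.0</quote>"
    "</ticker>"
)

for quote in snapshots(ticker, "quote"):
    print(quote.getAttribute("sym"), quote.firstChild.data)
```

If the caller never asks for an expanded node, no tree is kept at all, which matches the "if you didn't ask for the DOM object, the stream would be discarded" behaviour the post goes on to wish for.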
>
> Question: Are there any XML parsing tools that take
> both of these approaches by allowing _both_ DOM
> and SAX? If you didn't register for any events,
> you would not be using the SAX part, where, if you
> didn't ask for the "DOM" object, the stream
> would be discarded after events are fired. This
> type of "unified" tool would be great.
>
> Your thoughts?
>
> Clark Evans
>

Clark Evans wrote:
>
>
>
> Perhaps the role of namespaces is fundamentally
> different in the "stream processing" paradigm
> than it is in "object processing" paradigm?
>
> Could this be the issue underlying the current
> debate? I don't know enough on the topic to
> say. However, I feel I can help by explaining
> my observations about the differences between
> the paradigms.
>
> 1. A tenet of object oriented programming
> is encapsulation, data hiding. For stream
> processing it is the opposite, data exposure.
>
> 2. Objects are modified or undergo state change
> by invoking methods. Where streams are re-written
> or translated by transformations.
>
> 3. Ideally, an object retains its identity.
> The entire goal of a stream is to merge its
> information with each and every observer; this
> is equivalent to identity loss.
>
> 4. An object has a 1-1 correspondence between its
> data and its code. A stream has a 1-M correspondence
> between its data and its code. Where the document is
> the data, and the code is the observer's
> transformation system.
>
> 5. Objects are finite, they have a boundary.
> Streams may be effectively infinite. For
> example, a pressure transducer sending water
> level measurements may operate continuously
> for years! Thus, you can store an entire
> object in memory, you may not want to store
> an entire stream in memory.
>
> 6. An object's interface describes a block of
> functionality provided. A stream's interface
> describes the information that it carries.
>
> 7. An object has one type or class which is
> assigned to the data, where a stream can
> be classified differently by each and every
> observer. This is especially clear if
> you read about Architectures.
>
> etc.
>
> Anyway, I'm not saying that one is better
> than the other, just that they are different
> and subtly interwoven. For instance, Scenarios
> is the study of object interactions as
> a stream of events. And SAX is a wonderful
> event-driven stream observer object.
>
> I feel that the key to the success of XML
> is to recognize that it is part of a different
> paradigm --XML complements existing technology.
> As such, it is important to scrutinize the
> application of object-oriented idioms to the
> new paradigm.
>
>
>
> Hope this helps,
>
> Clark Evans

From david at megginson.com Thu Feb 11 18:59:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:03 2004
Subject: DOM vs. SAX??? Nah.
In-Reply-To: <36C320CB.67415DD1@manhattanproject.com>
References: <36C320CB.67415DD1@manhattanproject.com>
Message-ID: <14019.10050.858323.327397@localhost.localdomain>

Clark Evans writes:
> Neet-O. I heard this quote
>
> > SAX is just a way to populate DOM, it's at lower level.
>
> a while back and it gave me convulsions. Glad to hear the
> real-world experience. *smile*

Or, to turn it on its head, DOM is just a way to cache SAX events in memory. Both statements are, of course, equally silly.
As part of my SAX documentation from a year ago, I wrote a short piece on this topic: http://www.megginson.com/SAX/event.html All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Thu Feb 11 21:53:12 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:09:03 2004 Subject: The Peace Process: DOM and namespaces... References: <006101be55be$5e549020$c9a8a8c0@thing2> Message-ID: <36C350F3.63C3538D@infinet.com> Bill la Forge wrote: > One of the problems with DOM/namespace is the difficulty of manipulating > the tree while tracking namespace scoping. A partially lazy approach seems to work: > > 1. Do NOT make any namespace transformations while building the tree. > Instead, propagate the xmlns attributes to all children. (Use the inheritance filter.) > > 2. Manipulate the tree as desired. > > 3. Layer additional methods onto the DOM (you can either subclass things or > use static methods) for accessing the unique name based on the xmlns attributes > attached to the Element being accessed. > > 4. When converting from the DOM to SAX events or when creating a document from the > DOM, drop xmlns attributes which are assigned the same value in the parent element. The issue here is using the standard DOM interfaces to do the job. If you subclass the DOM and use this new type you defined to manage namespaces, then you might as well not use the DOM at all because this is a proprietary feature. Tyler xml-dev: A list for W3C XML Developers. 
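Step 3 of Bill's list (layering static helper methods over the DOM instead of subclassing it) can be sketched as below, using only ordinary DOM calls from Python's `xml.dom.minidom`. This is a variation for illustration, not Bill's code: rather than propagating `xmlns` attributes down to every child as in his step 1, it walks up the ancestors at lookup time. The function names and the sample document are hypothetical.

```python
from xml.dom import minidom, Node


def namespace_for(element, prefix):
    """Find the URI bound to prefix by searching element and its ancestors."""
    attr = "xmlns:" + prefix if prefix else "xmlns"
    node = element
    while node is not None and node.nodeType == Node.ELEMENT_NODE:
        if node.hasAttribute(attr):
            return node.getAttribute(attr)
        node = node.parentNode
    return None  # prefix (or default namespace) is not declared in scope


def unique_name(element):
    """Return (namespace URI or None, local name) for an element."""
    prefix, _, local = element.tagName.rpartition(":")
    return namespace_for(element, prefix), local


# Hypothetical document with one prefixed and one default namespace.
doc = minidom.parseString(
    '<a:root xmlns:a="urn:example:a" xmlns="urn:example:default">'
    "<child><a:leaf/></child>"
    "</a:root>"
)
child = doc.documentElement.firstChild
leaf = child.firstChild
print(unique_name(child))
print(unique_name(leaf))
```

Since the helpers are plain functions over `tagName`, `hasAttribute`, `getAttribute` and `parentNode`, no new node type is introduced, which sidesteps Tyler's objection to subclassing; the cost is an ancestor walk per lookup unless the results are cached.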
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Thu Feb 11 22:13:43 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:03 2004 Subject: The Peace Process: DOM and namespaces... In-Reply-To: <36C2D232.487283E@infinet.com>; from Tyler Baker on Thu, Feb 11, 1999 at 07:50:58AM -0500 References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20BEC.FEDBA22D@activated.com> <14018.3916.726579.503762@localhost.localdomain> <36C21A27.69968FDC@infinet.com> <14018.49624.990887.76240@localhost.localdomain> <36C2D232.487283E@infinet.com> Message-ID: <19990212091323.A29051@io.mds.rmit.edu.au> On Thu, Feb 11, 1999 at 07:50:58AM -0500, Tyler Baker wrote: > David Megginson wrote: > > > Tyler Baker writes: > > > > > > Preprocess your information, whatever its source. > > > > > > From an entire database. Pass over the entire document tree and > > > prepreprocess everything before actually presenting it to the > > > application. This is not practical. In my limited experience on > > > these matters I have seen this tried before and with horrendous > > > results. Nevertheless, it does not take a computer scientist to > > > see the real world problem with this approach. > > > > That would be silly -- lazy evaluation works fine for this kind of > > thing. I hate to sound stupid, but I still fail to see how Namespaces > > causes any problems at all for someone dynamically generating a > > document from a database -- if you want to use names with a URI part, > > use them; if not, don't. 
> > Well you have the choice of doing an entire pass over the document and > building a map directly or else try and lazily evaluate things and then > cache the results of the namespace processing. Then for each time you > find a node, you need to look it up in a hashtable somehow. Hashtables > are cheap, but not that cheap if you need to do a table lookup for every > node you process. So iterating over the entire source tree and building > an indexed table may in some circumstances be more efficient than the lazy > approach. > > But you are still faced with the problem of illegal XML Names. If you > write the DOM Document out to an XML file or stream you will be emitting > illegal XML Names unless you have some URI -> prefix hack to get things > back to legal XML. As in expat:

Hello world!

becomes: Hello world! Not sure why the second xmlns:ns0 is there. Maybe it just makes life easier on the user (no need to traverse the parents). Also not sure why the last close tag is . I guess that's why my copy of expat came from the _test_ directory on James's FTP site. :-) Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Thu Feb 11 22:45:48 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:09:04 2004 Subject: The Peace Process: DOM and namespaces... References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20BEC.FEDBA22D@activated.com> <14018.3916.726579.503762@localhost.localdomain> <36C21A27.69968FDC@infinet.com> <14018.49624.990887.76240@localhost.localdomain> <36C2D232.487283E@infinet.com> <19990212091323.A29051@io.mds.rmit.edu.au> Message-ID: <36C35D42.ACF2BF6C@infinet.com> Marcelo Cantos wrote: > On Thu, Feb 11, 1999 at 07:50:58AM -0500, Tyler Baker wrote: > > David Megginson wrote: > > > > > Tyler Baker writes: > > > > > > > > Preprocess your information, whatever its source. > > > > > > > > From an entire database. Pass over the entire document tree and > > > > prepreprocess everything before actually presenting it to the > > > > application. This is not practical. In my limited experience on > > > > these matters I have seen this tried before and with horrendous > > > > results. Nevertheless, it does not take a computer scientist to > > > > see the real world problem with this approach. 
> > > > > > That would be silly -- lazy evaluation works fine for this kind of > > > thing. I hate to sound stupid, but I still fail to see how Namespaces > > > causes any problems at all for someone dynamically generating a > > > document from a database -- if you want to use names with a URI part, > > > use them; if not, don't. > > > > Well you have the choice of doing an entire pass over the document and > > building a map directly or else try and lazily evaluate things and then > > cache the results of the namespace processing. Then for each time you > > find a node, you need to look it up in a hashtable somehow. Hashtables > > are cheap, but not that cheap if you need to do a table lookup for every > > node you process. So iterating over the entire source tree and building > > an indexed table may in some circumstances be more efficient than the lazy > > approach. > > > > But you are still faced with the problem of illegal XML Names. If you > > write the DOM Document out to an XML file or stream you will be emitting > > illegal XML Names unless you have some URI -> prefix hack to get things > > back to legal XML. > > As in expat: > >

Hello world!

> > becomes: > > xmlns:ns0="www.simdb.com">Hello world! > > Not sure why the second xmlns:ns0 is there. Maybe it just makes life > easier on the user (no need to traverse the parents). > > Also not sure why the last close tag is . I guess that's why > my copy of expat came from the _test_ directory on James's FTP site. > :-) Well, then you no longer have the original document structure you had before but these arcane prefixes in your document which make things completely unreadable from a user's perspective. I might as well just use Java Object Serialization only for serializing an object tree as it would be faster and be no less understandable to the end-user than all of this automatic prefix creation. Tyler
From b.laforge at jxml.com Thu Feb 11 22:54:31 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:04 2004 Subject: The Peace Process: DOM and namespaces... Message-ID: <001601be5610$d0efe240$c9a8a8c0@thing2> From: Tyler Baker >The issue here is using the standard DOM interfaces to do the job. If you subclass the DOM >and use this new type you defined to manage namespaces, then you might as well not use the DOM >at all because this is a proprietary feature. I am afraid I don't understand where proprietary feature comes in, nor the need for subclassing. Let's say I have an off-the-shelf parser (AElfred?) and an off-the-shelf DOM (Docuverse?). I'm writing a program that needs to use the DOM to process documents that use namespaces. So I wrap John Cowan's inheritance filter around the parser and feed the filter to the DOM.
Then I write a static method for fetching a qualified name from an element... public static String getQualified(Element); My application then uses this 5-line method to access the qualified names when it needs them. I then manipulate the tree as needed. When done, I walk the tree, generating SAX events, feed them through an uninherit filter, and compose a document from the result. Yes, my application is proprietary. It should be! But the interfaces are conformant. And the components are all conformant. And it was pretty easy to use all this conformant software to put together an application which pushes a document which uses namespaces through the DOM. Isn't this the real strength of standards? Being able to get off-the-shelf software from multiple vendors, integrate them into an application, and do something real??? I really don't understand your comment. Bill
From b.laforge at jxml.com Thu Feb 11 22:59:37 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:04 2004 Subject: The Peace Process: DOM and namespaces... Message-ID: <001d01be5611$89390200$c9a8a8c0@thing2> From: Tyler Baker >... I >might as well just use Java Object Serialization only for serializing an object tree as it >would be faster and be no less understandable to the end-user than all of this automatic >prefix creation. I really can't take this statement seriously. I'm sorry, but it's really a bit much.
From tyler at infinet.com Thu Feb 11 23:26:05 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:09:04 2004 Subject: The Peace Process: DOM and namespaces... References: <001601be5610$d0efe240$c9a8a8c0@thing2> Message-ID: <36C366B1.5BB5E4D8@infinet.com> Bill la Forge wrote: > From: Tyler Baker > >The issue here is using the standard DOM interfaces to do the job. If you subclass the DOM > >and use this new type you defined to manage namespaces, then you might as well not use the DOM > >at all because this is a proprietary feature. > > I am afraid I don't understand where proprietary feature comes in, nor the need for subclassing. > > Let's say I have an off-the-shelf parser (AElfred?) and an off-the-shelf DOM (Docuverse?). > > I'm writing a program that needs to use the DOM to process documents that use namespaces. > So I wrap John Cowan's inheritance filter around the parser and feed the filter to the DOM. > Then I write a static method for fetching a qualified name from an element... > > public static String getQualified(Element); This is not the situation I am talking about. I am talking about what happens if you mutate the DOM object: you need, in effect, to do all of this stuff all over again. This also only works from reading static HTML files. The core situation I am talking about is if you do things like: Element element = document.createElement("foo:bar"); How do you resolve foo to a namespace unless you have some managed context for doing so? > My application then uses this 5-line method to access the qualified names when it needs them. > I then manipulate the tree as needed.
When done, I walk the tree, generating SAX events, > feed them through an uninherit filter, and compose a document from the result. Walking an entire tree just to do this will cause problems for situations where performance is critical. Whether you are using C or Java, this is an expensive operation. > Yes, my application is proprietary. It should be! > > But the interfaces are conformant. And the components are all conformant. And it was > pretty easy to use all this conformant software to put together an application which > pushes a document which uses namespaces through the DOM. Yah, but you sacrifice real-world usability by throwing performance out the window. > Isn't this the real strength of standards? Being able to get off-the-shelf software from > multiple vendors, integrate them into an application, and do something real??? Yah standards are nice if they are practical in the real-world. "Namespaces in XML" causes so many headaches with other internet standards that I deem it not practical to use. Tyler xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Fri Feb 12 01:07:48 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:09:04 2004 Subject: The Peace Process: DOM and namespaces... Message-ID: <00f101be5623$eb19e720$2ee044c6@arcot-main> Would this problem be solved if: 1. DOM elements and attributes are statically bound to namespaces at creation. 2. Namespace support in the DOM Level 2 spec is provided soon and remains unchanged until recommendation. Whether above two conditions can be met is another question. 
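As an illustration of condition 1, here is a minimal Java sketch of an element being bound to its namespace at creation time. It assumes a DOM implementation that provides the namespace-aware factory methods createElementNS and setAttributeNS, which the DOM Level 2 Core specification went on to define; the class name StaticBindingSketch and the URI http://example.org/foo are made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StaticBindingSketch {
    /** Creates an empty DOM document from a namespace-aware factory. */
    static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        // The namespace URI is supplied when the element is created, so no
        // surrounding xmlns context is needed later to resolve the "foo" prefix.
        // (http://example.org/foo is a hypothetical namespace name.)
        Element bar = doc.createElementNS("http://example.org/foo", "foo:bar");
        bar.setAttributeNS("http://example.org/foo", "foo:kind", "demo");
        doc.appendChild(bar);

        System.out.println(bar.getNamespaceURI());
        System.out.println(bar.getPrefix());
        System.out.println(bar.getLocalName());
    }
}
```

With this style of creation, the element carries its namespace wherever it is moved in the tree, which is the property the createElement("foo:bar") case lacks.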
Regarding the proposal to create a new URI scheme for XML namespaces, this is just an engineering practice and is not subject to approval by W3C because the Namespace spec does not place any constraint on the choice of URI scheme used to declare a namespace. Frankly, I don't see the advantage of using the HTTP URI scheme over XNS (XML NameSpace). Again, here is an example of an XNS URI scheme: expands to: Comments? Don Park Docuverse xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Fri Feb 12 01:45:46 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:04 2004 Subject: The Peace Process: DOM and namespaces... In-Reply-To: <36C35D42.ACF2BF6C@infinet.com>; from Tyler Baker on Thu, Feb 11, 1999 at 05:44:18PM -0500 References: <36C2011B.D188E44B@activated.com> <14018.952.766576.431128@localhost.localdomain> <36C20BEC.FEDBA22D@activated.com> <14018.3916.726579.503762@localhost.localdomain> <36C21A27.69968FDC@infinet.com> <14018.49624.990887.76240@localhost.localdomain> <36C2D232.487283E@infinet.com> <19990212091323.A29051@io.mds.rmit.edu.au> <36C35D42.ACF2BF6C@infinet.com> Message-ID: <19990212124521.A7893@io.mds.rmit.edu.au> On Thu, Feb 11, 1999 at 05:44:18PM -0500, Tyler Baker wrote: > Marcelo Cantos wrote: > > > > As in expat: > > > >

Hello world!

> > > > becomes: > > > > > xmlns:ns0="www.simdb.com">Hello world! > > > > Not sure why the second xmlns:ns0 is there. Maybe it just makes > > life easier on the user (no need to traverse the parents). > > > > Also not sure why the last close tag is . I guess that's > > why my copy of expat came from the _test_ directory on James's FTP > > site. :-) > > Well, then you no longer have the original document structure you > had before but these arcane prefixes in your document which make > things completely unreadable from a user's perspective. I might as > well just use Java Object Serialization only for serializing an > object tree as it would be faster and be no less understandable to > the end-user than all of this automatic prefix creation. I guess my use of the term "user" made my post a little ambiguous. What I meant was "client of the parser", i.e. the person implementing a high-level API to the document. The real end-user (the person clicking buttons on a browser, or integrator writing code in a scripting language) never sees the internal representation of the document. As far as I am concerned, the user never sees anything but the first version above. The second is accessed by tools that give programmatic access to the data, such as a DOM interface. As an example, the document above could be accessed like this: DOMDocument d("

Hello world!

"); DOMNode root := d.root(); String text := root.children("p", "www.simdb.com"); The user is completely oblivious to the background convolutions of using "arcane prefixes". This, of course, makes life hard on the implementor, but as a user, I couldn't care less, as long as my life is made easy. (The astute reader may observe that the transformed document above throws away the concept of a current namespace. I think this is bad, since I would like to do this: String text := root.children("p"); and have the root's namespace implied, just as it is in the document. This is more than just a pedagogical grumble. It has significant ramifications for code reuse. I want to write a function that pulls apart a legal statute and presents it as an HTML document. But I want that function to also work on a fragment that has been inserted into a larger document and wrapped up in a namespace.) Cheers, Marcelo -- http://www.simdb.com/~marcelo/
From jjc at jclark.com Fri Feb 12 02:47:46 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:09:04 2004 Subject: The Peace Process: DOM and namespaces... References: <36C2011B.D188E44B@activated.com> <36C26100.8C18CB20@jclark.com> <36C2C650.2CE0F690@activated.com> Message-ID: <36C39100.BE29F846@jclark.com> Yes. Namespace processing doesn't need any information that isn't in the DOM tree.
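To make that concrete, here is a rough Java sketch of a namespace lookup layered on a plain, non-namespace-aware DOM tree. It uses only DOM Level 1 calls (getTagName, getAttributeNode, getParentNode) and walks up the ancestors looking for the xmlns declaration in scope. The class name NamespaceLayerSketch, the helper names, and the "{uri}local" output notation are assumptions made for illustration, not part of any DOM specification.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NamespaceLayerSketch {
    /** Finds the URI bound to a prefix ("" for the default namespace), or null if none is in scope. */
    static String lookupUri(Element start, String prefix) {
        String declaration = prefix.isEmpty() ? "xmlns" : "xmlns:" + prefix;
        // Walk from the element up through its ancestors; the nearest declaration wins.
        for (Node n = start; n instanceof Element; n = n.getParentNode()) {
            Attr attr = ((Element) n).getAttributeNode(declaration);
            if (attr != null) return attr.getValue();
        }
        return null;
    }

    /** Returns "{uri}local" for an element in a namespace, or just the local part otherwise. */
    static String qualifiedName(Element e) {
        String name = e.getTagName();
        int colon = name.indexOf(':');
        String prefix = colon < 0 ? "" : name.substring(0, colon);
        String local = name.substring(colon + 1);
        String uri = lookupUri(e, prefix);
        return (uri == null || uri.isEmpty()) ? local : "{" + uri + "}" + local;
    }

    /** Parses without namespace processing, so xmlns attributes stay ordinary attributes. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(false);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<doc xmlns='urn:example:a' xmlns:b='urn:example:b'><p><b:q/></p></doc>");
        Element p = (Element) doc.getDocumentElement().getFirstChild();
        Element q = (Element) p.getFirstChild();
        System.out.println(qualifiedName(p));
        System.out.println(qualifiedName(q));
    }
}
```

Because every lookup re-reads the tree, the result stays correct after the tree is mutated; the price is an ancestor walk per lookup, which a caching layer could reduce.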
The layering does have a cost, but you can probably make the layer pay for itself by also using it to handle things like: - document order comparison (handy for some select patterns) - expanding entities - hiding the difference between CDATA and other text - merging text nodes - ignoring white-space only text nodes where required by XSL - allowing navigation from an attribute to its owner element Is it acceptable for your XSL processor to mutate the DOM for the source document? If so, there are alternatives that might be worth considering. Rick Ross wrote: > > One of the benefits we are trying to deliver to customers, however, is the > ability to use any parser that offers a standard DOM and SAX > implementations. Could this approach be implemented solely within our > processor-side logic, and rely only on standard implementations of DOM Level > 1 that exist in most of today's XML parsers? > > Rick > > James Clark wrote: > > > > Rick Ross wrote: > > > > > the XSL working draft > > > specification requires namespace support that apparently cannot be > > > implemented effectively if the primary input source is a dynamically built > > > DOM tree. > > > > I can't see this. > > > > Why can't you put a layer on top of the DOM that provides namespace > > processing? For example, you could have an NSNode object that points to > > the DOM Node and a set of prefix bindings (and probably a parent > > NSNode). The NSNode objects will be temporary. You wouldn't have to > > reparse the document, and you don't have to keep two trees in memory. > > You can also provide other things in this layer that help XSL > > performance such as document order comparison. > > > > James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Feb 12 03:19:40 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:09:04 2004 Subject: The Peace Process: DOM and namespaces... References: <00f101be5623$eb19e720$2ee044c6@arcot-main> Message-ID: <36C39D73.49C8CF75@infinet.com> Don Park wrote: > Would this problem be solved if: > > 1. DOM elements and attributes are statically bound to namespaces at > creation. > 2. Namespace support in the DOM Level 2 spec is provided soon and remains > unchanged until recommendation. Maybe, except I would rather have a DOM Level 3 where you put a lot of what I would consider "bloat" on the DOM into this category and have Level 2 just do some simple things which DOM Level 1 lacks in terms of being a nice dumb document model for node iteration. You could add namespaces in here as well. In particular, I would suggest keeping Chapter 3 of the working draft in Level 2 and moving the rest to a Level 2+ or a Level 3. Supporting the DOM is not so practical if it requires monumental programming resources to support it. That is not to say people will not need the features in Chapter 1, Chapter 2, and Chapter 4, but I think that they are not nearly as necessary for most apps as the stuff in Chapter 3 dealing with iterators and filters. > Whether above two conditions can be met is another question. That would require the W3C to be willing to be flexible to the needs of the development community at large and not just its members.
> Regarding the proposal to create a new URI scheme for XML namespaces, this > is just an engineering practice and is not subject to approval by W3C > because the Namespace spec does not place any constraint on the choice of > URI scheme used to declare a namespace. Frankly, I don't see the advantage > of using the HTTP URI scheme over XNS (XML NameSpace). > > Again, here is an example of an XNS URI scheme: > > > > expands to: > > > > Comments? If I am understanding your idea correctly, this is the exact convention you need to use for maintaining unique names without having to deal with complicated issues like namespace defaulting, using attributes for namespace declarations, etc. etc. etc. This would in effect be a formality for maintaining unique names just as there is a formality for generating GUIDs. Yah, you can never enforce that people will actually use this formality, but I think this idea that you will be working with documents and schemas that you have no prior knowledge of is a little far-fetched. Tyler
From db at Eng.Sun.COM Fri Feb 12 05:42:59 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:04 2004 Subject: Roll-Your-Own Parsers (was: Re: What Clean Specs Achieve) References: <199902111350.IAA02839@hesketh.net> Message-ID: <36C3BF0E.E7A9BF2E@Eng.Sun.COM> Simon St.Laurent wrote: > > It looks like someone else may be stepping in with possible disruption for > the roll-your-own process: Sun.
From the Infoworld cover story > (http://www.infoworld.com/cgi-bin/displayStory.pl?99028.ehsunxml.htm): > > >"We had to let the dust settle and determine what makes sense for XML so > >we have a clear path," said Nancy Lee, product manager for XML at Java > >Software, a Sun division in Mountain View, Calif. > > > >The company's next step is to provide a standard Java API to support XML, > >Lee said. The API will go through the Java Community Process, a multivendor > >standards process for Java technologies. > > Not to pick on Sun, but it's still pretty unclear how open their process is > really going to be. To the extent I have much say, quite open ... but the story about how open it's _supposed_ to be is on-line: http://developer.java.sun.com/developer/jcp/index.html Keep an eye out there ... say, next week. And I'll extend a personal invite to anyone on XML-DEV to keep an eye on this particular process, and a request to help make this a success for everyone!! Drop me a line (personally -- I have little time to keep up with XML-DEV)) if you have issues with how it works out. > Do we really need another 'multivendor standards > process' to tell us all what to do our information? How about building on > what we've already got, and keeping it at least as open (very for SAX, not > as open for DOM) as it's been so far? What did you think that "clear path" was? ;-) > With any luck, we'd be able to keep the bloat to a minimum, at least. That's a key goal for me, and most folk at Sun who've looked at the issues. There are some holes to fill in in the SAX and DOM areas, but a "core" API must be small and easily built on top of ... be it MDSAX or SAX II or whatever. - Dave xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Fri Feb 12 07:05:54 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:04 2004 Subject: Slowness of JDK 1.1.x String.intern() [was Re: SAX, Java, and Namespaces ] References: <3.0.32.19990205101910.00bdf210@pop.intergate.bc.ca> Message-ID: <36C3D14B.71008189@Eng.Sun.COM> Tim Bray wrote: > > At 10:12 AM 2/5/99 -0800, Jeff Greif wrote: > >JDK 1.1.7 intern is native, but is slow because it first converts the > >characters in the string [to a canonical form] No comment ... that's not my code ... ;-) > Actually, the real reason that most XML parsers will *never* use > built-in intern is because they probably have the name available in a > character array, and can go look things up in the handcrafted > table without String-i-fying it - thus skipping several steps > of work that a built-in intern is going to have to do. E.g. Lark's > symbol table is a double array, storing both the character-array > and String version of each name - you lookup based on the > character array and return the string if it's already there. The > point is that you call new String() only once per unique name. This gives "per-parse" uniqueness, which is valuable to a fair degree beyond the performance win of avoiding allocating a new string. However, Sun's package currently goes one step further and actually interns that string. It's such a small cost (on top of the cost to check that array-to-string cache in the first place) that it's barely measurable. (Anyone try "java -Xrunhprof:cpu=samples ..." on JDK 1.2/SPARC?) 
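For readers who want to see the shape of the technique, here is a minimal Java sketch of a name table that is looked up directly from the parser's character buffer and that interns each name the first time it is seen. It is not Lark's or Sun's actual code; the class name NameTable, the fixed bucket count, and the stringsCreated counter are assumptions made for illustration.

```java
class NameTable {
    /** One cached name: its characters plus the String built from them. */
    private static final class Entry {
        final char[] chars;
        final String name;
        final Entry next;

        Entry(char[] chars, String name, Entry next) {
            this.chars = chars;
            this.name = name;
            this.next = next;
        }
    }

    private final Entry[] buckets = new Entry[256];
    private int stringsCreated = 0;

    /** Returns the cached String for buf[offset .. offset+length), creating it only on first sight. */
    String lookup(char[] buf, int offset, int length) {
        int hash = 0;
        for (int i = 0; i < length; i++) {
            hash = 31 * hash + buf[offset + i];
        }
        int index = (hash & 0x7fffffff) % buckets.length;
        for (Entry e = buckets[index]; e != null; e = e.next) {
            if (sameChars(e.chars, buf, offset, length)) {
                return e.name; // hit: no String allocation at all
            }
        }
        // Miss: build the String once and intern it, so equal names are the
        // same object across every table in the VM, not just within this parse.
        String name = new String(buf, offset, length).intern();
        stringsCreated++;
        buckets[index] = new Entry(name.toCharArray(), name, buckets[index]);
        return name;
    }

    int stringsCreated() {
        return stringsCreated;
    }

    private static boolean sameChars(char[] cached, char[] buf, int offset, int length) {
        if (cached.length != length) return false;
        for (int i = 0; i < length; i++) {
            if (cached[i] != buf[offset + i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        NameTable table = new NameTable();
        char[] buf = "<para><para>".toCharArray();
        String first = table.lookup(buf, 1, 4);
        String second = table.lookup(buf, 7, 4);
        System.out.println(first == second);
        System.out.println(table.stringsCreated());
    }
}
```

The intern() call runs only on a cache miss, so its cost is paid once per distinct name rather than once per occurrence.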
That provides "per-VM" uniqueness which has turned out to be handy for things like stylesheet processing -- comparing strings in the stylesheet and source document is quite fast, and that does add up to a performance difference in template matching. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Fri Feb 12 08:49:19 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:09:04 2004 Subject: Slowness of JDK 1.1.x String.intern() [was Re: SAX, Java, and Namespaces ] References: <3.0.32.19990205101910.00bdf210@pop.intergate.bc.ca> <36C3D14B.71008189@Eng.Sun.COM> Message-ID: <36C3E9E9.173CBB15@infinet.com> David Brownell wrote: > Tim Bray wrote: > > > > At 10:12 AM 2/5/99 -0800, Jeff Greif wrote: > > >JDK 1.1.7 intern is native, but is slow because it first converts the > > >characters in the string [to a canonical form] > > No comment ... that's not my code ... ;-) > > > Actually, the real reason that most XML parsers will *never* use > > built-in intern is because they probably have the name available in a > > character array, and can go look things up in the handcrafted > > table without String-i-fying it - thus skipping several steps > > of work that a built-in intern is going to have to do. E.g. Lark's > > symbol table is a double array, storing both the character-array > > and String version of each name - you lookup based on the > > character array and return the string if it's already there. The > > point is that you call new String() only once per unique name. 
> > This gives "per-parse" uniqueness, which is valuable to a fair > degree beyond the performance win of avoiding allocating a new > string. > > However, Sun's package currently goes one step further and actually > interns that string. It's such a small cost (on top of the cost > to check that array-to-string cache in the first place) that it's > barely measurable. (Anyone try "java -Xrunhprof:cpu=samples ..." on > JDK 1.2/SPARC?) This is what I do in an XML parser as well. The costs would only be relatively high if you had only one instance of an element type for each element in the document. This in the real world will never happen, as you will instead have lots of repeated element and attribute Names which can be cached and interned the first time. > That provides "per-VM" uniqueness which has turned out to be handy > for things like stylesheet processing -- comparing strings in the > stylesheet and source document is quite fast, and that does add > up to a performance difference in template matching. This is very true. Some DOM implementations such as Docuverse's also do this for the DOM tree. You have a relatively low performance cost for interning Names in a document, but you could possibly get huge benefits when doing node iteration. As of JDK 1.1.7 the String.equals() method is now something of the form:

    public boolean equals(Object o) {
        if (o == this) return true;               // same object, e.g. two interned Names
        if (!(o instanceof String)) return false;
        String s = (String) o;
        if (s.length() != length()) return false;
        // Do character matching
    }

Actually, I think just about all DOM implementations in Java that I am aware of intern Names so a call to Node.getNodeName() will always return an interned string. It would be nice for applications if SAX stated that all Names are presented to the DocumentHandler interface as interned strings as Names are nothing more than symbols anyways and should be treated as such, with of course the exception of the weirdness of namespace declaration names appearing as attribute names (e.g. "xmlns:" + some prefix name). Tyler
From oren at capella.co.il Fri Feb 12 09:42:12 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:09:04 2004 Subject: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?) Message-ID: <030b01be566b$2dc16110$5402a8c0@oren.capella.co.il> John Cowan wrote: >Oren Ben-Kiki wrote: > >> [T]he >> DOM specs should provide a standard way to apply a SAX visitor to a DOM >> (sub)tree. A parser would be just a special case of an application which has >> an internal "virtual DOM tree" and doesn't provide random access to it. > >See http://www.ccil.org/~cowan/XML/DOMParser.java for code that does >just this: it walks a org.w3c.dom.Document and fires SAX events >based on its contents. I know it is available - just as there are tools to build a DOM tree from SAX events. I'd just like to see SAX being given an official standing _within_ the DOM specs as "the" DOM visitor interface. Ideally, SAX would be part of the DOM specs. Share & Enjoy, Oren Ben-Kiki xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eisen at pobox.com Fri Feb 12 10:02:35 1999 From: eisen at pobox.com (Jonathan Eisenzopf) Date: Mon Jun 7 17:09:05 2004 Subject: trying to understand the roll of XML References: Message-ID: <36C3FC2F.7997C900@pobox.com> Matt Sergeant wrote: > Markus Weltin wrote: > > > Is it common to produce XML docs from database queries, > > and then convert them to HTML pages? > > Be careful with this. There's no reason to query a database, output XML > and convert to HTML all on the server. Why not? That's an example of using XML as a glue between a database and HTML. > > > One warning I will give to users here is: Don't convert XML to HTML on > the server. Unless performance really isn't important. Or you can > guarantee that most of your browsers will be 5.0 browsers and you only > need to do the conversion for a few users. That's a lesson I learned the > hard way this week. Unfortunately I think a lot of people will make this > mistake since there are becoming more and more tools on the market to do > just this (e.g. IBM's Java servlet XSL converter). > Whoa, careful there. I think you must be talking about dynamically generating HTML from XML via a CGI-like script. This has merit, but there are other scenarios where converting XML to HTML on the server side is beneficial. For instance, there's no reason why you couldn't pre-generate HTML from XML. This would save processing time on the client and server. Also, there are instances where processing XML dynamically is easier and more efficient.
As an example, I recently worked on a project that collects news headlines from internetnews.com and puts them into a compact summary which web sites then use on their site. Since news is generally time sensitive, the summary must be generated regularly. Previously, each client site was querying internetnews.com every 5 minutes or so. Once the page was retrieved (around 30k usually), the script ripped the headlines out of the HTML and generated a news summary which users then included on their homepage. There were several problems with this:

1. at around 30,000 registered sites, they were concerned that resources were being diverted by the potentially 30,000 clients who were querying the page every few minutes.
2. they were embedding non-HTML tags in HTML
3. the embedded tags did not contain enough information to categorize the news

The answer was to use XML as a glue between the news Web server and the clients who want to display news headlines on their sites. Instead of each client retrieving the news page every few minutes, the headlines are now gathered, categorized, and XMLified by a separate server. Now the clients download the headlines in a more compact and useful XML format from a different server. The benefits are:

1. significantly reduced bandwidth utilization
2. use of resources is diverted away from www.internetnews.com
3. the XML format is easy to understand. It would be easy to convert to another format or to write a separate client.

We can pre-generate or dynamically generate the news summaries in HTML, DHTML, and Javascript. XML has been very helpful in making this transition easier and will also allow for a higher level of extensibility in the future.
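As a rough illustration of the "XML as glue" step, here is a minimal sketch of a client turning a headline feed into an HTML list. The feed layout (item, title and url elements) and the class name HeadlineRenderer are assumptions made for the example; the actual news.xml format is not described in this message.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical client-side renderer for an assumed headline feed layout. */
final class HeadlineRenderer {

    /** Converts <item><title/><url/></item> entries into an HTML list. */
    static String toHtml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList items = doc.getElementsByTagName("item");
            StringBuilder html = new StringBuilder("<ul>");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                html.append("<li><a href=\"").append(escape(text(item, "url"))).append("\">")
                    .append(escape(text(item, "title"))).append("</a></li>");
            }
            return html.append("</ul>").toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed headline feed", e);
        }
    }

    /** Text of the first descendant with the given tag name, or "" if absent. */
    private static String text(Element parent, String tag) {
        NodeList nl = parent.getElementsByTagName(tag);
        return nl.getLength() == 0 ? "" : nl.item(0).getTextContent();
    }

    /** Minimal escaping for text and attribute values in the generated HTML. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

The same method could run once on the server to pre-generate a page or per request; the conversion itself does not care which.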
You can see the results at: http://www.webreference.com (left side)

There are also some examples at: http://www.webreference.com/headlines/nh/examples

You can register to use the news harvester at: http://www.webreference.com/headlines/nh/

There are multiple news categories, here's a few URLs to the resulting XML which is generated every 5 minutes:

http://headlines.internet.com/internetnews/top-news/news.xml
http://headlines.internet.com/internetnews/bus-news/news.xml

Jonathan.

From cowan at locke.ccil.org Fri Feb 12 14:52:19 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:05 2004
Subject: The Peace Process: DOM and namespaces...
References: <001601be5610$d0efe240$c9a8a8c0@thing2>
Message-ID: <36C43FED.6853112@locke.ccil.org>

Bill la Forge wrote:
> I'm writing a program that needs to use the DOM to process documents that use namespaces.
> So I wrap John Cowan's inheritance filter

Namespace filter, that is. The inheritance filter (as Bill knows but some may not) processes inheritable attributes like xml:lang and xml:space.

> around the parser and feed the filter to the DOM.

That assumes that the DOM implementation blindly incorporates anything the SAX source feeds it in the way of element names, and doesn't do further checking. (A safe assumption, I bet.)

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
From david at megginson.com Fri Feb 12 15:00:18 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:05 2004
Subject: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?)
In-Reply-To: <030b01be566b$2dc16110$5402a8c0@oren.capella.co.il>
References: <030b01be566b$2dc16110$5402a8c0@oren.capella.co.il>
Message-ID: <14020.16691.128766.665782@localhost.localdomain>

Oren Ben-Kiki writes:

> I know it is available - just as there are tools to build a DOM
> tree from SAX events. I'd just like to see SAX being given an
> official standing _within_ the DOM specs as "the" DOM visitor
> interface. Ideally, SAX would be part of the DOM specs.

I'd be surprised to see that happen: not only are the two a little out of sync on some of the fluff stuff (like CDATA section boundaries), but all SAX callbacks would need an additional argument supplying a pointer to the DOM node being visited. It hardly makes sense to have

  startElement(String name, org.xml.sax.AttributeList atts)

when you could simply have

  startElement(org.w3c.dom.Element element)

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
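To make the "SAX visitor over a DOM tree" idea from this thread concrete, here is a minimal sketch of a walker that traverses an org.w3c.dom tree and fires SAX callbacks. It is not John Cowan's DOMParser.java; the class name DomSaxWalker and its trace helper are hypothetical, and it targets the SAX2 ContentHandler interface bundled with current JDKs rather than the SAX 1 signatures quoted above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: replays a DOM (sub)tree as SAX2 callbacks. */
final class DomSaxWalker {

    /** Visits the subtree rooted at node in document order. */
    static void walk(Node node, ContentHandler handler) throws SAXException {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                handler.startDocument();
                walkChildren(node, handler);
                handler.endDocument();
                break;
            }
            case Node.ELEMENT_NODE: {
                AttributesImpl atts = new AttributesImpl();
                NamedNodeMap map = node.getAttributes();
                for (int i = 0; i < map.getLength(); i++) {
                    Node a = map.item(i);
                    // No namespace processing in this sketch: only qNames are filled in.
                    atts.addAttribute("", "", a.getNodeName(), "CDATA", a.getNodeValue());
                }
                handler.startElement("", "", node.getNodeName(), atts);
                walkChildren(node, handler);
                handler.endElement("", "", node.getNodeName());
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE: {
                char[] text = node.getNodeValue().toCharArray();
                handler.characters(text, 0, text.length);
                break;
            }
            default:
                break; // comments, processing instructions etc. are skipped here
        }
    }

    private static void walkChildren(Node parent, ContentHandler handler) throws SAXException {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, handler);
        }
    }

    /** Parses xml into a DOM, walks it, and returns a trace of the events seen. */
    static String trace(String xml) {
        StringBuilder out = new StringBuilder();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            walk(doc, new DefaultHandler() {
                @Override public void startElement(String uri, String local, String qName, Attributes a) {
                    out.append('<').append(qName).append('>');
                }
                @Override public void endElement(String uri, String local, String qName) {
                    out.append("</").append(qName).append('>');
                }
                @Override public void characters(char[] ch, int start, int length) {
                    out.append(ch, start, length);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

Note that the handler here only ever sees names and attribute lists, which is the "restricted view" side of the argument; passing the Element itself would tie the handler to the DOM.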
From b.laforge at jxml.com Fri Feb 12 16:15:20 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:05 2004
Subject: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?)
Message-ID: <000c01be56a1$f3db1fe0$c9a8a8c0@thing2>

From: David Megginson
>... It hardly makes sense to have
>
>  startElement(String name, org.xml.sax.AttributeList atts)
>
>when you could simply have
>
>  startElement(org.w3c.dom.Element element)

The advantage of SAX is its independence from DOM, as that allows for the development of SAX components (filters) which can be used for

1. Preprocessing events before they are used to build the DOM;
2. DOM construction;
3. Output formatting from a DOM or without a DOM;
4. Simple document transformations.

What I'm more inclined to do in this SAX/DOM marriage, is to create an alternative to the SAX helper class, AttributeListImpl, which also implements

  public Attr setAttribute(Attr newAttr);
and
  public Attr removeAttribute(String name);
and
  public Attr getAttribute(String name);
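A minimal sketch of the kind of SAX filter component listed above, here doing a "simple document transformation" by renaming one element as events stream past. It assumes the SAX2 XMLFilterImpl helper class that ships with current JDKs, which postdates this thread; RenameFilter and the para/p names are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: renames one element type, passes everything else through. */
final class RenameFilter extends XMLFilterImpl {
    private final String from;
    private final String to;

    RenameFilter(XMLReader parent, String from, String to) {
        super(parent);
        this.from = from;
        this.to = to;
    }

    // Only the qName is rewritten; this sketch assumes a non-namespace-aware parse.
    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        super.startElement(uri, local, qName.equals(from) ? to : qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        super.endElement(uri, local, qName.equals(from) ? to : qName);
    }

    /** Runs parser -> filter -> handler and returns what the handler saw. */
    static String demo(String xml) {
        StringBuilder out = new StringBuilder();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            RenameFilter filter = new RenameFilter(parser, "para", "p");
            filter.setContentHandler(new DefaultHandler() {
                @Override public void startElement(String u, String l, String q, Attributes a) {
                    out.append('<').append(q).append('>');
                }
                @Override public void endElement(String u, String l, String q) {
                    out.append("</").append(q).append('>');
                }
                @Override public void characters(char[] ch, int start, int length) {
                    out.append(ch, start, length);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

Because the downstream handler only sees SAX events, it could just as well be a DOM builder or an output formatter, which is the independence being argued for.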
From James.Anderson at mecomnet.de Fri Feb 12 16:21:02 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:05 2004
Subject: Slowness of JDK 1.1.x String.intern() [was Re: SAX, Java, and Namespaces ]
References: <3.0.32.19990205101910.00bdf210@pop.intergate.bc.ca> <36C3D14B.71008189@Eng.Sun.COM>
Message-ID: <36C453CB.21D02422@mecomnet.de>

(NB: I'm not a javist, so please forgive a silly question...)

What are the prospects of a form of intern which partitions the interned string cache? If this is done, then the entire issue of namespaces disappears for applications: they never need to "look inside" the string. A namespace-aware parser parses to partitioned caches, a non-aware parser parses to a single one. The string "looks" ok from the respective worldview without further massaging, should one need to use it. If the application wishes to intern strings it simply observes the same discipline.

David Brownell wrote:
>
> ...
>
> This gives "per-parse" uniqueness, which is valuable to a fair
> degree beyond the performance win of avoiding allocating a new
> string.
>
> However, Sun's package currently goes one step further and actually
> interns that string. It's such a small cost (on top of the cost
> to check that array-to-string cache in the first place) that it's
> barely measurable. (Anyone try "java -Xrunhprof:cpu=samples ..." on
> JDK 1.2/SPARC?)
> > That provides "per-VM" uniqueness which has turned out to be handy > for things like stylesheet processing -- comparing strings in the > stylesheet and source document is quite fast, and that does add > up to a performance difference in template matching. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Fri Feb 12 16:22:49 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:05 2004 Subject: The Peace Process: DOM and namespaces... Message-ID: <001101be56a2$f6a5bc20$c9a8a8c0@thing2> From: John Cowan >> I'm writing a program that needs to use the DOM to process documents that use namespaces. >> So I wrap John Cowan's inheritance filter > >Namespace filter, that is. The inheritance filter (as Bill knows >but some may not) processes inheritable attributes like xml:lang >and xml:space. Sorry John, but I was actually thinking of a cross between the two, where all xmlns attributes were inherited. The advantages here are: 1. You can slip namespaces right past anything that isn't namespace aware; 2. You can manipulate the DOM without worries about scope; 3. You can strip the inherited values off on the output side, and have something reasonably readable and predictable, with all the original xmlns attributes preserved. (The names chosen may actually be meaningful to the author, mm?) When you create new elements in such a DOM, the APPLICATION does need to be aware of which namespace it is using, and supply the appropriate xmlns attribute. That doesn't seem to be too much to ask. Bill xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From clark.evans at manhattanproject.com Fri Feb 12 16:53:23 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:09:05 2004
Subject: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?)
References: <000c01be56a1$f3db1fe0$c9a8a8c0@thing2>
Message-ID: <36C45AE1.F6906A7E@manhattanproject.com>

Bill la Forge wrote:
>
> The advantage of SAX is its independence from DOM,
> as that allows for the development of SAX components
> (filters) which can be used for
> 1. Preprocessing events before they are used to build the DOM;
> 2. DOM construction;
> 3. Output formatting from a DOM or without a DOM;
> 4. Simple document transformations.
>
> What I'm more inclined to do in this SAX/DOM marriage,
> is to create an alternative to the SAX helper class, AttributeListImpl,
> which also implements
> public Attr setAttribute(Attr newAttr);
> and
> public Attr removeAttribute(String name);
> and
> public Attr getAttribute(String name);
>

This is a nice idea. Factor out the common, helper objects used by both the document-oriented DOM and the event-oriented SAX. This would be pretty. It would allow a marriage, but yet keep their independence (which is the key to any successful marriage). :)

Clark
From donpark at quake.net Fri Feb 12 17:31:00 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:05 2004
Subject: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?)
Message-ID: <004a01be56ad$1d422e40$2ee044c6@arcot-main>

>I know it is available - just as there are tools to build a DOM tree from
>SAX events. I'd just like to see SAX being given an official standing
>_within_ the DOM specs as "the" DOM visitor interface. Ideally, SAX would be
>part of the DOM specs.

Oren,

It's a nice idea but it conflicts with at least one DOM API requirement: JavaScript support. Predicate Functors can be contrived with JavaScript but not multifacet callback interfaces like SAX.

Don Park

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From db at Eng.Sun.COM Fri Feb 12 18:22:20 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun 7 17:09:05 2004
Subject: Slowness of JDK 1.1.x String.intern() [was Re: SAX, Java, and Namespaces ]
References: <3.0.32.19990205101910.00bdf210@pop.intergate.bc.ca> <36C3D14B.71008189@Eng.Sun.COM> <36C453CB.21D02422@mecomnet.de>
Message-ID: <36C46F22.C9B616FE@eng.sun.com>

james anderson wrote:
>
> (NB: I'm not a javist, so please forgive a silly question...)

Language is so mutable ...
"javist" indeed! ;-) > What are the prospects of a form of intern which partitions the interned > string cache? I'd say pretty low, since this is so readily done in application code. In fact that's pretty much what Tim described: just maintain a cache mapping char arrays to the corresponding strings. Each cache would be one partition. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Fri Feb 12 18:48:27 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:09:05 2004 Subject: RDF, Namespaces, and Versioning? Message-ID: <87256716.00671FF4.00@d53mta03h.boulder.ibm.com> >As far as Namespaces is concerned, the namespace URI is a black box -- >it doesn't point to anything, and it doesn't mean anything. The >creator, however, is free to add internal structure: > > http://www.megginson.com/ns/business/1999-01-29/ > http://www.megginson.com/ns/business/1999-02-09/ > >etc. > But that kind of raises issues regarding extensibility, right? How would you extend a previous version, such that the old one remained intact while the new one was a superset, without having to have them use two different prefixes, in a situation where they both might have to coexist? I haven't thought it out, so maybe its simple, but it seems an obvious issue for us to thrash on for a couple weeks :-) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From cowan at locke.ccil.org Fri Feb 12 19:00:54 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:05 2004
Subject: InheritanceFilter improved
Message-ID: <36C47A26.6E2B31D4@locke.ccil.org>

Two new methods:

void namespaceSupport(boolean) forces all namespace attributes (xmlns and xmlns:*) to be automatically inherited iff the argument is true.

boolean isInheritable(name) returns true if the argument is an inheritable attribute: either a namespace attribute iff namespace support is on, or an explicitly declared attribute, or xml:lang or xml:space.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From roddey at us.ibm.com Fri Feb 12 19:01:11 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:09:05 2004
Subject: Roll-Your-Own Parsers (was: Re: What Clean Specs Achieve)
Message-ID: <87256716.006846FA.00@d53mta03h.boulder.ibm.com>

>>There will always be a tradeoff between code size,
>>performance and conformance to the spec.
>>We have taken the same approach:
>>for XML which might go outside our environment or come in from outside, we
>>use a heavyweight parser with full validation. But where it's "behind the
>>covers" we use a homegrown (tiny, nonconformant) parser and just check the
>>structures a few times during design, with a validating parser.
>
>If we could work with parser layers rather than parsers, this might become
>a lot easier to manage. We could just turn on the parts we need and turn
>off the ones we don't.

We have taken that approach with our 'Version 2' parsers, Java and C++. They are pretty well layered and pluggable. Don't plug in a validation handler and you won't do any validation work. Don't plug in an entity handler, and you won't get any entity information, etc... Basically we've just extended the concept of a SAX-like handler all the way into the core of the parser. It allows both for extensibility by rolling your own handler, and for the client who is putting together a particular type of parser configuration to tell the lowest level of the parser "do the least work possible for this group of things, since I'm not even interested".

Though, relative to the original conversation, despite allowing for better scalability and optimization in the field according to need, it does increase the complexity of the parser itself in some ways.

BTW, the C++ version should hopefully hit Alphaworks before too much longer. We are on our 4th or 5th internal release and the next few weeks will be the 'making the last details work' part of the effort. I can't say when it will get out there, since I dunno about such things (I'm just the measly author :-), but it should be relatively soon. In terms of the external interfaces to the client code, it pretty much matches our version 2 Java parser architecture, though internally it is quite different.
From roddey at us.ibm.com Fri Feb 12 20:01:58 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:09:05 2004
Subject: Ms. Manners
Message-ID: <87256716.006DD80E.00@d53mta03h.boulder.ibm.com>

Geez. I went away for a month to write a parser, and I come back and what was a model of civil discussion has kind of deteriorated. Sorry if that has already been pointed out and I'm just beating a dead subsystem, but having popped out one day and popped back a month or so later it really is striking how much the atmosphere of this list has changed.

Are we heading for civil war in the XML world here? Hopefully we won't have to have any NATO air strikes or anything :-)

From clovett at microsoft.com Fri Feb 12 20:31:32 1999
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 17:09:05 2004
Subject: The Peace Process: DOM and namespaces...
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C087448A3@RED-MSG-56>

An interesting thread. First, the DOM committee is addressing this issue this week. IMHO the degree to which XML namespaces succeed will determine the breadth and depth of the success of XML in general, and not just XSL - so I eagerly await what the DOM committee comes up with.
We have a namespace implementation in our DOM which we are shipping in IE5, and I think IBM and Sun also have a solution. I didn't fully understand all the arguments presented here - but our experience is that although namespaces are not trivial to implement in an efficient manner, it is doable.

In the IE5 DOM implementation we expose namespaces via new properties on the node, "namespaceURI", "basename" and "prefix" and we added a new createNode method to DOMDocument that allows you to specify a namespaceURI for that node.

Processing namespaces during document load was not too hard to implement. The most difficult thing was cut & paste and making sure the saved document had all the right xmlns attributes. We can probably improve our implementation by minimizing the number of xmlns attributes we generate.

I did some experiments with hamlet.xml:

1) no namespaces
2) put default namespace on the root PLAY tag
3) put a prefix "p:" on all tags
4) added a nested prefix "q:" on all the tags in ACT 1.
5) gave all 5 acts a different prefix.

When looking at megabytes/second during load there was about a 1% delta between 1 and 2 and a 7% delta between 1 and 3, but 3, 4 and 5 were pretty much indistinguishable. So unless you have hundreds or thousands of different namespaces in one document (which I don't anticipate to be the norm), the performance is not too bad.

- Chris.
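For comparison, a minimal sketch of the same three pieces of information (namespace URI, local name, prefix) read through the W3C DOM Level 2 methods available via JAXP in current JDKs: getNamespaceURI, getLocalName, getPrefix and createElementNS. These are not the IE5 properties described above; the class name NamespaceProbe and the urn:example:play URI are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical probe showing namespace information on DOM nodes. */
final class NamespaceProbe {

    private static DocumentBuilder builder() throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true); // off by default in JAXP
        return f.newDocumentBuilder();
    }

    private static String describe(Element e) {
        return e.getNamespaceURI() + "|" + e.getLocalName() + "|" + e.getPrefix();
    }

    /** Reports "uri|localName|prefix" for the root element of a parsed document. */
    static String describeRoot(String xml) {
        try {
            Document doc = builder().parse(new InputSource(new StringReader(xml)));
            return describe(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Creates a new element directly in a namespace, without parsing anything. */
    static String createQualified() {
        try {
            Document doc = builder().newDocument();
            return describe(doc.createElementNS("urn:example:play", "q:ACT"));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A default-namespace element reports a null prefix, so experiments 2 and 3 above differ only in the prefix field, not in the URI or local name.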
From clark.evans at manhattanproject.com Fri Feb 12 20:34:47 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:09:05 2004
Subject: Advantages of XML and SGML
Message-ID: <36C48EF1.DD07B9F7@manhattanproject.com>

Would you all correct me where I'm wrong here? Thanks!

Clark

-------- Original Message --------
Subject: Re: Advantages of XML and SGML
Date: Fri, 12 Feb 1999 20:27:28 +0000
From: Clark Evans
To: Susan Barron
Newsgroups: comp.text.xml
References: <36C423D5.EB0B6C40@lmco.com>

Susan Barron wrote:
>
> We have been using SGML for several years and are closely watching the
> trend towards XML. Could someone please give me some examples of why
> you would use XML over SGML. I know that XML is a subset of SGML. I
> believe there must be some things that can be done in SGML that are not
> possible in XML. Conversely, there must be some things that XML does
> better than SGML. Thank you.

Since minimization is allowed in SGML, this creates situations where a document can have multiple syntactic interpretations. For instance:

Can have two syntactic interpretations:

OR

The DTD is required for the parser to figure out which one is the correct interpretation of the input. As such, an SGML document must have one_and_only_one DTD to resolve these syntactic ambiguities.

XML restricts the syntax by eliminating these minimizations. Thus, all documents have one and only one syntactic interpretation. This dramatically reduces the complexity of the parser. Thus, a parser can be simpler to implement, and a DTD is _not_ required for parsing.
This lets the DTD be used for a 100% semantic role, which is much more interesting for describing data!

This is great because it allows a document to conform to more than one DTD at the same time, _without_ requiring a "mother" DTD that merges all of the DTDs together. This is called "Architectures". It allows multiple meanings for the same document, depending upon the observer, without requiring all of the possible observers to get together and specify a "united" DTD.

However, this added flexibility comes at a price: the syntax becomes much more restrictive.

Therefore, for computer program <=> computer program communication, XML is the ideal structure to use, since it allows multiple subscribers to have their own interpretation of a data stream without changing the publishers.

For human => computer communication, SGML will probably remain the preferred structure. The minimization features are very valuable when a human is the author of the document.

Also, there is nothing saying you can't use both! If a human is going to write it by hand, perhaps SGML is better; then you can have JClark's SP use the DTD to resolve the ambiguities and produce the XML document that can be introduced into the corporate "xml bus".

Hope this helps!

Clark Evans
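The claim that no DTD is needed to recover the structure of an XML document can be illustrated with a minimal sketch: a non-validating SAX parse of a document with no DOCTYPE, recording the path of every element as it opens. The class name StructureWithoutDtd and the sample element names are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo: element structure recovered from syntax alone. */
final class StructureWithoutDtd {

    /** Returns the slash-separated path of each element, in document order. */
    static List<String> elementPaths(String xml) {
        List<String> paths = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // currently open elements, outermost first
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override public void startElement(String u, String l, String q, Attributes a) {
                            open.addLast(q);
                            paths.add(String.join("/", open));
                        }
                        @Override public void endElement(String u, String l, String q) {
                            open.removeLast();
                        }
                    });
        } catch (Exception e) {
            // An omitted end tag, legal under SGML minimization, is a fatal error in XML.
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        return paths;
    }
}
```

Input with an omitted end tag is rejected outright instead of being resolved against a DTD, which is the restriction the message describes as the price of the simpler parser.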
From simonstl at simonstl.com Fri Feb 12 22:17:17 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:09:05 2004
Subject: Roll-Your-Own Parsers
In-Reply-To: <87256716.006846FA.00@d53mta03h.boulder.ibm.com>
Message-ID: <199902122217.RAA03745@hesketh.net>

At 11:58 AM 2/12/99 -0700, roddey@us.ibm.com wrote:
>We have taken that approach with our 'Version 2' parsers, Java and C++.
>They are pretty well layered and pluggable. Don't plug in a validation
>handler and you won't do any validation work. Don't plug in an entity
>handler, and you won't get any entity information, etc... Basically we've
>just extended the concept of a SAX-like handler all the way into the core
>of the parser. It allows both for extensibility by rolling your own
>handler, and for the client who is putting together a particular type of
>parser configuration to tell the lowest level of the parser "do the least
>work possible for this group of things, since I'm not even interested".

This sounds (and looks) promising. I'm not clear exactly _how_ modular it is, though. Can I take info from the SAX parser, abuse (or nicely process) it, and feed it back into the DOM tree builder? Or am I stuck with choosing validating/non-validating and DOM/SAX? Is the 'SAX-like handler' really SAX with extras, or is it incompatible?

Reading the API is kind of weird. I'd like to know what this 'scanner' critter is doing too.

Fun stuff, though!

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com
From oren at capella.co.il Sat Feb 13 10:02:51 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:06 2004
Subject: Fw: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (was Re: What is XML for?)
Message-ID: <00a901be5737$4ed9b1a0$5402a8c0@oren.capella.co.il>

Don Park wrote:
>I wrote:
>>I know it is available - just as there are tools to build a DOM tree from
>>SAX events. I'd just like to see SAX being given an official standing
>>_within_ the DOM specs as "the" DOM visitor interface. Ideally, SAX would be
>>part of the DOM specs.
>
>Oren,
>
>It's a nice idea but it conflicts with at least one DOM API requirement:
>JavaScript support. Predicate Functors can be contrived with JavaScript but
>not multifacet callback interfaces like SAX.

I don't quite see why, but I admit to being hazy on the fine points of the DOM API. How do these "Predicate Functors" affect the API of applying a visitor to a constructed DOM tree?

David Megginson raised another objection:

>... It hardly makes sense to have
>
>  startElement(String name, org.xml.sax.AttributeList atts)
>
>when you could simply have
>
>  startElement(org.w3c.dom.Element element)

Well, applying a visitor to the tree does, classically, accept the tree node. But in this particular case, it makes more sense to give the visitor a restricted view of the node. This allows us to gain all the benefits of SAX while still marrying the two interfaces.
For the cases where access to the node is useful, one could add a method to the SAX API:

  org.w3c.dom.Element getElement();

Which would return null if the visitor is applied to a "virtual DOM", such as in a parser, and would return the element, in case it is applied to a real DOM tree. Hmmm - this would introduce a dependency between the SAX API and the DOM API, which we really want to avoid. In some languages (C++), this can be avoided by defining 'org.w3c.dom.Element' to be an unknown "external class" without creating a dependency. In Java we could just return an Object... I'm certain this is solvable at some level.

Share & Enjoy,

Oren Ben-Kiki

From b.laforge at jxml.com Sat Feb 13 14:28:57 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:06 2004
Subject: SAX2 (was Re: DOM vs. SAX??? Nah. )
Message-ID: <000601be575c$82864d40$c9a8a8c0@thing2>

From: Oren Ben-Kiki
>Well, applying a visitor to the tree does, classically, accept the tree
>node. But in this particular case, it makes more sense to give the visitor a
>restricted view of the node. This allows us to gain all the benefits of SAX
>while still marrying the two interfaces. For the cases where access to the
>node is useful, one could add a method to the SAX API:
>
>org.w3c.dom.Element getElement();
>
>Which would return null if the visitor is applied to a "virtual DOM", such
>as in a parser, and would return the element, in case it is applied to a
>real DOM tree. Hmmm - this would introduce a dependency between the SAX API
>and the DOM API, which we really want to avoid.
>In some languages (C++),
>this can be avoided by defining 'org.w3c.dom.Element' to be an unknown
>"external class" without creating a dependency. In Java we could just return
>an Object... I'm certain this is solvable at some level.

(I ramble a bit in what follows, but I do get back to the topic at hand, eventually. --Bill)

For SAX2, it would be great to pass objects representing SAX events instead of method calls. The overhead might not be any greater, as the parser could just have one of each kind of event and reuse them. Backward compatibility could be achieved through the use of a conversion filter, allowing existing SAX applications to work with new parsers.

There are two big advantages here in terms of extensibility:

1. It would be easy to extend the interfaces for various SAX event objects, passing additional data without creating problems for an application which is not expecting it.

2. Additional events could be passed which the application could ignore.

All SAX events might have a uniqued field which names the event type. The parser interface could then provide a query method something like

public boolean instanceOf(String eventTypeName, String interfaceName)

which could be used when the implementation language doesn't support the instanceof operator. An application could then route events based on the uniqued eventType field, using the instanceOf method or operator only when a new type of event is encountered.

We could then easily support a negotiated protocol between the parser and the application. The parser provides a list of all the eventType fields it supports; the application then indicates which events it is interested in.

Under these conditions, it would be easy to provide for a very close integration between SAX and DOM.
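The event-object idea in the message above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the names SaxEvent, StartElementEvent, CommentEvent, EventRouter, and EventRouterDemo are hypothetical and belong to no SAX release, and the interned typeName string merely stands in for the "uniqued field" the message describes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical base class: every event carries an interned ("uniqued") type name.
class SaxEvent {
    final String typeName;

    SaxEvent(String typeName) {
        this.typeName = typeName.intern();
    }
}

// Hypothetical start-tag event; further fields could be added later
// without disturbing applications that never read them.
class StartElementEvent extends SaxEvent {
    static final String TYPE = "startElement";
    final String name;

    StartElementEvent(String name) {
        super(TYPE);
        this.name = name;
    }
}

// Hypothetical "additional" event that an older application knows nothing about.
class CommentEvent extends SaxEvent {
    static final String TYPE = "comment";
    final String text;

    CommentEvent(String text) {
        super(TYPE);
        this.text = text;
    }
}

// Routes events by type name; events without a registered handler are ignored.
class EventRouter {
    private final Map<String, Consumer<SaxEvent>> handlers = new HashMap<>();

    void on(String typeName, Consumer<SaxEvent> handler) {
        handlers.put(typeName, handler);
    }

    // Returns true if a handler consumed the event, false if it was ignored.
    boolean dispatch(SaxEvent event) {
        Consumer<SaxEvent> handler = handlers.get(event.typeName);
        if (handler == null) {
            return false;
        }
        handler.accept(event);
        return true;
    }
}

// Small demonstration: this application is interested in start tags only.
class EventRouterDemo {
    static List<String> elementNames(List<SaxEvent> events) {
        List<String> names = new ArrayList<>();
        EventRouter router = new EventRouter();
        router.on(StartElementEvent.TYPE,
                  e -> names.add(((StartElementEvent) e).name));
        for (SaxEvent e : events) {
            router.dispatch(e);
        }
        return names;
    }
}
```

Because routing is keyed on the type name, a parser can start emitting CommentEvent objects without breaking an application that registered only a start-tag handler. The set of registered names could also serve as the application's half of the negotiation described above.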
From dineshv at email.msn.com Sat Feb 13 20:57:14 1999
From: dineshv at email.msn.com (Dinesh Vadhia)
Date: Mon Jun 7 17:09:06 2004
Subject: XML Data Structure ...
Message-ID: <000b01be5794$57b4de00$a94d95c1@thinkpad>

XML is based on a hierarchical (tree) structure ... Does anyone know precisely which kind of tree structure, i.e. balanced etc. etc. ...

Dinesh

From James.Anderson at mecomnet.de Sat Feb 13 20:59:34 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun 7 17:09:06 2004
Subject: ? DOM committee to "address this issue"? [Re: The Peace Process: DOM and namespaces...]
References: <2F2DC5CE035DD1118C8E00805FFE354C087448A3@RED-MSG-56>
Message-ID: <36C5E8E8.D8F57500@mecomnet.de>

(in response to a note which appeared on xml-dev)

After reading remarks to the effect that all principal parser providers had released, or would soon release, parser versions with namespace support, I retrieved several versions to "see what I could see", four to be exact: xml4j, msxml/dcxml, sun, and oracle. I observed, in effect, four different language bindings for the (as yet unspecified) model for namespaces and decided, well, to go back to work.

Upon seeing this message, however, I would like to note that it would be a welcome contribution from the respective implementers if they would offer some account as to why their respective interface is the way to go.
In general, I was perplexed that the namespace URI appears to be on its way to becoming a property of the node (element or attribute), rather than of the node's name. What processing scenarios are people working with for which this interface makes sense?

- wholesale transplantation of "nodes" from one namespace to another (ie independent of the names)?
- matching nodes based on uri without regard to names?
- if one mutates a node's namespace, does this affect the namespace of explicitly qualified attribute names?
- what sort of uri mutations are proposed and how is coherence maintained between the element name, attribute names (uri's and prefixes) where all are explicitly asserted.
- i didn't find facilities in all cases to operate on nodes with universal names, but where i did, i was uncertain how one would use something like getAttribute(uri x name) or getInheritedAttribute(uri x name) to find attributes in "no namespace"?

I was also disappointed that the interfaces, as best i could surmise, offer as yet not quite interoperable mechanisms for working with namespaces.

(watch out, here comes another one of my tables; please set your windows on "wide" :)

                 msxml/dcxml (betatwo)     sun (ea2)                      oracle (1_0_0_3)                         ibm* (2.0.0)
encoded name:    Node.getNodeName          Node.getNodeName               NSAttribute,NSElement.getQualifiedName   Namespace.getName
local part:      IXMLDomNode.getBaseName   NamespaceScoped.getLocalName   NSAttribute,NSElement.getLocalName       Namespace.getNSLocalName
namespace URI:   IXMLDomNode.getNamespace  NamespaceScoped.getNamespace   NSAttribute,NSElement.getGetNamespace    Namespace.getNSName
prefix:          IXMLDomNode.getPrefix     ?                              ?                                        ?
universal name:  ?                         ?                              NSAttribute,NSElement.getExpandedName    Namespace.getUniversalName
expanded name:   ?                         ?                              ?                                        Namespace.createExpandedName

(* with whom I commiserate for having taken appendix a seriously :)

Looks like the DOM committee is in for an interesting week...
How about an addendum to dom-level-1 which is just the java language binding for the (as yet unspecified) namespace-aware name/attribute/element interface, also to include some arguments for the decision to model names within the application either as atomic or as explicit (string X string).... ?

Chris Lovett wrote:
>
> An interesting thread. First, the DOM committee is addressing this issue
> this week. IMHO the degree in which XML namespaces succeed will determine
> the breadth and depth of the success of XML in general, and not just XSL -
> so I eagerly await what the DOM committee comes up with.
>
> We have a namespace implementation in our DOM which we are shipping in IE5,
> and I think IBM and Sun also have a solution. I didn't fully understand all
> the arguments presented here - but our experience is that although
> namespaces are not trivial to implement in an efficient manner, it is
> doable.
>

From donpark at quake.net Sat Feb 13 22:07:50 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:06 2004
Subject: Ms. Manners
Message-ID: <007901be579d$1b90f7c0$2ee044c6@arcot-main>

>Geez. I went away for a month to write a parser, and I come back and what
>was a model of civil discussion has kind of deteriorated. Sorry if that has
>already been pointed out and I'm just beating a dead subsystem, but having
>popped out one day and popped back a month or so later it really is
>striking how much the atmosphere of this list has changed. Are we heading
>for civil war in the XML world here?
>Hopefully we won't have to have any
>NATO air strikes or anything :-)

I believe Tim mentioned something about NATO working with the Pentagon to take out the XML-DEV mailing list server in UK if the rebels get out of control. No mention of whether the plan involves air strikes.

I think we are still one big happy family.

Best,

Don Park
Docuverse

From tyler at infinet.com Sat Feb 13 23:03:51 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun 7 17:09:06 2004
Subject: ? DOM committee to "address this issue"? [Re: The Peace Process: DOM and namespaces...]
References: <2F2DC5CE035DD1118C8E00805FFE354C087448A3@RED-MSG-56> <36C5E8E8.D8F57500@mecomnet.de>
Message-ID: <36C603BF.25199213@infinet.com>

james anderson wrote:

> How about an addendum to dom-level-1 which is just the java language binding
> for the (as yet unspecified) namespace-aware name/attribute/element interface,
> also to include some arguments for the decision to model names within the
> application either as atomic or as explicit (string X string).... ?
>
> Chris Lovett wrote:

This is an excellent idea. If namespaces are to be supported in the DOM, it should certainly be in level 1 if the W3C intends namespaces to be as ubiquitous as HTML. I personally prefer MS's approach as it gives you the ability to create namespace aware elements and attributes at the API level, while the SUN, Oracle, and I think IBM implementations do not, as you can only use Document.createElement(String name).

Tyler
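For comparison with the vendor interfaces discussed in this thread, here is a small sketch of namespace-aware node creation and lookup using the createElementNS / getAttributeNS family of methods that DOM Level 2 later standardized, as shipped in the JDK's org.w3c.dom package. It is not any of the 1999 vendor APIs listed above. The class name NamespaceDemo, the namespace URI urn:example:inventory, and the element and attribute names are all made up for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NamespaceDemo {
    // Made-up namespace URI, used only for this illustration.
    static final String INV = "urn:example:inventory";

    // Builds <inv:item id="42" inv:unit="kg"/> with namespace-aware factory methods.
    static Element buildItem() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element item = doc.createElementNS(INV, "inv:item");
        item.setAttributeNS(null, "id", "42");       // attribute in no namespace
        item.setAttributeNS(INV, "inv:unit", "kg");  // explicitly qualified attribute
        doc.appendChild(item);
        return item;
    }
}
```

With this API, item.getAttributeNS(null, "id") addresses the attribute in no namespace, and item.getAttributeNS(NamespaceDemo.INV, "unit") takes the local part rather than the prefixed name. The namespace URI is exposed through getNamespaceURI() on the node itself, which is the design choice questioned earlier in the thread.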
From david at megginson.com Sun Feb 14 11:45:55 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:06 2004
Subject: XML Data Structure ...
In-Reply-To: <000b01be5794$57b4de00$a94d95c1@thinkpad>
References: <000b01be5794$57b4de00$a94d95c1@thinkpad>
Message-ID: <14022.46654.796482.306664@localhost.localdomain>

Dinesh Vadhia writes:

> XML is based on a hierarchical (tree) structure ... Does any one
> know precisely which kind of tree structure ie. balanced
> etc. etc. ...

XML provides an external representation of any logical tree structure; it's not an algorithm. BTrees, etc., are algorithms for using physical tree structures for modelling logically linear information.

All the best,

David

-- 
David Megginson david@megginson.com
http://www.megginson.com/

From MikeDacon at aol.com Sun Feb 14 20:49:33 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:06 2004
Subject: Suggestions for clean-specs (was: Clean specs)
Message-ID:

Hi Everyone,

I'm fairly new to this list but have had some experience with implementing XML applications. Having recently written a short introductory training course on XML, I have a few comments on the XML 1.0 specification.
First, overall, I think the XML 1.0 specification is usable and fairly clear. I do not have enough experience yet to point out serious ambiguities or omissions in it, although, as with any product, these types of things often must be resolved through widespread dissemination and time.

Having said the above, I do have a few suggestions:

1. While there are some examples in the spec, there should be an example for every topic. Consistency here is important.

2. The ordering of information seems haphazard. I don't understand the design of the current ordering - it just seems to try and cover all the topics without regard to logical flow or a building block approach.

3. The complexity of all topics is not equal, although in the current spec you feel as if they are treated as such. Topics known to be more complex should provide extra examples and more in-depth explanation.

After having said the above, one could argue that these suggestions should be taken by book authors and not the specification authors. While that may be the case, these suggestions would make the specs better serve their audience.

Best wishes,

- Mike

-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com
From mirja.hukari at citec.fi Mon Feb 15 12:24:51 1999
From: mirja.hukari at citec.fi (Mirja Hukari)
Date: Mon Jun 7 17:09:06 2004
Subject: Announcement: DocZilla Preview and all-new Demo Kit
Message-ID: <36C81174.D918169@citec.fi>

Many people have expressed interest in having a simple way of seeing XML working in a Web browser. CITEC has made its XML, SGML, and HTML browser, DocZilla Preview and all-new Demo Kit, available at http://www.doczilla.com. We've put a lot of interesting material in the Demo Kit and made it accessible with point-and-click simplicity.

DocZilla is based on Netscape's Mozilla open-source project. It uses CSS to render XML and SGML directly and also supports the DOM accessed through JavaScript. Some fairly cool interactive effects with XML and the DOM have been presented in the Demo kit.

DocZilla Preview is pre-alpha; it has its rough edges, including not much of an interface, and is eminently crashable. In the interests of somewhat better stability it is also not our latest and greatest code. Still it is cool and worth a look.

CITEC will be at the XML XTECH conference in San Jose beginning March 7 to demonstrate recent progress with DocZilla and also to offer an in-depth tutorial on the Mozilla open-source code. Hope to see you!

Best Regards,

Miss DocZilla

-- 
Mirja Hukari                    tel. +358-6-3240 723
Citec Information Technology    fax. +358-6-3240 800
Silmukkatie 2                   email. mhu@citec.fi
65100 Vaasa                     URL: http://www.citec.fi
From keshlam at us.ibm.com Mon Feb 15 15:45:05 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun 7 17:09:06 2004
Subject: xml-dev Digest V1 #242
Message-ID: <85256719.00565742.00@D51MTA03.pok.ibm.com>

Oren said:
>IMVHO SAX should be defined not as a "parser interface" but as a "DOM tree
>visitor interface".

Mild disagreement, related to where DOM is positioned. The Document Object Model, despite its name, is not intended to be the Official Inner Semantics of XML. (That's the Infoset WG's problem.) DOM is just an API for accessing a document, which happens to be organized in a way that closely resembles a parse tree.

If you want to say that both DOM and SAX should be considered as visitors to the parse tree, I'll buy it. And certainly parsers _can_ output a parse tree whose structure mirrors the DOM.

______________________________________
Joe Kesselman / IBM Research
Unless stated otherwise, all opinions are solely those of the author.
From elharo at metalab.unc.edu Mon Feb 15 17:23:48 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun 7 17:09:06 2004
Subject: Announcement: DocZilla Preview and all-new Demo Kit
In-Reply-To: <36C81174.D918169@citec.fi>
Message-ID:

At 2:22 PM +0200 2/15/99, Mirja Hukari wrote:
>
>DocZilla is based on Netscape's Mozilla open-source project.
>It uses CSS to render XML and SGML directly and also supports
>the DOM accessed through JavaScript.

Mozilla's been able to render XML+CSS for several months. What does Doczilla do that Mozilla doesn't? Why fork the tree here?

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML: Extensible Markup Language (IDG Books 1998)                   |
| http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/     |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/   |
| Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/      |
+----------------------------------+---------------------------------+
From shecter at darmstadt.gmd.de Mon Feb 15 17:44:47 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:09:06 2004
Subject: Announcement: DocZilla Preview and all-new Demo Kit
References:
Message-ID: <36C85C73.493CBEB8@darmstadt.gmd.de>

Elliotte Rusty Harold wrote:

> At 2:22 PM +0200 2/15/99, Mirja Hukari wrote:
> >
> >DocZilla is based on Netscape's Mozilla open-source project.
> >It uses CSS to render XML and SGML directly and also supports
> >the DOM accessed through JavaScript.
>
> Mozilla's been able to render XML+CSS for several months. What does
> Doczilla do that Mozilla doesn't? Why fork the tree here?
>

And does anyone know when/if Mozilla would support XML+XSL to the level that IE does?

- Robb

From tbray at textuality.com Mon Feb 15 17:59:01 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:06 2004
Subject: Announcement: DocZilla Preview and all-new Demo Kit
Message-ID: <3.0.32.19990215095541.01432a90@pop.intergate.bc.ca>

At 06:42 PM 2/15/99 +0100, Robb Shecter wrote:
>And does anyone know when/if Mozilla would support XML+XSL to the level that IE does?
Just for the record - although I'm sure that 99% of the people here know this, I feel obligated to keep saying it: XSL is a future. It is some number of months from being stable. Successive drafts from the Working Group have differed dramatically from each other. Whereas I salute Microsoft for their progressive attitude and for pulling together implementations of a moving target, people who want to build real production solutions should be VERY VERY CAUTIOUS at this point in history about relying upon XSL, and accept the fact that you are probably going to have to migrate all your code to track changes in the spec.

-Tim

From shecter at darmstadt.gmd.de Mon Feb 15 18:13:56 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:09:06 2004
Subject: Valid RDF and security
Message-ID: <36C8639E.C6B36E7C@darmstadt.gmd.de>

Hi,

A week or so ago, someone asked how a piece of RDF can be validated, analogous to the way that a piece of XML can be validated with a DTD. I don't think anybody answered this, or I missed the answer. (?) I'm new to RDF, and don't know the answer, because as I understand it, validating RDF would mean making sure that the document properly follows (say) Dublin Core, and DC is defined as a schema, not as a DTD. (?)

I have a concrete application, and so this isn't just an esoteric question to me: I'm working on an OO framework for gathering metadata from various websites, and presenting it in a nice way for browsing. Most sites have their own home-made metadata format (see http://slashdot.org/ultramode.txt ).
OO design makes my task easy: I use the Adapter pattern. I write an adapter for each site that converts its file format into an object model that then gets digested by the rest of my framework.

Now, what if I want to make this scalable by shifting the burden of writing these adapters to the site administrators themselves?

Idea 1: Have webadmins write Java adapter classes that my framework would dynamically load via http. This sounds cool, is possible (servlets do this), but has a security risk: These webadmins at external sites are untrusted. If I load and link their code on the fly, it could be programmed to do any number of destructive things on my server.

Idea 2: Specify that webadmins must make an XML document available via http. The format would be simple like: ...I also write a DTD for this, and make it publicly available. Then, I write one adapter for my framework that parses this XML, and throws an exception if it doesn't match the DTD.

And here, I see an advantage to the fact that XML is like objects without behavior, because that makes it secure. An XML document can't damage my server, and with a DTD I can guarantee that the 3rd party has conformed to my interface. The DTD in fact has taken the place of the Java "interface" that the 3rd party web admins would have had to implement.

Idea 3: Well, after reading about RDF, it seems like I'm reinventing the wheel a bit. RDF is designed to do just what I was thinking about in 2. But, how do I validate it? And in my application, I really need the validation, because the validation enforces program functioning and security.

Thanks for any comments,

- Robb
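The single validating adapter described under "Idea 2" can be sketched with the JAXP classes that ship with the JDK. This is a minimal illustration, not the framework from the message: the class name ValidatingFeedReader and the headlines/story/title/url vocabulary are hypothetical (the message does not show the actual format), and the DTD is placed in an internal subset only so that the sketch is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ValidatingFeedReader {
    // Hypothetical feed vocabulary, declared inline for the sake of the example.
    static final String SAMPLE_DTD =
        "<!DOCTYPE headlines [\n"
        + "  <!ELEMENT headlines (story*)>\n"
        + "  <!ELEMENT story (title, url)>\n"
        + "  <!ELEMENT title (#PCDATA)>\n"
        + "  <!ELEMENT url (#PCDATA)>\n"
        + "]>\n";

    // Parses with DTD validation switched on; any validity error aborts the parse.
    static Document parseStrict(String xml)
            throws SAXException, IOException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // validate against the DTD named in the DOCTYPE
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* tolerated */ }
            @Override public void error(SAXParseException e) throws SAXException { throw e; }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Convenience wrapper: true only if the document is well-formed and valid.
    static boolean conforms(String xml) {
        try {
            parseStrict(xml);
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A document that omits a required url element, or that carries no DOCTYPE at all, makes conforms return false. Note that a validating parser checks a document against whatever DTD the document itself names, so an adapter for untrusted sites would presumably also have to confirm that the DOCTYPE points at the published DTD, for instance through an EntityResolver set on the DocumentBuilder.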
From mlelist at citec.fi Mon Feb 15 18:16:05 1999
From: mlelist at citec.fi (Michael Leventhal)
Date: Mon Jun 7 17:09:06 2004
Subject: Announcement: DocZilla Preview and all-new Demo Kit
References:
Message-ID: <36C86367.7930C7E5@citec.fi>

Elliotte Rusty Harold wrote:

> At 2:22 PM +0200 2/15/99, Mirja Hukari wrote:
> >
> >DocZilla is based on Netscape's Mozilla open-source project.
> >It uses CSS to render XML and SGML directly and also supports
> >the DOM accessed through JavaScript.
>
> Mozilla's been able to render XML+CSS for several months. What does
> Doczilla do that Mozilla doesn't? Why fork the tree here?

I'm one of the DocZilla developers. It is true that Mozilla has been able to render XML+CSS for several months, after a fashion, but it is also true that:

1. Mozilla is more-or-less in a permanent state of being broken so very many people have not been able to get it to work to actually see XML+CSS.

2. Display of XML documents is not a priority in the main line of Mozilla development, at least not up until now. As far as I know DocZilla is the only Mozilla project focusing on XML (and SGML).

We therefore have a comprehensive body of material to show what XML, CSS and the DOM can do. Everywhere else in the XML world this seems often to still be rather poorly understood. So this is "show me" stuff which is not available with straight Mozilla at the moment. There are a dozen or 15 basically point-and-click demos nicely packaged.

But to actually answer your questions ...

The second one first - we are not forking the tree.
Mozilla has a COM-based component architecture which enables us to add stuff which works with the mainline, not fork. We are using all the mainline and adding stuff that will not go into Navigator 5 and some stuff that will never go into any Navigator.

Not all of it is in the Preview yet, actually it is pretty old code now because Mozilla is in such a tumultuous state right now, but the value-added-to-Mozilla stuff includes or will include: SGML, structured search, handling of very large documents through our own fragment-capable parser, dynamic generation of hypertext TOCs, CALS tables, HyTime links, TEI extended pointers, CGM, many other graphics formats, application packages (scripts, stylesheets) for specific DTDs and types of applications (IETM, Help, for two examples).

OK, is it enough?!

Cheers,

Michael Leventhal
Architecture/Development
CITEC

From paul at prescod.net Mon Feb 15 19:43:43 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:09:06 2004
Subject: Valid RDF and security
References: <36C8639E.C6B36E7C@darmstadt.gmd.de>
Message-ID: <36C8728F.3089D390@prescod.net>

Perhaps this will help:

http://www.lists.ic.ac.uk/hypermail/xml-dev/9902/0371.html

Robb Shecter wrote:
>
> Hi,
>
> A week or so ago, someone asked how a piece of RDF can be validated,
> analogous to the way that a piece of XML can be validated with a DTD. I
> don't think anybody answered this, or I missed the answer. (?)
> I'm new
> to RDF, and don't know the answer, because as I understand it,
> validating RDF would mean making sure that the document properly follows
> (say) Dublin Core, and DC is defined as a schema, not as a DTD. (?)

Well, there is an initial Dublin Core schema, but I don't know if there is any standard one:

http://www.w3.org/TR/WD-rdf-schema/#dublincore

> Idea 1: Have webadmins write Java adapter classes that my framework
> would dynamically load via http. This sounds cool, is possible
> (servlets do this), but has a security risk: These webadmins at
> external sites are untrusted. If I load and link their code on the fly,
> it could be programmed to do any number of destructive things on my
> server.

Declarative specifications are almost always better for this and many other reasons.

> Idea 3: Well, after reading about RDF, it seems like I'm reinventing the
> wheel a bit. RDF is designed to do just what I was thinking about in
> 2. But, how do I validate it? And in my application, I really need the
> validation, because the validation enforces program functioning and
> security.

Define your document type with a DTD. Make the DTD such that every document that conforms to it automatically conforms to RDF. Then you're home free. If you don't need the degrees of syntactic freedom offered by an RDF schema, then just use a DTD.

-- 
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

If you spend any time administering Windows NT, you're far too familiar with the Blue Screen of Death (BSOD) which displays the cause of the crash and gives some information about the state of the system when it crashed.
-- "Microsoft Developer Network Magazine"
From roddey at us.ibm.com Mon Feb 15 20:16:22 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:09:07 2004
Subject: Roll-Your-Own Parsers
Message-ID: <87256719.006F3565.00@d53mta03h.boulder.ibm.com>

>>We have taken that approach with our 'Version 2' parsers, Java and C++.
>>They are pretty well layered and pluggable. Don't plug in a validation
>>handler and you won't do any validation work. Don't plug in an entity
>>handler, and you won't get any entity information, etc... Basically we've
>>just extended the concept of a SAX-like handler all the way into the core
>>of the parser. It allows both for extensibility by rolling your own
>>handler, and for the client who is putting together a particular type of
>>parser configuration to tell the lowest level of the parser "do the least
>>work possible for this group of things, since I'm not even interested".
>
>This sounds (and looks) promising. I'm not clear exactly _how_ modular it
>is, though. Can I take info from the SAX parser, abuse (or nicely process)
>it, and feed it back into the DOM tree builder? Or am I stuck to choosing
>validating/non-validating and DOM/SAX? Is the 'SAX-like handler' really
>SAX with extras, or is it incompatible?
>

A 'parser' in our system is really just a small amount of code which wires the events coming from our internal APIs to any kind of standard outgoing API you want to support. For a 'SAX Parser' it's pretty much a one-to-one mapping of an internal event to an external event, throwing away some information that cannot be passed through the SAX API.
You can also choose to write your program in terms of our internal event API, if you want to have full access to the maximum amount of information.

So, you could write a 'mutating SAX parser' that takes the internal events, passes them through a 'look aside' plug in object which mutates them, then passes them on out the SAX interface to client code. The possible scenarios are pretty endless I think. But the basic deal is that a 'parser' is not something that we define, it's an open-ended configuration of a scanner, whatever internal handlers you want to install on it, and any outgoing APIs that you want to then spit the data out through (massaged in any way you want.)

The internal APIs have more info than can be passed out SAX. The C++ version can actually, using the internal APIs, spit back out the original file almost character for character (after required entity substitution anyway.) It can also spit back out the internal and external subsets very close to the original. You can tell the scanner whether you want 'advanced callbacks' which will cause it to tell you about whitespace everywhere (not just in the content) and it will tell you about markup decls that it parsed but isn't going to use because they are overrides of previously declared decls, etc...

One of the powerful features is that it allows you to create and plug in a custom validator. We ship a default validator which does DTD-like things. However, if you want to, you can create a DCD validator or an XSchema validator and plug it in as well. The scanner will maintain all of the info that is required to do DTD-like validation (and make it available to the validator) and the validator can just add any extra information that it needs to do the extra work (such as type info and whatnot.)

>Reading the API is kind of weird. I'd like to know what this 'scanner'
>critter is doing too.
>

The scanner is the core of the system.
It handles the actual scanning of XML text, the generation of internal events, basic w/f and validation checking, etc... You install handlers on it (the handlers for the internal events), which it calls. It also maintains an element decl pool, attribute decl pool, entity decl pools, etc... And it does most of the standard well-formedness and validation checks (with validators only handling any extended checking, and handling structural validation). The C++ version of the scanner presents the same basic internal interface as the Java one, though its internal implementation is much different.

From david at megginson.com Mon Feb 15 22:55:14 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:07 2004
Subject: ModSAX (SAX 1.1) Proposal
Message-ID: <14024.41749.192627.145011@localhost.localdomain>

Love it or hate it, here goes...

I propose that SAX 1.1 (aka ModSAX) add the following three core classes and interfaces to SAX 1.0 in the org.xml.sax package:

  public interface ModParser extends Parser
  {
    public abstract void setFeature (String featureID, boolean state)
      throws SAXNotSupportedException;

    public abstract void setHandler (String handlerID, ModHandler handler)
      throws SAXNotSupportedException;
  }

  public interface ModHandler
  {
    // yup, it's empty...
  }

  public class SAXNotSupportedException extends SAXException
  {
    // yup, it's empty...
  }

I'm going to disappear now until Wednesday, so I'll let everyone chew on these for a while until I have time to offer explanations and arguments later this week.
No, I haven't forgotten namespaces, or DTD information, or lexical information, or anything else. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jarle.stabell at dokpro.uio.no Mon Feb 15 23:20:59 1999 From: jarle.stabell at dokpro.uio.no (Jarle Stabell) Date: Mon Jun 7 17:09:07 2004 Subject: ModSAX (SAX 1.1) Proposal Message-ID: <01BE5942.FC221C00.jarle.stabell@dokpro.uio.no> David Megginson wrote: > public interface ModParser extends Parser > { > public abstract void setFeature (String featureID, boolean state) > throws SAXNotSupportedException; > > public abstract void setHandler (String handlerID, ModHandler handler) > throws SAXNotSupportedException; > } Is it obvious that one should throw exceptions on not supported? I believe it in many cases would be more comfortable if it returned a boolean supported or not instead of throwing an exception, as many clients probably want/need to check whether a particular feature is supported or not (and trade off behaviour accordingly) Cheers, Jarle xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Mon Feb 15 23:41:05 1999 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:09:07 2004 Subject: X3D References: <14024.41749.192627.145011@localhost.localdomain> Message-ID: <36C8AFF3.7D44@hiwaay.net> In case anyone here missed it, the Web3D consortium put out a press release on the development of X3D, previously informally referred to as VRML NG. The plans are to have a draft spec ready by mid-year for this XML component application. I am at home so I don't have the URL. :-( IETMers: heads up! Wall-to-wall. Finally. ;-) len xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Feb 15 23:52:29 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:07 2004 Subject: X3D Message-ID: <3.0.32.19990215155153.01444c40@pop.intergate.bc.ca> At 05:38 PM 2/15/99 -0600, len bullard wrote: >In case anyone here missed it, the Web3D consortium >put out a press release on the development >of X3D, previously informally referred to as VRML NG. >The plans are to have a draft spec ready by mid-year >for this XML component application. > >I am at home so I don't have the URL. :-( It's at http://www.web3d.org/ -Tim xml-dev: A list for W3C XML Developers. 

From donpark at quake.net Tue Feb 16 00:09:52 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:07 2004
Subject: ModSAX (SAX 1.1) Proposal
Message-ID: <001801be5940$7617c1c0$2ee044c6@arcot-main>

David,

Here is my counter proposal:

  public interface ModularParser extends Parser {
    boolean hasModule(String moduleName);
    // returns true if the named module is supported, false otherwise.
    // return value does not indicate whether the module is enabled or not.
    // module version should be encoded in the module name

    boolean getModuleState(String moduleName);
    // returns true if the named module is enabled, false otherwise.

    boolean setModuleState(String moduleName, boolean enable);
    // enables or disables a named module.
    // Result reflects the resulting state of the module.

    ModuleHandler getModuleHandler(String moduleName);
    // returns named module's handler or null if there is no previously installed
    // handler or if the module is not supported.

    boolean setModuleHandler(String moduleName, ModuleHandler handler);
    // sets named module's handler.
    // returns true if successful or false if failed to set the handler due to
    // lack of support or other reasons.

    // following two might be more controversial but they are definitely useful.

    Object getModuleProperty(String moduleName, String propName);
    boolean setModuleProperty(String moduleName, String propName, Object propValue);
  }

  public interface ModuleHandler {
    // still empty
  }

Differences are:

1. Does not use exceptions
2. Combined features with handler (liaison?) types
3. Names changed for clarity and coherency

This is a 'grab bag' proposal so please take any idea you find useful and throw away the rest.

Don Park
Docuverse

From clark.evans at manhattanproject.com Tue Feb 16 03:40:46 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:09:07 2004
Subject: ModSAX (SAX 1.1) Proposal
References: <001801be5940$7617c1c0$2ee044c6@arcot-main>
Message-ID: <36C8E7E1.C889380C@manhattanproject.com>

I think it would be better to start with Requirements, then some Analysis instead of jumping straight to Design.

So that I'm not stopping the show, here are nice things that I'd like to see:

1. A mechanism to include multiple observers in parallel. If this doesn't already work.

2. A way to register said observers for particular events, e.g., an "event mask".

3. Complete capabilities management.

3.1. I'd rather not ask the SAX driver if it supports events X,Y,Z,P,D,Q. I'd like a more "package" oriented mechanism.

3.2. Versioning of these packages and events.

4. Event stack support.

5. Event re-write engine.

6. An on-line web site database publishing all custom event generators from various vendors.

7. Complete DOM integration?? *smirk*

etc...

Some of these are a bit far-out, but it's hard to know what the "specific goals" of SAX 1.1 are. Thus, some type of requirement document followed by perhaps some use cases would be illustrative.

Sorry to be a pain,

Clark Evans

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Tue Feb 16 06:08:43 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:07 2004 Subject: Valid RDF and security Message-ID: <003a01be5973$0bb5fda0$50f96d8c@NT.JELLIFFE.COM.AU> From: Robb Shecter >A week or so ago, someone asked how a piece of RDF can be validated, >analogous to the way that a piece of XML can be validated with a DTD. I >don't think anybody answered this, or I missed the answer. (?) I'm new >to RDF, and don't know the answer, because as I understand it, >validating RDF would mean making sure that the document properly follows >(say) Dublin Core, and DC is defined as a schema, not as a DTD. (?) My impression is that the simple Dublin Core people are primarily interested in making HTML manageable for information discovery. The qualified Dublin Core people are more concerned with using the simple Dublin Core as classes; all the QDC people I have talked to (from ECAI, etc) are more concerned with making a schema for centralized databases, and relatively uninterested in "serialization" issues. This is why the Dublin Core does not have any concrete element sets; sooner or later they will want to distribute the processing, and presumably then they might get around to it. I noticed that HTML 4 has some definite markup for allowing different profiles on the metatags, to support Dublin Core. But the examples of Dublin Core in HTML that the practitioners of it use don't seem to conform. And I have seen examples given of "how to use DC in HTML 3 and HTML4" which give different methods. 
I don't think there is much awareness that rigorous markup involves more than just defining a field. As far as validating RDF, you might be interested in a little note I wrote "Using XSL to Validate Structured Documents" http://www.ascc.net/xml/en/utf-8resource_index.html You can use XSL to write a very simple validation program that gives nice error messages and works with wrappers or islands. Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Tue Feb 16 09:07:13 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:09:07 2004 Subject: ModSAX (SAX 1.1) Proposal References: <001801be5940$7617c1c0$2ee044c6@arcot-main> <36C8E7E1.C889380C@manhattanproject.com> Message-ID: <36C934EA.8A700440@infinet.com> Clark Evans wrote: > I think it would be better to start with Requirements, > then some Analysis instead of jumping straight to Design. > > So that I'm not stopping the show, here are > nice things that I'd like to see: > > 1. A mechanism to include multiple observers > in parallel. If this dosn't already work. I think in Michael Kay's SAXON package he has some multicast DocumentHandler object that does just this. All it does is delegate processing to a list of DocumentHandlers. > 2. A way to register said observers for particular > events, e.g., an "event mask". Since SAX is based upon a callback API rather than some abstract event type, I am not so sure this is at all necessary unless things move the way Bill la Forge suggested with using event objects. 
I am particularly not in favor of the event objects approach because it is an extra layer of indirection in handling the parsed data that I don't think adds anything positive to SAX.

> 3. Complete capabilities management.
>
> 3.1. I'd rather not ask the SAX driver if it
> supports events X,Y,Z,P,D,Q. I'd like
> a more "package" oriented mechanism.

What do you mean by package oriented? If you are talking about having special, possibly inherited DocumentHandlers for each type of add-on, then I am all for it. This way you could have a NamespaceDocumentHandler which would present namespace-handled content to the application in an appropriate form. Changing the ParserFactory class to add additional static methods for locating a NamespaceParser or an ExpandEntityParser or whatever would be totally backwards compatible. Of course the obvious problem with this is the interface explosion problem, which would be a direct contradiction to SAX's goal of being simple.

> 3.2 Versioning of these packages and events.
>
> 4. Event stack support.

I have done something like this before and it is particularly useful, especially for error-reporting. Of course the big problem with this is that you need to create new event objects to put on this stack. This would slow down many of the fastest XML parsers fivefold.

> Some of these are a bit far-out, but it's hard
> to know what the "specific goals" of SAX 1.1 are.

Probably roughly the same, or else it would need to change its name, "Simple API for XML" (-:

Tyler

xml-dev: A list for W3C XML Developers.
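
A minimal sketch of the multicast idea mentioned in this message: one handler that simply forwards every callback to a list of other handlers. It assumes the SAX 1.0 org.xml.sax.DocumentHandler interface; the class names MulticastDocumentHandler and ElementNameRecorder are made up for illustration, and this is not SAXON's actual code.

```java
// Hypothetical sketch of a multicast handler; not SAXON's actual class.
@SuppressWarnings("deprecation")
class MulticastDocumentHandler implements org.xml.sax.DocumentHandler {
    private final java.util.List<org.xml.sax.DocumentHandler> targets =
            new java.util.ArrayList<>();

    // Register another observer; callbacks are forwarded in registration order.
    void add(org.xml.sax.DocumentHandler handler) {
        targets.add(handler);
    }

    public void setDocumentLocator(org.xml.sax.Locator locator) {
        for (org.xml.sax.DocumentHandler h : targets) h.setDocumentLocator(locator);
    }

    public void startDocument() throws org.xml.sax.SAXException {
        for (org.xml.sax.DocumentHandler h : targets) h.startDocument();
    }

    public void endDocument() throws org.xml.sax.SAXException {
        for (org.xml.sax.DocumentHandler h : targets) h.endDocument();
    }

    public void startElement(String name, org.xml.sax.AttributeList atts)
            throws org.xml.sax.SAXException {
        for (org.xml.sax.DocumentHandler h : targets) h.startElement(name, atts);
    }

    public void endElement(String name) throws org.xml.sax.SAXException {
        for (org.xml.sax.DocumentHandler h : targets) h.endElement(name);
    }

    public void characters(char[] ch, int start, int length)
            throws org.xml.sax.SAXException {
        for (org.xml.sax.DocumentHandler h : targets) h.characters(ch, start, length);
    }

    public void ignorableWhitespace(char[] ch, int start, int length)
            throws org.xml.sax.SAXException {
        for (org.xml.sax.DocumentHandler h : targets) h.ignorableWhitespace(ch, start, length);
    }

    public void processingInstruction(String target, String data)
            throws org.xml.sax.SAXException {
        for (org.xml.sax.DocumentHandler h : targets) h.processingInstruction(target, data);
    }
}

// Illustrative observer that records the element names it is told about.
@SuppressWarnings("deprecation")
class ElementNameRecorder extends org.xml.sax.HandlerBase {
    final java.util.List<String> names = new java.util.ArrayList<>();

    @Override
    public void startElement(String name, org.xml.sax.AttributeList atts) {
        names.add(name);
    }
}
```

Because the multicast object is itself a DocumentHandler, it could be handed to Parser.setDocumentHandler() like any single handler, which is why no change to the parser interface is needed for this particular requirement.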

From Michael.Kay at icl.com Tue Feb 16 10:07:35 1999
From: Michael.Kay at icl.com (Michael.Kay@icl.com)
Date: Mon Jun 7 17:09:07 2004
Subject: ANNOUNCING SAXON 4.0
Message-ID: <93CB64052F94D211BC5D0010A80013310EB2EC@wwmessd3.bra01.icl.co.uk>

SAXON is a Java library for processing XML documents: it provides a number of services above the SAX and DOM level to make applications easier to write and more modular. It is available as a free download with source code included.

SAXON 4.0 is available on http://home.iclweb.com/icl2/mhkay/saxon.html

There are substantial changes in this release, notably:

* Improved support for processing using the DOM, in a way that is forwards compatible with serial (SAX-based) applications: you can use the same element handlers in both modes. (This is perhaps of particular interest in the light of current xml-dev "SAX-vs-DOM" discussions). The processing model (selecting an element handler based on a pattern match) is identical to that for XSL.

* Support for Stylesheets. You can now invoke many of SAXON's capabilities without writing any Java code. SAXON Stylesheets support a useful subset of XSL and provide two important additional features: the ability to create multiple output files, and the ability to freely mix XSL and Java code: XSL can be used to process some elements, and Java for others, or you can preprocess the element in Java before rendering it in XSL. Very useful if you are doing more than simple rendering, e.g. if you are loading a relational database.
(To implement some of these features I have had to make incompatible changes: existing users please read the "changes" file carefully.) The ability to create multiple output files is particularly attractive for bulk rendering into HTML, and I'm not aware of any other tool that does it. A sample XSL stylesheet is included for rendering Jon Bosak's version of the New Testament: it is only 200 lines long, but produces a set of 292 linked HTML files in a single directory. This was previously published as a Java application: the SAXON download includes both Java and XSL versions for comparison. You can see the result of the rendition at http://www.wokchorsoc.freeserve.co.uk/bible-nt/index.html. This feature seems to fit very neatly into the XSL architecture and I commend it to the authors of the spec. I am still working on the next important innovation, stylesheets that can be processed in a single serial pass of the source XML document: progress looks promising - though as always, I do this stuff in the gaps between revenue-earning projects. Michael Kay Michael.Kay@icl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Kay at icl.com Tue Feb 16 10:37:28 1999 From: Michael.Kay at icl.com (Michael.Kay@icl.com) Date: Mon Jun 7 17:09:07 2004 Subject: DOM vs. SAX??? Nah. (was RE: Storing Lots of Fiddly Bits (wa s Re: What is XML for?) 
Message-ID: <93CB64052F94D211BC5D0010A80013310EB2ED@wwmessd3.bra01.icl.co.uk> > -----Original Message----- > From: Bill la Forge [mailto:b.laforge@jxml.com] > > What I'm more inclined to do in this SAX/DOM marriage, > is to create an alternative to the SAX helper class, > AttributeListImpl, which also implements > public Attr setAttribute(Attr newAttr); > and > public Attr removeAttribute(String name); > and > public Attr getAttribute(String name); > Unifying the DOM and SAX API's for attributes is a laudable aim but it's a little bit more complicated than this. 1 - the DOM interfaces that return Attr are called setAttributeNode(), getAttributeNode(), removeAttributeNode(). 2 - Attr is a subclass of Node and as such is required to implement all sorts of functionality that seems inappropriate in SAX, e.g. the ability to locate the owning Document and the ability to retrieve the contents of the attribute with entities unexpanded. 3 - The DOM methods setAttribute() etc work in terms of Strings, which is closer to the SAX model, but has some irritating differences: for example getAttribute() in the DOM returns the same value for an absent attribute and a zero-length attribute. And of course, they throw different exceptions. In SAXON 4.0 I use the SAX AttributeList interface, but when processing using the DOM, I use an implementation of SAX AttributeList which is actually a wrapper to the DOM Element. Mike Kay xml-dev: A list for W3C XML Developers. 
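
A rough sketch of the wrapper approach described in this message: a SAX 1.0 AttributeList that reads straight from a DOM Element. The class name DomAttributeList is hypothetical and this is not SAXON's implementation. It assumes that reporting every attribute type as "CDATA" is acceptable, since the sketch makes no attempt to recover declared types from the DOM.

```java
// Hypothetical adapter: exposes a DOM Element's attributes through the
// SAX 1.0 AttributeList interface. Not SAXON's actual implementation.
@SuppressWarnings("deprecation")
class DomAttributeList implements org.xml.sax.AttributeList {
    private final org.w3c.dom.Element element;

    DomAttributeList(org.w3c.dom.Element element) {
        this.element = element;
    }

    public int getLength() {
        return element.getAttributes().getLength();
    }

    public String getName(int i) {
        org.w3c.dom.Node attr = element.getAttributes().item(i);
        return attr == null ? null : attr.getNodeName();
    }

    public String getValue(int i) {
        org.w3c.dom.Node attr = element.getAttributes().item(i);
        return attr == null ? null : attr.getNodeValue();
    }

    // Assumption: declared types are not recovered here, so report CDATA.
    public String getType(int i) {
        return element.getAttributes().item(i) == null ? null : "CDATA";
    }

    public String getType(String name) {
        return element.getAttributeNode(name) == null ? null : "CDATA";
    }

    // Element.getAttribute() returns "" for an absent attribute, so look up
    // the Attr node instead and return null when it is missing, which is
    // what SAX callers expect.
    public String getValue(String name) {
        org.w3c.dom.Attr attr = element.getAttributeNode(name);
        return attr == null ? null : attr.getValue();
    }
}
```

Going through getAttributeNode() is what lets the wrapper tell an absent attribute apart from a zero-length one, the difference called out in point 3 above.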

From jayadeva at lgsi.co.in Tue Feb 16 10:54:37 1999
From: jayadeva at lgsi.co.in (Jayadeva Babu Gali)
Date: Mon Jun 7 17:09:07 2004
Subject: servlet/xml/xsl
Message-ID: <36C94E8A.4955E22F@lgsi.co.in>

hi,

Again I've got a problem. I am able to create the XML document dynamically from the values in an existing database, and I am trying to display the document using MSIE5 (beta). I am inserting the XSL as well as the DTD file in the HttpServletResponse output using the servlet, but MSIE 5.0 is not displaying the XML file. It gives an error saying that it is not able to access the XSL file. My code is like this, and all the files are in the same directory.

  ....
  out.println("");
  out.println("");
  ......

I am able to access the XML file separately, but the browser is not accessing it through the servlet.

regds....jayadev

From oren at capella.co.il Tue Feb 16 12:11:39 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:07 2004
Subject: xml-dev Digest V1 #242
Message-ID: <00ac01be59a4$af7f0b80$5402a8c0@oren.capella.co.il>

keshlam@us.ibm.com wrote:

>Oren said:
>>IMVHO SAX should be defined not as a "parser interface" but as a "DOM tree
>>visitor interface".
>
>Mild disagreement, related to where DOM is positioned.
The Document Object >Model, despite its name, is not intended to be the Official Inner Semantics >of XML. (That's the Infoset WG's problem.) DOM is just an API for accessing >a document, which happens to be organized in a way that closely resembles a >parse tree. ? It isn't clear to me what the difference between "inner semantics" and "accessing" is, but I'm willing to learn :-) I guess it is time to read some more W3 specs... >If you want to say that both DOM and SAX should be considered as visitors >to the parse tree, I'll buy it. And certainly parsers _can_ output a parse >tree whose structure mirrors the DOM. Mild disagreement :-) DOM is a random access API, not a visitor pattern API. But I'm perfectly satisfied accepting both as being wrappers for some other "inner semantics" representation - which I suspect is what happens in most implementations anyway. Share & Enjoy, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jayadeva at lgsi.co.in Tue Feb 16 12:26:26 1999 From: jayadeva at lgsi.co.in (Jayadeva Babu Gali) Date: Mon Jun 7 17:09:07 2004 Subject: attaching files Message-ID: <36C9641F.6589ACB@lgsi.co.in> hi all, i am attaching two file ( GetInfo.java - servlet and xsl - file ) when i run the servlet from MSIE5(beta). Its giving error like . both files in the same directory in my file system (c:/cnb/shop.xsl and c:/cnb/GetInfo.java) Access is denied. Line 3, Position 1 ^ expecting urs help... regds.. jayadev -------------- next part -------------- A non-text attachment was scrubbed... 
Name: GetInfo.java Type: application/x-unknown-content-type-java_auto_file Size: 4217 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990216/030b8f3c/GetInfo.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: shop.xsl Type: text/xml Size: 679 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990216/030b8f3c/shop.xml From simonstl at simonstl.com Tue Feb 16 17:28:10 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:08 2004 Subject: XML Schema Requirements out Message-ID: <199902161726.MAA13862@hesketh.net> See http://www.w3.org/TR/1999/NOTE-xml-schema-req-19990215. Comments go to www-xml-schema-comments@w3.org. I suspect we'll all find things to love and hate in it. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jasr at im.se Tue Feb 16 20:23:08 1999 From: jasr at im.se (Serrat Jaime - jasr) Date: Mon Jun 7 17:09:08 2004 Subject: FW: Namespaces Message-ID: <389DA7CB46CFD111A0D100600836AD65E66AB5@msxmar1> James Clark wrote: > I have been disturbed by the amount of confusion surrounding the XML > Namespaces Recommendation. So I have written a document > > http://www.jclark.com/xml/xmlns.htm > ..but I'm afraid the link doesn't work. I'd be very interested in ready your doc. > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jdearce at tiaa-cref.org Tue Feb 16 23:16:15 1999 From: jdearce at tiaa-cref.org (Joseph DeArce) Date: Mon Jun 7 17:09:08 2004 Subject: No subject given Message-ID: <9902169192.AA919206889@balsa.tiaa-cref.org> unsubscribe xml-dev xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Feb 17 04:24:55 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:08 2004 Subject: XML 1.0 Errata, one year later Message-ID: <3.0.32.19990216202426.00a2f3b0@pop.intergate.bc.ca> One year and one week after the release of XML 1.0, we have finally updated the errata document. Go to http://www.w3.org/TR/REC-xml and follow the link in the "Status of this Document" section. 
This is a moving target; we have approximately 50 reported errata, and the WG whose responsibility this is has plowed through maybe half of them, accepting all but a couple as real; so this document will grow in the near future - I'll announce that too. Thanks are in order to the *very* many people who combed through the XML spec and identified these problems - I suspect that nearly every such person reads this mailing list. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 17 14:13:35 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:08 2004 Subject: ModSAX (SAX 1.1) Proposal In-Reply-To: <01BE5942.FC221C00.jarle.stabell@dokpro.uio.no> References: <01BE5942.FC221C00.jarle.stabell@dokpro.uio.no> Message-ID: <14026.52684.702749.764547@localhost.localdomain> Jarle Stabell writes: > Is it obvious that one should throw exceptions on not supported? I > believe it in many cases would be more comfortable if it returned a > boolean supported or not instead of throwing an exception, as many > clients probably want/need to check whether a particular feature is > supported or not (and trade off behaviour accordingly) As I just mentioned in a reply to Don, exceptions provide for cleaner code and they help avoid bugs by enabling more compile-time checking. It's easy to write parser.setFeature("org.xml.sax.features.namespaces", true); and forget to check the return value, but it's harder to forget to catch an exception. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
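
A small sketch of the trade-off argued in this message, using a toy parser that offers both styles side by side. ToyParser, trySetFeature and isOn are hypothetical names invented for the comparison; the exception type is the org.xml.sax.SAXNotSupportedException class that ships with current Java releases, used here only as a stand-in for the class proposed in this thread.

```java
// Hypothetical toy parser used only to compare the two reporting styles.
class ToyParser {
    private final java.util.Set<String> supported = new java.util.HashSet<>();
    private final java.util.Map<String, Boolean> state = new java.util.HashMap<>();

    ToyParser(String... supportedFeatureIDs) {
        supported.addAll(java.util.Arrays.asList(supportedFeatureIDs));
    }

    // Exception style: a caller that neither catches nor declares the
    // checked exception will not compile.
    void setFeature(String featureID, boolean on)
            throws org.xml.sax.SAXNotSupportedException {
        if (!supported.contains(featureID)) {
            throw new org.xml.sax.SAXNotSupportedException(featureID);
        }
        state.put(featureID, on);
    }

    // Boolean style: compiles fine even when the result is thrown away.
    boolean trySetFeature(String featureID, boolean on) {
        if (!supported.contains(featureID)) {
            return false;
        }
        state.put(featureID, on);
        return true;
    }

    // True only if the feature was successfully switched on.
    boolean isOn(String featureID) {
        return Boolean.TRUE.equals(state.get(featureID));
    }
}
```

With the boolean form, a bare call such as parser.trySetFeature("org.xml.sax.features.validation", true); compiles and fails silently on a parser that lacks the feature. The same call through setFeature() forces the caller to write a catch block or a throws clause, which is the compile-time checking referred to above.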

From david at megginson.com Wed Feb 17 14:17:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:08 2004
Subject: ModSAX (SAX 1.1) Proposal
In-Reply-To: <001801be5940$7617c1c0$2ee044c6@arcot-main>
References: <001801be5940$7617c1c0$2ee044c6@arcot-main>
Message-ID: <14026.51106.26742.482359@localhost.localdomain>

Don Park writes:

> Here is my counter proposal:

Thank you very much for the input, and for the opportunity to explain my thinking a little more.

> public interface ModularParser extends Parser {
>   boolean hasModule(String moduleName);
>   // returns true if the named module is supported, false otherwise.
>   // return value does not indicate whether the module is enabled or not.
>   // module version should be encoded in the module name

What exactly do you mean by a "module"?

>   boolean getModuleState(String moduleName);
>   // returns true if the named module is enabled, false otherwise.
>
>   boolean setModuleState(String moduleName, boolean enable);
>   // enables or disables a named module.
>   // Result reflects the resulting state of the module.

I deliberately avoided the first one, because I didn't want parsers to be forced to have a determinate state. For example, it is possible for

  parser.setFeature("org.xml.sax.features.validation", true)

and

  parser.setFeature("org.xml.sax.features.validation", false)

both to fail. This is exactly what will happen in my helper class for adapting SAX 1.0 Parsers. The default value will always be "don't know, don't care", since there will always be features that drivers don't know about and about which they cannot give a correct true/false answer.

>   ModuleHandler getModuleHandler(String moduleName);
>   // returns named module's handler or null if there is no previously installed
>   // handler or if the module is not supported.

In general, we don't have get* methods for org.xml.sax.Parser -- is there a special reason that we need one here?

>   boolean setModuleHandler(String moduleName, ModuleHandler handler);
>   // sets named module's handler.
>   // returns true if successful or false if failed to set the handler due to
>   // lack of support or other reasons.

I prefer to throw an exception rather than returning a boolean, but that's very much a personal prejudice. My rationale is that it's harder to forget about an exception than it is to ignore a boolean value; a programmer in a hurry can easily write

  setModuleHandler("org.xml.sax.handlers.namespace", handler)

and forget to check the return value, but if the programmer forgets to catch an exception, the compiler will likely complain loudly (unless the current method already throws that exception or its supertype).

>   // following two might be more controversial but they are definitely useful.
>
>   Object getModuleProperty(String moduleName, String propName);
>   boolean setModuleProperty(String moduleName, String propName, Object propValue);
> }

In the case of handlers, it makes more sense to set properties directly on the handler objects; in the case of features (like namespaces), I agree that there does need to be a way to set properties to values other than true or false. How about this:

  public interface ModParser extends Parser
  {
    public abstract void setFeature (String featureID, boolean state)
      throws SAXNotSupportedException;

    public abstract void setParameter (String parameterID, Object parameter)
      throws SAXNotSupportedException;

    public abstract void setHandler (String handlerID, ModHandler handler)
      throws SAXNotSupportedException;
  }

Setting up a parser for namespace processing might look something like this:

  try {
    parser.setFeature("org.xml.sax.features.namespaces", true);
    parser.setParameter("org.xml.sax.parameters.namespaces.separator", " ");
    parser.setHandler("org.xml.sax.handlers.namespace", handler);
  } catch (SAXNotSupportedException e) {
    System.err.println("Parser does not support namespace handling.");
  }

It certainly looks cleaner than checking a lot of boolean return values, and it provides stronger compile-time checking as well.

Thanks, and all the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From k.bhattacharya at bbc.co.uk Wed Feb 17 14:20:56 1999
From: k.bhattacharya at bbc.co.uk (Kaustav Bhattacharya)
Date: Mon Jun 7 17:09:08 2004
Subject: XML conferences in the UK
Message-ID:

Been searching in vain on Usenet for XML groups; I doubt my company Usenet news feed carries any.

I read on the web about many wonderful XML conferences and one-day symposiums in the USA and Canada. Great! Except I'm in the UK. If anyone is aware of any XML conferences going on in the UK, I'd be obliged if you could direct me to the appropriate web site for details of venues, dates and topics.
Regards,

Kaustav Bhattacharya - k.bhattacharya@bbc.co.uk
Web Producer - Technology Services and Support
British Broadcasting Corporation (BBC)
Woodlands, Room C118, Tel: 0181 225 9765
Int. Tel. Ext: 59765 - Mobile: 0780 308 0499

From PrashanthL at inf.com Wed Feb 17 14:26:46 1999
From: PrashanthL at inf.com (Prashanth Lakshmi Narayanan)
Date: Mon Jun 7 17:09:08 2004
Subject: from a new member
Message-ID: <8EE756E49A17D21194860008C7F49AFE0117DD1F@TWRMSG01>

hello to everyone on this list,

i am a new member of the list and i hope to gain from the xml-rich discussions that are posted here. i have a request to the more experienced members. i work for a software company called infosys technologies ltd, at bangalore, and i am responsible for learning the xml technology and implementing it to show its powers. the problem definition has been left to me: i am to state the problem (it should be solvable in 2 weeks' time and could be anything from a simple application that simulates a telephone directory to a complicated one) and solve it too. i have installed xml4j (the parser developed by ibm) and, being fairly experienced with java, i have understood how the parser works. i wish some of you could give me pointers as to what sort of application i should choose to develop (keeping the time constraint of 2 weeks in mind) so that i can bring out the power of xml.

thanks in advance.
prashanth.

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Wed Feb 17 14:28:36 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:08 2004
Subject: Why SAX doesn't have a requirements document (was Re: ModSAX (SAX 1.1) Proposal)
In-Reply-To: <36C8E7E1.C889380C@manhattanproject.com>
References: <001801be5940$7617c1c0$2ee044c6@arcot-main> <36C8E7E1.C889380C@manhattanproject.com>
Message-ID: <14026.52794.973319.41053@localhost.localdomain>

Clark Evans writes:

> I think it would be better to start with Requirements, then some
> Analysis instead of jumping straight to Design.

[snip]

> Some of these are a bit far-out, but it's hard to know what the
> "specific goals" of SAX 1.1 are.

They are mostly the same as the goals of SAX 1.0:

1. Keep the interface absolutely minimalist -- anything that can be implemented on top of SAX rather than inside it doesn't belong in the core interface.

2. Avoid dictating design decisions to parser writers to the greatest degree possible.

To these, I would add #3 for version 1.1:

3. Provide a standard, dependable mechanism for people to supply extensions.

> Thus, some type of requirement document followed by perhaps some
> use cases would be illustrative.

This is a very wise suggestion, but I will decline to take it up -- the real reason for requirements documents, design documents, etc. is to provide a paper trail for protection against political in-fighting, and we're all much too practical and friendly on XML-Dev to need that sort of thing. In practice too much overhead simply distracts from rapid development and deployment.
There are tens of thousands of pages of unused ISO and even W3C specs gathering cyberdust despite the fact that incredibly large numbers of person-days went into designing requirements, use cases, etc. -- all the process did was help to keep people busy, but none of it helped ensure success (in fact, the process almost always made things worse, because too many hypothetical requirements and use cases tend to bloat the specs to the point that they're not worth implementing).

SAX 1.0 succeeded because we kept it small and finished it quickly (and thus hit a critical window in the market) and because I shipped it with working code (SAX drivers for the four XML parsers).

This is a design phenomenon well known to all of us on a much larger scale with the success of HTML -- sure, HTML compatibility in browsers is a bit of a mess, but if Tim B-L had gone through a formal collaborative design process ten years ago, HTML probably wouldn't exist at all today (except possibly as yet another historical footnote in HyperText papers).

All the best,

David

-- 
David Megginson                 david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From robin at isogen.com Wed Feb 17 14:43:03 1999 From: robin at isogen.com (Robin Cover) Date: Mon Jun 7 17:09:08 2004 Subject: XML conferences in the UK In-Reply-To: Message-ID: Research Dissemination Workshop: Markup Technologies for Computational Linguistics Hosted by the HCRC Language Technology Group University of Edinburgh 25, 26 February 1999 See: http://www.ltg.ed.ac.uk/nscope/ -------------------------------------------------------------------- On Wed, 17 Feb 1999, Kaustav Bhattacharya wrote: > Been searching in vain on Usenet for XML groups, I doubt my company usenet > news feed carries any. I read on the web about many wonderful XML > conferences and one-day symposiums in the USA and Canada. Great! Except I'm > in the UK. If anyone is aware of any XML conferences going on in the UK, > I'd be obliged if you could direct me to the appropriate web site for > details of venues, dates and topics. > > Regards, > > Kaustav Bhattacharya - k.bhattacharya@bbc.co.uk > Web Producer - Technology Services and Support > British Broadcasting Corporation (BBC) > Woodlands, Room C118, Tel: 0181 225 9765 > Int. Tel. Ext: 59765 - Mobile: 0780 308 0499 > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tyler at infinet.com Wed Feb 17 14:58:54 1999 From: tyler at infinet.com (Tyler Baker) Date: Mon Jun 7 17:09:08 2004 Subject: ModSAX (SAX 1.1) Proposal References: <01BE5942.FC221C00.jarle.stabell@dokpro.uio.no> <14026.52684.702749.764547@localhost.localdomain> Message-ID: <36CAD8D8.182E527@infinet.com> David Megginson wrote: > Jarle Stabell writes: > > > Is it obvious that one should throw exceptions on not supported? I > > believe it in many cases would be more comfortable if it returned a > > boolean supported or not instead of throwing an exception, as many > > clients probably want/need to check whether a particular feature is > > supported or not (and trade off behaviour accordingly) > > As I just mentioned in a reply to Don, exceptions provide for cleaner > code and they help avoid bugs by enabling more compile-time checking. > It's easy to write > > parser.setFeature("org.xml.sax.features.namespaces", true); > > and forget to check the return value, but it's harder to forget to > catch an exception. What David is proposing I think makes sense for most uses of SAX. 
If you need to do some sort of dynamic feature querying with the parser, you can write code like this:

  boolean namespacesAllowed = false;
  try {
    parser.setFeature("org.xml.sax.features.namespaces", true);
    namespacesAllowed = true;
  } catch (SAXNotSupportedException e) {}

  boolean validationAllowed = false;
  try {
    parser.setFeature("org.xml.sax.features.validation", true);
    validationAllowed = true;
  } catch (SAXNotSupportedException e) {}

If an exception is thrown in either of these cases, the boolean flag is not changed to true. Likewise you could do something a little more elaborate like:

  int featureFlags = Features.NONE;

  try {
    parser.setFeature("org.xml.sax.features.namespaces", true);
    featureFlags |= Features.NAMESPACES;
  } catch (SAXNotSupportedException e) {}

  try {
    parser.setFeature("org.xml.sax.features.validation", true);
    featureFlags |= Features.VALIDATION;
  } catch (SAXNotSupportedException e) {}

I have had a lot of experience in the past where I thought that using return values to indicate error flags was a good idea for this sort of thing, but the truth is exceptions are much nicer, for the reasons David stated as well as the fact that, at least in the case of checked exceptions, they make writing your code and APIs more straightforward and easier to understand.

Tyler

From Livinsb at rbos.co.uk Wed Feb 17 15:02:43 1999
From: Livinsb at rbos.co.uk (Livingstone, Stephen)
Date: Mon Jun 7 17:09:08 2004
Subject: XML conferences in the UK
Message-ID: <217258E84FF7CF11B4630001FA44B2D502CF051C@REFROWTECX1>

Thanks for that.
I didn't know about it, and I only work round the corner!! Pity all the tutorial places are gone, though.

Steven

Steven Livingstone BSc MSc GradInstP
Corporate Systems Development (TCN)
Royal Bank Of Scotland.
> *: mailto:livinsb@rbos.co.uk
> *: +44 0131 523 4354 [x24354]
> Networking Technical Associates, Glasgow, Scotland.
> *: mailto:ntw_uk@hotmail.com
> *: +44 07771-957-280
>
> -----Original Message-----
> From: Robin Cover [SMTP:robin@isogen.com]
> Sent: Wednesday, February 17, 1999 2:42 PM
> To: Kaustav Bhattacharya
> Cc: 'xml-dev@ic.ac.uk'
> Subject: Re: XML conferences in the UK
>
>
> *** Warning : this message originates from the Internet ****
>
>
> Research Dissemination Workshop:
> Markup Technologies for Computational Linguistics
> Hosted by the HCRC Language Technology Group
> University of Edinburgh
> 25, 26 February 1999
> See: http://www.ltg.ed.ac.uk/nscope/
>
> --------------------------------------------------------------------
>
> On Wed, 17 Feb 1999, Kaustav Bhattacharya wrote:
>
> > Been searching in vain on Usenet for XML groups, I doubt my company usenet
> > news feed carries any. I read on the web about many wonderful XML
> > conferences and one-day symposiums in the USA and Canada. Great! Except I'm
> > in the UK. If anyone is aware of any XML conferences going on in the UK,
> > I'd be obliged if you could direct me to the appropriate web site for
> > details of venues, dates and topics.
> >
> > Regards,
> >
> > Kaustav Bhattacharya - k.bhattacharya@bbc.co.uk
> > Web Producer - Technology Services and Support
> > British Broadcasting Corporation (BBC)
> > Woodlands, Room C118, Tel: 0181 225 9765
> > Int. Tel. Ext: 59765 - Mobile: 0780 308 0499
> >
> > xml-dev: A list for W3C XML Developers.
To post, > mailto:xml-dev@ic.ac.uk > > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > > (un)subscribe xml-dev > > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > > subscribe xml-dev-digest > > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > > > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on > CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer.. 'Internet e-mails are not necessarily secure. The Royal Bank of Scotland plc does not accept responsibility for changes made to this message after it was sent.' xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ti64877 at imcnam.sbi.com Wed Feb 17 15:27:33 1999 From: ti64877 at imcnam.sbi.com (Ingargiola, Tito) Date: Mon Jun 7 17:09:08 2004 Subject: ModSAX (SAX 1.1) Proposal Message-ID: <3994C79D0211D211A99F00805FE6DEE249BF8D@exchny15.corp.smb.com> Hello, Why is interface ModHandler empty? 
Presumably, (an implementation of) ModParser is going to need to call methods on its handlers as it goes about its business. Will it somehow know that for ModHandlers which implement, say, namespace processing, it should call a particular method (I won't even attempt to suggest what that method might be :-)?

In Java, empty "marker" interfaces appear:

        1) for pieces of "silent" functionality like Cloneable and Serializable where a standard interface simply doesn't apply, or
        2) where the variety of types of applications which will need to have a formalized contract with some piece of functionality is so broad that it is prohibitively difficult to define the appropriate method signatures for that interface; extended interfaces are expected to proliferate (e.g., EventListener & friends).

It seems to me that SAX is clearly not a case of 1), and while it may be a 2), it isn't obviously so. EventListener is nice in the regard that it makes explicit that "there exists a contract between things that propagate events and things that respond to them"; it doesn't fix that contract for us, but it sets the tone of interaction. "EventSources" of essentially any kind are provided a template of how to interact in a decoupled way with those interested in what they're up to. In the case of SAX, the "EventSource" doesn't vary all that much; in some form or another we're always dealing with an XML parser. Do we need such latitude in defining the interactions between the parser and external modules?

I'm afraid that ModHandler's current (suggested) definition is too open-ended -- it may encourage interoperability problems. The key virtue (in my opinion) of SAX is that it allows me to plug an arbitrary parser into the front of my XML processor and I'm nearly guaranteed that if it's IBM's or J. Clark's or whomever's, it's going to work. A marker interface inside the parser seems to me to threaten this happy state of affairs...

Regards,

Tito.

ps - I also "vote" with D.
Park that "ModularParser/Handler" are nicer names than "ModParser/Handler"... otherwise developers may be confused into thinking that ModParsers are just Parsers that like Vespas, bomber jackets and The Smiths... ;->

From richard at cogsci.ed.ac.uk Wed Feb 17 15:29:06 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun 7 17:09:08 2004
Subject: RXP 1.0 released
Message-ID: <14001.199902171527@doyle.cogsci.ed.ac.uk>

I am pleased to announce release 1.0 of RXP, a validating XML parser in C. Whereas previous versions were available only for individual, research and educational use, this version is licensed under the GNU Public Licence.

RXP is a product of the Language Technology Group, Human Communication Research Centre, University of Edinburgh.

RXP can be downloaded from:

  ftp://ftp.cogsci.ed.ac.uk/pub/richard/rxp.tar.gz

-- Richard

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From jarle.stabell at dokpro.uio.no Wed Feb 17 15:35:36 1999
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun 7 17:09:09 2004
Subject: Exceptions or not (Was: RE: ModSAX (SAX 1.1) Proposal)
Message-ID: <01BE5A94.40B53670.jarle.stabell@dokpro.uio.no>

David Megginson wrote:

> As I just mentioned in a reply to Don, exceptions provide for cleaner
> code

Generally I agree.

> and they help avoid bugs by enabling more compile-time checking.

I haven't used Java much, so personally I don't know whether I find these forced declarations of exception throwing a good thing. (It seems to me quite a lot of people throw in the towel and just write "throws Exception")

> It's easy to write
>
>   parser.setFeature("org.xml.sax.features.namespaces", true);
>
> and forget to check the return value, but it's harder to forget to
> catch an exception.

True. But in this case, if the parser-client wants to trade features based upon what is supported or not, I guess you really need to inspect (and perhaps store the result) for each particular feature, whether it was supported or not (or just set it in case it is supported, and not care if it is not). Then this part of the code becomes somewhat uglier with exceptions.

If the typical case is (f.i.): "I want Namespaces, X-Link, X-Schema and X-Pointer, and if I don't get all of these, I won't use you" (i.e. a conjunctive condition), then exceptions work very well, but if you don't want to throw in the towel the moment a single feature is not supported, exceptions could give uglier code than without them.
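
The trade-off described here, probing features one at a time without giving up at the first failure, can be sketched with a small helper that converts the exception into a boolean. This is only an illustration under stated assumptions: ModParser and SAXNotSupportedException below are minimal stand-ins for the interfaces proposed in this thread (not a released SAX API), and FeatureProbe, ToyParser and trySetFeature are hypothetical names.

```java
// Illustrative sketch only; none of these types are a released SAX API.
public class FeatureProbe {

    /** Stand-in for the proposed SAXNotSupportedException (assumption). */
    static class SAXNotSupportedException extends Exception {
        SAXNotSupportedException(String message) { super(message); }
    }

    /** Stand-in for the feature-setting part of the proposed ModParser (assumption). */
    interface ModParser {
        void setFeature(String featureID, boolean state) throws SAXNotSupportedException;
    }

    /** Hypothetical toy parser that recognises exactly one feature ID. */
    static class ToyParser implements ModParser {
        private final String supportedID;

        ToyParser(String supportedID) { this.supportedID = supportedID; }

        @Override
        public void setFeature(String featureID, boolean state) throws SAXNotSupportedException {
            if (!supportedID.equals(featureID)) {
                throw new SAXNotSupportedException("Unknown feature: " + featureID);
            }
            // A real parser would record the requested state here.
        }
    }

    /** Returns true if the parser accepted the request, false if it threw. */
    static boolean trySetFeature(ModParser parser, String featureID, boolean state) {
        try {
            parser.setFeature(featureID, state);
            return true;
        } catch (SAXNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ModParser parser = new ToyParser("com.foo.features.namespaces");
        boolean hasNamespaces = trySetFeature(parser, "com.foo.features.namespaces", true);
        boolean hasXLink = trySetFeature(parser, "com.foo.features.xlink", true);
        System.out.println("namespaces=" + hasNamespaces + " xlink=" + hasXLink);
    }
}
```

With a helper like this the exception-based interface stays as proposed, and a client that wants per-feature answers gets one line per feature, at the cost of discarding whatever detail the exception carried.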
Of course, this mail is only about minor details which don't matter much. If one needs to do a "lot of" reasoning based upon which features are supported, it is trivial to make a wrapper function which transforms the exception (or not) into a boolean value.

Cheers,
Jarle Stabell

From david at megginson.com Wed Feb 17 15:52:30 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:09 2004
Subject: ModSAX (SAX 1.1) Proposal
In-Reply-To: <3994C79D0211D211A99F00805FE6DEE249BF8D@exchny15.corp.smb.com>
References: <3994C79D0211D211A99F00805FE6DEE249BF8D@exchny15.corp.smb.com>
Message-ID: <14026.57951.26146.946129@localhost.localdomain>

Ingargiola, Tito writes:

> Why is interface ModHandler empty? Presumably, (an implementation
> of) ModParser is going to need to call methods on its handlers as
> it goes about its business. Will it somehow know that for
> ModHandlers which implement, say, namespace processing, that it
> should call a particular method (I won't even attempt to suggest
> what that method might be :-)?

Maybe it will help if I walk through a silly example.
Here's the interface:

  public interface PingHandler extends org.xml.sax.ModHandler
  {
    public abstract void ping ();
  }

Here's how I register it with a ModParser:

  try {
    parser.setHandler("com.megginson.handlers.ping", pingHandler);
  } catch (SAXException e) {
    System.err.println("Parser does not support Ping handlers");
  }

Here's part of my PingParser class:

  private PingHandler pingHandler;

  public void setHandler (String handlerID, ModHandler handler)
    throws SAXNotSupportedException
  {
    if (handlerID.equals("com.megginson.handlers.ping")) {
      pingHandler = (PingHandler)handler;
    } else {
      throw new SAXNotSupportedException("Unknown handler type: " + handlerID);
    }
  }

In other words, if the class recognises the handlerID, then it will know how to cast the handler; if it does not, then it should throw an exception. The ideal design pattern for this is the Chain of Responsibility pattern, where we have a series of filters between the application and the actual XML parser, all of which support the ModParser interface; if a filter does not recognise a handler type, it can pass it back to its parent, and only the root parser actually throws a SAXNotSupportedException.

> In java, empty "marker" interfaces appear:
>
> 1) for pieces of "silent" functionality like Cloneable and
> Serializable where a standard interface simply doesn't apply, or
> 2) where the variety of types of applications which will
> need to have a formalized contract with some piece of functionality is so
> broad that it is prohibitively difficult to define the appropriate method
> signatures for that interface; extended interfaces are expected to
> proliferate (e.g., EventListener & friends).
>
> It seems to me that SAX is clearly not a case of 1), and while it
> may be a 2), it isn't obviously so.

It looks to me like it hits both quite nicely.

> I'm afraid that ModHandler's current (suggested) definition is
> too open-ended -- it may encourage interoperability problems.
> The key virtue (in my opinion) of SAX is that it allows me to plug an
> arbitrary parser into the front of my XML processor and I'm nearly
> guaranteed that if it's IBM's or J. Clark's or whomever's, it's
> going to work. A marker interface inside the parser seems to me to
> threaten this happy state of affairs...

Not really -- you're guaranteed the same level of interoperability as before (because the full SAX 1.0 Parser interface is still there), but now there is a standard, layer-able method for adding new handler types and for determining whether they're supported or not. People have been expanding SAX anyway, and I just want to provide a nice, clean method for doing so.

If I had something like

  public interface ModHandler
  {
    public abstract void event (String eventID, Object eventArgs[])
      throws SAXException;
  }

then I'd put the burden of casting onto the application (many times) rather than the parser (once).

All the best,

David

-- 
David Megginson                 david@megginson.com
http://www.megginson.com/

From david at megginson.com Wed Feb 17 16:04:15 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:09 2004
Subject: Exceptions or not (Was: RE: ModSAX (SAX 1.1) Proposal)
In-Reply-To: <01BE5A94.40B53670.jarle.stabell@dokpro.uio.no>
References: <01BE5A94.40B53670.jarle.stabell@dokpro.uio.no>
Message-ID: <14026.58701.412917.526788@localhost.localdomain>

Jarle Stabell writes:

> I haven't used Java much, so personally I don't know whether I find
> this forced declarations of exception throwing a good thing.
(It > seems to me quite a lot of people throws in the towel and just > writes "throws exception") Sure, it can get to that point, but then the exception falls off the top level, and you still end up with some useful information about where your bug was, so you're still better off than you would have been. Besides, "throws Exception" requires a conscious effort to turn off the safety mechanisms. > If the typical case is (f.i.) : "I want Namespaces, X-Link, X-Schema and > X-Pointer, and if I don't get all of these, I won't use you" (ie a > conjunctive condition), then exceptions works very well, but if you don't > want to throw in the towel the moment a single feature is not supported, > exceptions could give uglier code than without them. The two seem about the same length to me. Here's the version with exceptions: boolean hasNamespaces = true; boolean hasXLink = true; boolean hasDDML = true; boolean hasXPointer = true; try { parser.setFeature("com.foo.features.namespaces", true); } catch (SAXNotSupportedException e) { hasNamespaces = false; } try { parser.setFeature("com.foo.features.xlink", true); } catch (SAXNotSupportedException e) { hasXLink = false; } try { parser.setFeature("com.foo.features.ddml", true); } catch (SAXNotSupportedException e) { hasDDML = false; } try { parser.setFeature("com.foo.features.xpointer", true); } catch (SAXNotSupportedException e) { hasXPointer = false; } Here's the version with boolean tests: boolean hasNamespaces = false; boolean hasXLink = false; boolean hasDDML = false; boolean hasXPointer = false; if (parser.supportsFeature("com.foo.features.namespaces")) { parser.setFeature("com.foo.features.namespaces", true); hasNamespaces = true; } if (parser.supportsFeature("com.foo.features.xlink")) { parser.setFeature("com.foo.features.xlink", true); hasXLink = true; } if (parser.supportsFeature("com.foo.features.ddml")) { parser.setFeature("com.foo.features.ddml", true); hasDDML = true; } if 
(parser.supportsFeature("com.foo.features.xpointer")) { parser.setFeature("com.foo.features.xpointer", true); hasXPointer = true; } All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Wed Feb 17 16:24:45 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:09:09 2004 Subject: ModSAX (SAX 1.1) Proposal (fwd) Message-ID: ---------- Forwarded message ---------- Date: Wed, 17 Feb 1999 14:33:36 +0000 (GMT) From: Dan Brickley To: David Megginson Cc: daniel.brickley@bristol.ac.uk Subject: Re: ModSAX (SAX 1.1) Proposal David, Here's a small proposal: > public abstract void setFeature (String featureID, boolean state) becomes public abstract void setFeature (URI featureID, boolean state) (or leave as String but with understanding that the datatype of featureID is anything that's a legal Web URI) So instead of "org.xml.sax.myfeature" we'd pass in tokens such as "urn:sax-features:myfeature", "http://sax.org/features/myfeature", "uuid:434234-2342342-23423423" or whatever. Anything that is a legal URI as per the URI RFC. Whether these (ever) dereference to anything is a separate issue. I'm assuming that you'll need a URI class for namespace handling anyway (or does Java 2.0 have this alongside the Java class 'URL'?) so I don't think I'm proposing bloat. Justification: setFeature needs to be passed an identifying string that uniquely picks out some property. Rather than go with the pseudo-uri's in Java, it would be more in the spirit of XML and Web to use URIs. 
Using URIs is also more friendly towards non-Java implementations of the SAX API. What do you think? It buys you management of the space of featureID names, at cost of inevitable arguments about what you'll get if you de-reference the URI. If not, what should be said about legal featureID values? Dan On Mon, 15 Feb 1999, David Megginson wrote: > Love it or hate it, here goes... > > I propose that SAX 1.1 (aka ModSAX) add the following three core > classes and interfaces to SAX 1.0 in the org.xml.sax package: > > > public interface ModParser extends Parser > { > public abstract void setFeature (String featureID, boolean state) > throws SAXNotSupportedException; > > public abstract void setHandler (String handlerID, ModHandler handler) > throws SAXNotSupportedException; > } > > > public interface ModHandler > { > // yup, it's empty... > } > > > public class SAXNotSupportedException extends SAXException > { > // yup, it's empty... > } > > > I'm going to disappear now until Wednesday, so I'll let everyone chew > on these for a while until I have time to offer explanations and > arguments later this week. > > No, I haven't forgotten namespaces, or DTD information, or lexical > information, or anything else. > > > All the best, > > > David > > -- > David Megginson david@megginson.com > http://www.megginson.com/ > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jarle.stabell at dokpro.uio.no Wed Feb 17 16:25:38 1999 From: jarle.stabell at dokpro.uio.no (Jarle Stabell) Date: Mon Jun 7 17:09:09 2004 Subject: Exceptions or not (Was: RE: ModSAX (SAX 1.1) Proposal) Message-ID: <01BE5A9B.2D376B70.jarle.stabell@dokpro.uio.no> David Megginson wrote: > Jarle Stabell writes: > > > I haven't used Java much, so personally I don't know whether I find > > this forced declarations of exception throwing a good thing. (It > > seems to me quite a lot of people throws in the towel and just > > writes "throws exception") > > Sure, it can get to that point, but then the exception falls off the > top level, and you still end up with some useful information about > where your bug was, so you're still better off than you would have > been. True. I love exceptions and find that they greatly improves the robustness of applications, but the reason I'm not convinced about whether it is good to be forced to specify what will be thrown is that this in many cases seem to require psychic powers of the designer (or that the "real" exceptions must be catched and converted into an "acceptable" one, which looses information). > Besides, "throws Exception" requires a conscious effort to turn off > the safety mechanisms. Not when you are desperate or in a hurry! 
;-)

> boolean hasNamespaces = true;
> boolean hasXLink = true;
> boolean hasDDML = true;
> boolean hasXPointer = true;
>
> try {
>   parser.setFeature("com.foo.features.namespaces", true);
> } catch (SAXNotSupportedException e) {
>   hasNamespaces = false;
> }
>
> try {
>   parser.setFeature("com.foo.features.xlink", true);
> } catch (SAXNotSupportedException e) {
>   hasXLink = false;
> }
>
> try {
>   parser.setFeature("com.foo.features.ddml", true);
> } catch (SAXNotSupportedException e) {
>   hasDDML = false;
> }
>
> try {
>   parser.setFeature("com.foo.features.xpointer", true);
> } catch (SAXNotSupportedException e) {
>   hasXPointer = false;
> }

I meant that setFeature should return a boolean, without necessarily introducing a separate supportsFeature method. Then one could use something like (haven't used Java in a while, so this may contain errors!):

  boolean hasNamespaces = parser.setFeature("com.foo.features.namespaces", true);
  boolean hasXLink = parser.setFeature("com.foo.features.xlink", true);
  boolean hasDDML = parser.setFeature("com.foo.features.ddml", true);
  boolean hasXPointer = parser.setFeature("com.foo.features.xpointer", true);

and then you could analyze the results, or put some analysis in between the above calls.

Cheers,
Jarle Stabell

From Daniel.Brickley at bristol.ac.uk Wed Feb 17 16:30:07 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun 7 17:09:09 2004
Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd)
Message-ID:

This and previous forwarded from an (accidentally!) off-list discussion.
To recap, the proposal is that SAX feature identifiers are typed as URIs (i.e. a superset of URLs and URNs) to give a globally managed namespace, without associating additional semantics with those names.

Dan

---------- Forwarded message ----------
Date: Wed, 17 Feb 1999 09:50:22 -0500 (EST)
From: David Megginson
To: Dan Brickley
Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal)

Dan Brickley writes:

> Here's a small proposal:
>
> > public abstract void setFeature (String featureID, boolean state)
>
> becomes public abstract void setFeature (URI featureID, boolean state)
>
> (or leave as String but with understanding that the datatype of
> featureID is anything that's a legal Web URI)

Yes, using URIs was my original idea, and I'm still very partial to it; I've been using the Java-package-style names in my examples, but I'd like to know what everyone thinks about the issue. I'd prefer not to use the java.net.URL class, though -- I don't think it buys us anything, since the URIs are not meant to be dereferenced.

> Justification:
> setFeature needs to be passed an identifying string that
> uniquely picks out some property. Rather than go with the
> pseudo-uri's in Java, it would be more in the spirit of XML
> and Web to use URIs. Using URIs is also more friendly towards
> non-Java implementations of the SAX API.

I'm strongly inclined to agree. Does anyone have a strong case against?

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
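The two negotiation styles discussed in this thread (an exception for unsupported features, as in the proposal, versus a boolean result, as Jarle suggests) can be compared side by side. This is only a sketch under stated assumptions: SketchParser, NotSupportedException and the example.org feature URIs are all made up for illustration, and the boolean variant is given a separate name (trySetFeature) so that both can coexist in one class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; none of these types are part of SAX. */
final class FeatureNegotiation {

    /** Simplified stand-in for the proposed SAXNotSupportedException. */
    static class NotSupportedException extends Exception {
        NotSupportedException(String message) { super(message); }
    }

    /** Toy parser that recognises a fixed set of feature IDs. */
    static final class SketchParser {
        private final Set<String> known;
        private final Set<String> enabled = new HashSet<>();

        SketchParser(String... knownFeatureIds) {
            known = new HashSet<>(Arrays.asList(knownFeatureIds));
        }

        /** Exception style: an unknown ID is rejected with an exception. */
        void setFeature(String featureId, boolean state) throws NotSupportedException {
            if (!known.contains(featureId)) {
                throw new NotSupportedException("unsupported feature: " + featureId);
            }
            if (state) {
                enabled.add(featureId);
            } else {
                enabled.remove(featureId);
            }
        }

        /** Boolean style: true if the request was honoured, false otherwise. */
        boolean trySetFeature(String featureId, boolean state) {
            try {
                setFeature(featureId, state);
                return true;
            } catch (NotSupportedException e) {
                return false;
            }
        }

        boolean isEnabled(String featureId) {
            return enabled.contains(featureId);
        }
    }

    public static void main(String[] args) {
        // Made-up feature URIs; they are identifiers only and are never fetched.
        SketchParser parser = new SketchParser("http://example.org/features/namespaces");
        boolean hasNamespaces = parser.trySetFeature("http://example.org/features/namespaces", true);
        boolean hasXLink = parser.trySetFeature("http://example.org/features/xlink", true);
        System.out.println("namespaces=" + hasNamespaces + " xlink=" + hasXLink);
    }
}
```

The boolean form is shorter at the call site, while the exception form cannot be ignored by accident; which matters more is the design question the thread is debating.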
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tms at ansa.co.uk Wed Feb 17 16:38:25 1999 From: tms at ansa.co.uk (Toby Speight) Date: Mon Jun 7 17:09:09 2004 Subject: Exceptions or not In-Reply-To: Jarle Stabell's message of "Wed, 17 Feb 1999 16:40:38 +0100" References: <01BE5A94.40B53670.jarle.stabell@dokpro.uio.no> Message-ID: Jarle> Jarle Stabell 0> In article <01BE5A94.40B53670.jarle.stabell@dokpro.uio.no>, Jarle 0> wrote: Jarle> (It seems to me quite a lot of people throws in the towel and Jarle> just writes "throws exception") Or, equally bad: try { foo(...); } catch (Exception e) { // ignore it } xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msabin at cromwellmedia.co.uk Wed Feb 17 16:44:20 1999 From: msabin at cromwellmedia.co.uk (Miles Sabin) Date: Mon Jun 7 17:09:09 2004 Subject: Exceptions or not (Was: RE: ModSAX (SAX 1.1) Proposal) Message-ID: Jarle Stabell wrote, > True. I love exceptions and find that they greatly > improves the robustness of applications, but the > reason I'm not convinced about whether it is good > to be forced to specify what will be thrown is that > this in many cases seem to require psychic powers of > the designer (or that the "real" exceptions must be > catched and converted into an "acceptable" one, which > looses information). 
What you need is (to coin a phrase) an 'exception tunnel'. For example,

    import org.xml.sax.SAXException;

    public class SAXExceptionTunnel extends SAXException
    {
      private final Throwable itsThrowable;

      public SAXExceptionTunnel(Throwable throwable)
      {
        // Give the tunnel a message so getMessage() still says something useful.
        super(String.valueOf(throwable));
        itsThrowable = throwable;
      }

      public Throwable getThrowable()
      {
        return itsThrowable;
      }

      // Rethrows the wrapped exception with its original type.
      public void rethrow()
        throws Throwable
      {
        throw itsThrowable;
      }
    }

You could use it like this,

    public class Foo
    {
      public void someMethod()
        throws SAXException
      {
        try {
          // do some stuff
        } catch (SomeWeirdException ex) {
          throw new SAXExceptionTunnel(ex);
        }
      }
    }

    // The enclosing method must declare "throws Throwable" because of rethrow().
    Foo someFoo = new Foo();
    try {
      someFoo.someMethod();
    } catch (SAXExceptionTunnel ex) {
      // deal with tunnelled exceptions here
      System.err.println(ex.getThrowable().getMessage());
      ex.rethrow();
    } catch (SAXException ex) {
      // deal with ordinary exceptions here
    }

I've found this idiom extremely useful in many situations where I've wanted to register callback handlers without either imposing restrictions on them (i.e. no IOExceptions allowed to leak out of the handler) or having to use ludicrously general throws clauses.

Cheers,

Miles

--
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)181 410 2230               London, W6 0LJ
msabin@cromwellmedia.co.uk        England

From ti64877 at imcnam.sbi.com Wed Feb 17 17:11:28 1999
From: ti64877 at imcnam.sbi.com (Ingargiola, Tito)
Date: Mon Jun 7 17:09:09 2004
Subject: ModSAX (SAX 1.1) Proposal
Message-ID: <3994C79D0211D211A99F00805FE6DEE249BF8E@exchny15.corp.smb.com>

Hello,

Thank you for your response and the example you provided.
The rationale you provide for having a marker interface is good in that it provides full generality -- *any* kind of handler can be attached to the parser to do *any* kind of thing. Ok. My feeling is that while there may be a slew of PingHandler -like handlers which will do who-knows-what, there will also be a (more common and useful!) set of *Handlers which will do the same kind of things; it would be unfortunate if handlers which did the same work (e.g., validation) had different interfaces simply because no structure was provided for the implementors to follow. (Will J Clark's validation handler have the same interface as IBM's? unlikely. Will they be interchangeable? -- iff my parser knows about both of them...). I see two ways around this difficulty. The first is to define a set of subinterfaces of ModHandler which are adapted for particular uses. This way we have a shared known set of Handlers for our most common uses. Another means to the same end is to have a mechanism by which a parser can lookup (perhaps using the URI mechanism you and D. Brickley have been discussing) specific types of *ModHandlers. Thus a parser can learn about a particular kind of handler on the fly. Both approaches have limitations, I know. The first is difficult because it's hard to define those interfaces a priori. The second, at least looking at CORBA's Dynamic Invocation mechanism as an example, tends to be overly complex and messy. For these reasons, I could certainly see deciding against them, leading us back to (something like) what you proposed originally... but it seems to me both possible and desirable to provide *some* structure in the form of pre-defined ModHandlers. Best Regards, Tito. xml-dev: A list for W3C XML Developers. 
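Tito's first suggestion, a shared subinterface of ModHandler for a common task, might look roughly like the sketch below. Everything except the empty ModHandler marker is an assumption: ValidationHandler, its invalid method, and the two implementing classes are invented for illustration and were not proposed on the list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a shared handler subinterface. */
final class SharedHandlerSketch {

    /** Empty marker interface, redeclared here so the sketch compiles on its own. */
    interface ModHandler {}

    /** Invented shared subinterface; the method is an assumption for illustration. */
    interface ValidationHandler extends ModHandler {
        void invalid(String elementName, String reason);
    }

    /** One implementer's handler: collects the reports as strings. */
    static final class CollectingValidator implements ValidationHandler {
        final List<String> reports = new ArrayList<>();

        public void invalid(String elementName, String reason) {
            reports.add(elementName + ": " + reason);
        }
    }

    /** Another implementer's handler: only counts the reports. */
    static final class CountingValidator implements ValidationHandler {
        int count;

        public void invalid(String elementName, String reason) {
            count++;
        }
    }

    /** Code written against the shared interface accepts either implementation. */
    static void report(ValidationHandler handler) {
        handler.invalid("para", "not allowed inside title");
    }

    public static void main(String[] args) {
        CollectingValidator first = new CollectingValidator();
        CountingValidator second = new CountingValidator();
        report(first);
        report(second);
        System.out.println(first.reports + " / " + second.count);
    }
}
```

The point of the sketch is only interchangeability: the caller never needs to know which implementer supplied the handler.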
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From shecter at darmstadt.gmd.de Wed Feb 17 17:25:07 1999 From: shecter at darmstadt.gmd.de (Robb Shecter) Date: Mon Jun 7 17:09:09 2004 Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd) References: Message-ID: <36CAFB14.D3D82EB8@darmstadt.gmd.de> > From: David Megginson > > Justification: > > setFeature needs to be passed an identifying string that > > uniquely picks out some property. Rather than go with the > > pseudo-uri's in Java, it would be more in the spirit of XML > > and Web to use URIs. Using URIs is also more friendly towards > > non-Java implementations of the SAX API. > > I'm strongly inclined to agree. Does anyone have a strong case > against? Hi, I'm joining the discussion a bit late, but it seems to me that allowing something that looks like a URL that's not a URL is a bad idea. This seems to have introduced unnecessary confusion into namespaces: It makes an accompanying explanation and admonishment necessary ("Don't assume there's anything at this 'not-URL' "). From a software engineering point of view, I think it'd be better to chose something that doesn't require the extra documentation - Instead of saying, "Watch out for the problem here...", we should not create the problem in the first place. I think that the Java standard is very good. I don't think that it's unfriendly towards non-Java implementations: it is after all, only a standard, and not hardcoded into the language. The main problem with it is that users who do not have a measure of authority at their organization can have problems. 
So, for example, I've released some open source code with the package name: org.acm.robb ...because I have the e-mail address robb@acm.org. That works until the ACM decides to put a host named "robb" on the acm network. (OK, that may not happen :) - but the point is valid.) So, I'm for a clear format, maybe Java-like, that --doesn't-- resemble a URL, that solves the above issue. - Robb xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 17 17:32:13 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:09 2004 Subject: ModSAX (SAX 1.1) Proposal In-Reply-To: <3994C79D0211D211A99F00805FE6DEE249BF8E@exchny15.corp.smb.com> References: <3994C79D0211D211A99F00805FE6DEE249BF8E@exchny15.corp.smb.com> Message-ID: <14026.64235.923222.527824@localhost.localdomain> Ingargiola, Tito writes: > Thank you for your response and the example you provided. The > rationale you provide for having a marker interface is good in that > it provides full generality -- *any* kind of handler can be > attached to the parser to do *any* kind of thing. Ok. > My feeling is that while there may be a slew of PingHandler -like > handlers which will do who-knows-what, there will also be a (more > common and useful!) set of *Handlers which will do the same kind of > things; it would be unfortunate if handlers which did the same work > (e.g., validation) had different interfaces simply because no > structure was provided for the implementors to follow. 
Actually, I wasn't planning on including a validation handler at all -- I'm not even certain what such a thing would do except receive error reports, and that's already covered in the SAX 1.0 ErrorHandler interface. In my design, validation would simply be a feature to be turned on and off; however, if someone else thought up a better approach, they would be free to implement it.

> I see two ways around this difficulty. The first is to define a
> set of subinterfaces of ModHandler which are adapted for particular
> uses.

I expect there to be a 1:1 mapping between handler IDs and handler interfaces. For example, "http://xml.org/sax/handlers/namespaces" would always require an object that implements the org.xml.sax.handler.NamespaceHandler interface. I do plan to include a few basic interfaces (LexicalHandler, NamespaceHandler, etc.), but their use will be entirely optional. To this extent, then, it looks like we agree.

> Another means to the same end is to have a mechanism by which a
> parser can lookup (perhaps using the URI mechanism you and
> D. Brickley have been discussing) specific types of *ModHandlers.
> Thus a parser can learn about a particular kind of handler on the
> fly.

[ and then, on the disadvantages]

> at least looking at CORBA's Dynamic Invocation mechanism as an
> example, tends to be overly complex and messy.

Agreed -- it has to be simple. Furthermore, we get into some very nasty security problems with this sort of thing, and it's hard to port across languages. I'd use URIs only as unique identifiers, not with the expectation of being able to grab a resource using them.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
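The 1:1 mapping between handler IDs and handler interfaces described above could be enforced on the parser side along these lines. This is a sketch, not the proposed API: the handler interfaces are redeclared locally with invented methods, NotSupportedException stands in for SAXNotSupportedException, and the ".../handlers/lexical" ID is an assumption made by analogy with the namespaces ID quoted in the message.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a parser that ties each handler ID to one interface. */
final class HandlerRegistrySketch {

    static final String NAMESPACES_ID = "http://xml.org/sax/handlers/namespaces";
    static final String LEXICAL_ID = "http://xml.org/sax/handlers/lexical"; // assumed ID

    interface ModHandler {}

    /** Invented method sets; only the interface names come from the thread. */
    interface NamespaceHandler extends ModHandler {
        void startNamespace(String prefix, String uri);
    }

    interface LexicalHandler extends ModHandler {
        void comment(String text);
    }

    /** Simplified stand-in for the proposed SAXNotSupportedException. */
    static class NotSupportedException extends Exception {
        NotSupportedException(String message) { super(message); }
    }

    static final class SketchParser {
        private final Map<String, Class<? extends ModHandler>> required = new HashMap<>();
        private final Map<String, ModHandler> installed = new HashMap<>();

        SketchParser() {
            required.put(NAMESPACES_ID, NamespaceHandler.class);
            required.put(LEXICAL_ID, LexicalHandler.class);
        }

        /** Accepts the handler only if it implements the interface tied to the ID. */
        void setHandler(String handlerId, ModHandler handler) throws NotSupportedException {
            Class<? extends ModHandler> type = required.get(handlerId);
            if (type == null) {
                throw new NotSupportedException("unknown handler ID: " + handlerId);
            }
            if (!type.isInstance(handler)) {
                throw new NotSupportedException(handlerId + " requires " + type.getSimpleName());
            }
            installed.put(handlerId, handler);
        }

        boolean hasHandler(String handlerId) {
            return installed.containsKey(handlerId);
        }
    }

    public static void main(String[] args) throws NotSupportedException {
        SketchParser parser = new SketchParser();
        NamespaceHandler ns = (prefix, uri) -> System.out.println(prefix + " -> " + uri);
        parser.setHandler(NAMESPACES_ID, ns);
        try {
            parser.setHandler(LEXICAL_ID, ns); // wrong interface for this ID
        } catch (NotSupportedException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the ID is only a key into a table, nothing here ever dereferences the URI, which matches the "unique identifiers only" position taken in the message.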
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Daniel.Brickley at bristol.ac.uk Wed Feb 17 19:47:20 1999 From: Daniel.Brickley at bristol.ac.uk (Dan Brickley) Date: Mon Jun 7 17:09:10 2004 Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd) In-Reply-To: <36CAFB14.D3D82EB8@darmstadt.gmd.de> Message-ID: On Wed, 17 Feb 1999, Robb Shecter wrote: > > From: David Megginson > > > Justification: > > > setFeature needs to be passed an identifying string that > > > uniquely picks out some property. Rather than go with the > > > pseudo-uri's in Java, it would be more in the spirit of XML > > > and Web to use URIs. Using URIs is also more friendly towards > > > non-Java implementations of the SAX API. > > > > I'm strongly inclined to agree. Does anyone have a strong case > > against? > > Hi, > > I'm joining the discussion a bit late, but it seems to me that allowing something that looks > like a URL that's not a URL is a bad idea. This seems to have introduced unnecessary > confusion into namespaces: It makes an accompanying explanation and admonishment necessary That was not the proposal. The proposal was to allow SAX features to be individuated using Uniform Resource Identifiers (URIs). URIs are a superset of URLs and URNs, so a non-URL URI might be used. The XML-Data proposal, admirably, used uuid: URIs to make this point. (eg. urn:uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882/ ) Excerpt from RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax available from http://www.isi.edu/in-notes/rfc2396.txt A Uniform Resource Identifier (URI) is a compact string of characters for identifying an abstract or physical resource. [...] 
This document defines a grammar that is a superset of all valid URI, such that an implementation can parse the common components of a URI reference without knowing the scheme-specific requirements of every possible identifier type.

> ("Don't assume there's anything at this 'not-URL' "). From a software engineering point of

(I'd argue that software engineers who write code that assumes all URLs can be unproblematically de-referenced are asking for trouble, but that's beside the point.)

> view, I think it'd be better to choose something that doesn't require the extra documentation
> - Instead of saying, "Watch out for the problem here...", we should not create the problem in
> the first place.

We are not creating a problem. It is fine to use a URL to identify a SAX property, but by choosing to allow _all_ forms of URI we leave room for other approaches. This echoes the approach taken by XML namespaces. Again, from the URI spec:

   Resource
      A resource can be anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources

...Java classes, Perl modules and SAX filters are equally 'resources' by this definition.

> I think that the Java standard is very good. I don't think that it's unfriendly towards
> non-Java implementations: it is after all, only a standard, and not hardcoded into the
> language.

I disagree. The string 'util.tools.png' is as meaningless in the Java community as in the wider world. The Java package naming convention has the look but not the substance of a hierarchically managed namespace. 'util.tools.png' does not uniquely name anything except within the context of a group of consenting adults who've agreed on a set of conventions for doing so. I don't think the Java world has agreed to do this yet.
The Web approach, using URIs, seems more mature than Java's way of doing so. (Maybe a URI scheme for Java package/class naming might be on the cards for Java 3...?)

> The main problem with it is that users who do not have a measure of authority at
> their organization can have problems. So, for example, I've released some open source code
> with the package name:
>
> org.acm.robb
>
> ...because I have the e-mail address robb@acm.org. That works until the ACM decides to put a
> host named "robb" on the acm network. (OK, that may not happen :) - but the point is valid.)

Quite! That's the problem URIs address, at least within the context of the Web.

>
> So, I'm for a clear format, maybe Java-like, that --doesn't-- resemble a URL, that solves the
> above issue.

URIs do this. Though I really don't see a problem with using URIs that are URLs, so long as we bear in mind that knowing the URL for a resource is not, and never has been, a guarantee that you'll be able to de-reference it. For example, there may well be an http://intranet.whitehouse.gov/ resource in URL (and hence URI) space, but I'm unlikely to ever be able to access it. Similarly, http://intranet.whitehouse.gov/identifiers/SecretSaxFilter might serve to name a SAX module.

The fact that HTTP allows for content negotiation of language- and format-specific views into resources in the http://* namespace also buys some room to think. You could have English, French and German HTML docs, or text/xml or application/java-serialised-class or whatever, all accessible (or not; dereferencing isn't a right) via the same abstract URL. URLs can be a lot more abstract than people give credit for...

Dan

PS: one last excerpt from the URI spec; sorry if I'm doing this point to death!

from section 1.2. URI, URL, and URN

   Although many URL schemes are named after protocols, this does not imply that the only way to access the URL's resource is via the named protocol.
Gateways, proxies, caches, and name resolution services might be used to access some resources, independent of the protocol of their origin, and the resolution of some URL may require the use of more than one protocol (e.g., both DNS and HTTP are typically used to access an "http" URL's resource when it can't be found in a local cache). xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 17 20:14:48 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:10 2004 Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd) In-Reply-To: References: <36CAFB14.D3D82EB8@darmstadt.gmd.de> Message-ID: <14027.7719.58692.373831@localhost.localdomain> Dan Brickley writes: > That was not the proposal. The proposal was to allow SAX features > to be individuated using Uniform Resource Identifiers (URIs). URIs > are a superset of URLs and URNs, so a non-URL URI might be > used. The XML-Data proposal, admirably, used uuid: URIs to make > this point. (eg. urn:uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882/ ) Thanks to Dan for pointing that out to everyone. My comments that follow are addressed not to Dan's points (with which I agree) but to the question of URNs in general. URNs are not suitable for use right now because there is no specification available on how to use them. That means that http://www.megginson.com/ns/foo can be guaranteed unique, but urn:inet:megginson.com:ns:foo cannot (because there's no standard for what 'inet' means). 
In other words, URNs might be better some day, but they're worthless to us now, and given that people have been talking about them for most of the 1990s without really producing anything, I'm becoming doubtful that they'll ever see the light of day. Can anyone correct me or provide more information?

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From cowan at locke.ccil.org Wed Feb 17 20:42:51 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:10 2004
Subject: URIs as IDs in ModSAX
References:
Message-ID: <36CB29A6.9EEB1E37@locke.ccil.org>

Dan Brickley wrote:

> I disagree. The string 'util.tools.png' is as meaningless in the Java
> community as in the wider world. The Java package naming convention has
> the look but not the substance of a hierarchically managed namespace.
> 'util.tools.png' does not uniquely name anything except within the
> context of a group of consenting adults who've agreed a set of
> conventions for doing so.

Well, that name is in the "private use" part of the Java package namespace. But "org.ccil.cowan.sax.DOMParser" *is* hierarchically managed: "org.ccil" is a reversed domain name, by the rules of clause 7.7 of the JLS[1]; the "cowan" part is assigned to me by the ccil.org Java package agreement; and the rest is up to me.

From which we learn that every domain ought to have a Java package assignment convention.

[1] http://java.sun.com/docs/books/jls/html/7.doc.html#40169

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Feb 17 20:45:38 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:10 2004 Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd) References: <36CAFB14.D3D82EB8@darmstadt.gmd.de> <14027.7719.58692.373831@localhost.localdomain> Message-ID: <36CB2A17.77CE94A2@locke.ccil.org> David Megginson wrote: > urn:inet:megginson.com:ns:foo > > cannot (because there's no standard for what 'inet' means). Cannot and should not be, because there's no assurance that someone not David may not have megginson.com in the future, and might make conflicting URN assignments. Spam.org has had four owners that I know of. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
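The reversed-domain-name convention John Cowan cites earlier in the thread can be illustrated in a few lines. This is a deliberately simplified sketch: the class PackagePrefix and its method are hypothetical, and the additional adjustments the JLS describes for domain labels that are not valid Java identifiers are ignored here.

```java
import java.util.Locale;

/** Hypothetical helper: turns a domain name into a reversed package prefix. */
final class PackagePrefix {
    private PackagePrefix() {}

    /** Reverses the dot-separated labels of a domain name, e.g. "ccil.org" becomes "org.ccil". */
    static String fromDomain(String domain) {
        String[] labels = domain.toLowerCase(Locale.ROOT).split("\\.");
        StringBuilder prefix = new StringBuilder();
        for (int i = labels.length - 1; i >= 0; i--) {
            prefix.append(labels[i]);
            if (i > 0) {
                prefix.append('.');
            }
        }
        return prefix.toString();
    }

    public static void main(String[] args) {
        // A personal sub-package under acm.org and a host named "robb" on that
        // domain produce the same string, which is the clash raised earlier.
        System.out.println(fromDomain("acm.org") + ".robb");
        System.out.println(fromDomain("robb.acm.org"));
    }
}
```

The sketch also makes the ownership caveat concrete: the prefix is only as stable as the domain registration it is derived from.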
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 17 21:45:30 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:10 2004 Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd) In-Reply-To: <36CB2A17.77CE94A2@locke.ccil.org> References: <36CAFB14.D3D82EB8@darmstadt.gmd.de> <14027.7719.58692.373831@localhost.localdomain> <36CB2A17.77CE94A2@locke.ccil.org> Message-ID: <14027.14305.960707.429826@localhost.localdomain> John Cowan writes: > > urn:inet:megginson.com:ns:foo > > > > cannot (because there's no standard for what 'inet' means). > > Cannot and should not be, because there's no assurance that > someone not David may not have megginson.com in the future, > and might make conflicting URN assignments. > > Spam.org has had four owners that I know of. But there's also no standard for building URNs on uuids, ISBNs, or anything else -- there's just no standard, period. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Ruth at jpl.nasa.gov Wed Feb 17 22:00:21 1999 From: Ruth at jpl.nasa.gov (Ruth Bergman) Date: Mon Jun 7 17:09:10 2004 Subject: from a new member In-Reply-To: <8EE756E49A17D21194860008C7F49AFE0117DD1F@TWRMSG01> Message-ID: <3.0.3.32.19990217140113.00930be0@pop.jpl.nasa.gov> At 07:58 PM 2/17/1999 +0530, Prashanth Lakshmi Narayanan wrote: > i wish some of you could give me pointers as to what sort of application i >should >choose to develop (keeping the time constraint of 2 weeks in mind) so that i >can >bring out the power of xml. > thanks in advance. Like you I am new to xml and I have been doing very much the same thing. I think there are two directions you can think of for choosing an application: something everybody would find useful or something specific to your company's business. If you choose to go with something specific to your company I, obviously, cannot help, but I have an idea for an application that may have a broad interest. The idea is a resume management and styling application. Let me elaborate. I think an xml resume is useful because it would let you store all the relevant information in ONE document. Sometimes it seems that I have more versions of my resume than there are jobs in the universe. An XML resume would have all your work experience, education, references, skills, publications, personal information, etc. You can tag the information (I was thinking with attributes) as to what type of skill it is or what position it may be relevant for. Once you have this complete document you can create a resume to suit each need using stylesheets. 
It is a natural example to showcase the usefulness of stylesheets, because resumes have a few typical looks. Most people already copy another resume and substitute their information. It is very easy to set up such a resume and stylesheets (I have an example if you're interested). A full application, however, should be accessible to the general public. Thus I think it should provide authoring that hides the xml. I am thinking about dtd driven authoring. (Are there xml editors that provide this already? You'd think so, but all the ones I've tried are non-validating or crash.) It would be interesting to hear from more experienced members on this list if they think xml resumes will sell (that is, if they will sell managers on xml). Cheers, Ruth. ------------------------------------------------------------------ Dr. Ruth Bergman ruth@jpl.nasa.gov NASA Jet Propulsion Laboratory Mail Stop 301-180 4800 Oak Grove Dr. Pasadena, CA 91109-8099 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Feb 17 22:21:07 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:10 2004 Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd) References: <36CAFB14.D3D82EB8@darmstadt.gmd.de> <14027.7719.58692.373831@localhost.localdomain> <36CB2A17.77CE94A2@locke.ccil.org> <14027.14305.960707.429826@localhost.localdomain> Message-ID: <36CB40B0.F8C85B9E@locke.ccil.org> David Megginson wrote: > But there's also no standard for building URNs on uuids, ISBNs, or > anything else -- there's just no standard, period. Agreed. 
My point was that there *should* be no way to construct an URN from a domain name alone (domain name plus creation date, anybody?). -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 17 22:37:06 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:10 2004 Subject: from a new member Message-ID: Ruth wrote: > Let me elaborate. I think an xml resume is useful because it > would let you > store all the relevant information in ONE document. I am currently putting together exactly such a thing!! I was looking for an application that would provide a public site that would 'run itself', the idea being that people could upload stuff and others could download it either as raw XML or as styled XML, but with no intervention on our part other than perhaps the odd enhancement. We have all the components we need from another project, but that project only 'runs' one way - i.e. we disseminate information, but we don't retrieve it. Hence ... the CV idea! With a few different stylesheets you could have a plain CV, a summary CV and a brightly coloured one with your photo in. You could even have a stylesheet that exports your personal details as a vCard (there's an XML DTD for that). So, if you are prepared to wait a month or so, I'd say hang fire. And if you - or anyone else - wants to suggest what elements and attributes should be in a CV then let me know. 
I'd like to eventually create a CV DTD. Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net
From donpark at quake.net Wed Feb 17 22:49:14 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:10 2004
Subject: ModSAX (SAX 1.1) Proposal
Message-ID: <00da01be5ac7$8055c910$2ee044c6@arcot-main>

David, >What exactly do you mean by a "module"? A feature by a different name, seen from the feature provider's point of view. I could just as well have used "Feature" instead of "Module". I just assumed that the "Mod" in "ModSAX" was for "Module". >In general, we don't have get* methods for org.xml.sax.Parser -- is >there a special reason that we need one here? It is just my style, a yin and yang sort of thing. I did not see any compelling reason not to have it, and there are some benefits, such as chain (as in chain of responsibilities) management. >It certainly looks cleaner than checking a lot of boolean return >values, and it provides stronger compile-time checking as well. True if the feature in question is mandatory. Code ends up a little messier if the feature is optional. If I just want to install different handler types depending on whether a feature is available or not, I end up with many try-catch islands. Anyway, I like your latest proposal. Best, Don Park Docuverse xml-dev: A list for W3C XML Developers. 
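The "try-catch islands" concern for optional features can be made concrete with a small sketch. It uses Python's standard-library xml.sax module purely as a stand-in: that module postdates this thread and is not the ModSAX interface under discussion, and the helper name try_feature and the example.org feature name are hypothetical. The idea is only that one wrapper can turn the exception-based style into a boolean for features that are optional.

```python
from xml.sax import (make_parser, SAXNotRecognizedException,
                     SAXNotSupportedException)
from xml.sax.handler import feature_namespaces, feature_validation


def try_feature(parser, name, state=True):
    """Return True if the parser accepted the setting, False otherwise.

    Hypothetical helper: it folds the per-feature try/except blocks
    into a single place so that callers can branch on a boolean.
    """
    try:
        parser.setFeature(name, state)
    except (SAXNotRecognizedException, SAXNotSupportedException):
        return False
    return True


parser = make_parser()  # the standard library's expat-based reader

# Feature names are plain URI strings; the last one is made up for illustration.
wanted = (
    feature_namespaces,
    feature_validation,
    "http://example.org/features/made-up",
)
supported = {name: try_feature(parser, name) for name in wanted}

for name, accepted in supported.items():
    print(("enabled  " if accepted else "skipped  ") + name)
```

A mandatory feature would simply call setFeature directly and let the exception propagate, which is the case where the exception style reads more cleanly.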
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rick at activated.com Thu Feb 18 02:55:41 1999 From: rick at activated.com (Rick Ross) Date: Mon Jun 7 17:09:10 2004 Subject: Immediate opening for XSL Pro! Message-ID: <36CB811B.EE0A3439@activated.com> Sorry to send this to the list, but a pressing need has emerged. I have an outstanding position available for one, or perhaps two, Java developers who are genuinely skilled in server-side application development, OO-design, XML and most importantly XSL. The contract work will last at least 4-5 weeks, one of which will require travel to Budapest, Hungary. Pay is excellent, but skill, professionalism, and a successful result are required. If you are interested and feel you have the qualifications, then please email me directly at rick@activated.com. Thanks, Rick Ross Activated Intelligence xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Thu Feb 18 09:12:33 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:09:10 2004 Subject: CV/Resume Message-ID: <5F052F2A01FBD11184F00008C7A4A80001136B85@eukbant101.ericsson.se> Looks like I'm not the only one doing this... A couple of things to be aware of are that US Resume's can be quite different to UK ones. 
I found this out when designing my DTD. If you want my DTD (it's publicly available under the Artistic licence, so feel free to modify it and send back changes), mail me at msergeant@ndirect.co.uk, as it's at home and I'm at work ATM. One thing I don't have yet is any sort of style sheet, but I do have in progress a CGI (well, mod_perl really) script for editing CVs online. Unfortunately I don't have a scriptable web site, so I can't demo it. When it's complete it will be free, though. Matt. -- http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++
From Livinsb at rbos.co.uk Thu Feb 18 09:59:02 1999
From: Livinsb at rbos.co.uk (Livingstone, Stephen)
Date: Mon Jun 7 17:09:11 2004
Subject: FW: CV/Resume
Message-ID: <217258E84FF7CF11B4630001FA44B2D502CF0527@REFROWTECX1>

> I've been working on doing this as well, using XML, XSL and IE 5. > > Assuming the list forwards attachments, here is what I have done so > far. > > It's not that advanced, but DOES work across ALL browsers assuming > your server is IIS 4.0 with MSXML registered. > > Steven > > > > Steven Livingstone BSc MSc GradInstP > Corporate Systems Development (TCN) > Royal Bank Of Scotland. > *: mailto:livinsb@rbos.co.uk > *: +44 0131 523 4354 [x24354] > > Networking Technical Associates, > Glasgow, Scotland. 
> *: mailto:ntw_uk@hotmail.com > *: +44 07771-957-280
-------------- next part -------------- A non-text attachment was scrubbed... Name: ExpView.xsl Type: application/octet-stream Size: 1344 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990218/3bc08fc8/ExpView.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: skills.css Type: application/octet-stream Size: 25 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990218/3bc08fc8/skills.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: skills.xml Type: application/octet-stream Size: 3763 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990218/3bc08fc8/skills-0001.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: skills.xsl Type: application/octet-stream Size: 2074 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990218/3bc08fc8/skills-0002.obj -------------- next part -------------- A non-text attachment was scrubbed... 
Name: skillsView.xsl Type: application/octet-stream Size: 981 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990218/3bc08fc8/skillsView.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: skills.asp Type: application/octet-stream Size: 2056 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990218/3bc08fc8/skills-0003.obj
From shecter at darmstadt.gmd.de Thu Feb 18 12:00:53 1999
From: shecter at darmstadt.gmd.de (Robb Shecter)
Date: Mon Jun 7 17:09:11 2004
Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd)
References:
Message-ID: <36CC00AD.BFF65490@darmstadt.gmd.de>

Dan Brickley wrote: > On Wed, 17 Feb 1999, I wrote: > > > ("Don't assume there's anything at this 'not-URL' "). From a software engineering point of > view... > > (I'd argue that software engineers who write code that assumes all URLs > can be unproblematically de-referenced are asking for trouble. but > that's besides the point) > Yes, but I'm not assuming that the users / app. programmers / clients of these standards are "software engineers". I'm assuming that we are, though. :) I think that the truly successful and universal systems are those that can be used by biologists, teachers, linguists, etc. Mathematically or abstractly, HTML, Perl and CGI are not beautiful creations, but they are understandable and consistent in the right ways. And they are -still-, regretfully, the standard for www applications and documents. I guess that a basic problem I have is that seeing something like: http://a.b.c/d/e that doesn't reference a retrievable resource is misleading and confusing because of the "http://". Why are we saying how to retrieve this thing that cannot be retrieved? Millions of people in the world now understand that this prefix specifies how to get something. Now we want to say that that's not quite true; it's an "abstraction". In some cases. 
I don't think that will work. I don't think people will get it, or have time or want to get it. To me this is a design question: Buttons on UI's should look "actionable" and text labels should not. Doors in buildings should not have handles on the inside when people are required to push to open them. etc. > > - Instead of saying, "Watch out for the problem here...", we should not create the problem in > > the first place. > > We are not creating a problem. It is fine to use a URL to identify a SAX > property, but by choosing to allow _all_ forms of URI we leave room for > other approaches. This echoes the approach taken by XML namespaces. > Yes, and I think that namespaces is confusing. Maybe I haven't studied it rigorously enough. And, after reading about Dublin Core, and its use of a trailing "#" on the namespace value, I'm confused more. Especially seeing at least one site that doesn't do this. ( http://dmoz.org/rdf.html ) > > I think that the Java standard is very good. I don't think that it's unfriendly towards > > non-Java implementations: it is after all, only a standard, and not hardcoded into the > > language. > > I disagree. The string 'util.tools.png' is as meaningless in the Java > community as in the wider world... I may or may not agree, but I don't see how the Java standard is unfriendly towards other languages. For example, there's no special language feature for parsing strings with dots as tokens. - Robb xml-dev: A list for W3C XML Developers. 
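One way to look at the "http://" worry and the trailing "#" puzzle is to watch what a namespace-aware parser actually does with a namespace name. The sketch below uses Python's standard-library ElementTree, which is not a tool mentioned in this thread, and an invented namespace name under example.invalid. It is meant to suggest two things: the parser treats the name as an opaque string and never tries to fetch it, and a vocabulary that builds term identifiers by concatenating namespace name and local name (as RDF-style vocabularies do) needs the "#" or "/" at the end of the namespace name to get a sensible result.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name: nothing is published at this address.
NS = "http://example.invalid/terms#"

doc = f'<t:record xmlns:t="{NS}"><t:title>A sample</t:title></t:record>'

# Parsing needs no network access: the namespace name is only ever compared
# as a string, never dereferenced.
root = ET.fromstring(doc)

# ElementTree writes a qualified name as "{namespace-name}local-name".
namespace, local = root.tag[1:].split("}")

# Concatenating the two parts gives a single identifier for the term.
term_id = namespace + local
print(term_id)
```

Two namespace names that differ only in the trailing "#" are different strings, so documents using them would not be treated as using the same vocabulary.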
From Mark.Birbeck at iedigital.net Thu Feb 18 12:03:36 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:09:11 2004
Subject: CV/Resume
Message-ID:

Matthew Sergeant wrote: > A couple of things to be aware of are that US Resume's can be quite > different to UK ones. I found this out when designing my DTD. > If you want my > DTD (it's publicly available under the Artistic licence - so > feel free to > modify it and send back changes) mail me at > msergeant@ndirect.co.uk as it's > at home and I'm at work ATM. Would definitely like a look. > One thing I don't have is any sort of style sheet built yet, > but I do have > in progress a CGI (well, mod_perl really) script for editing > CV's online. We have a general tool that we use for our clients. It uses JavaScript to traverse a tree, and at each node you can add more nodes, edit the current one or delete it. Each node is in the database and is exported as XML, so as far as data entry goes that's covered. > Unfortunately I don't have a scriptable web site so I can't > demo it. When > it's complete it will be free though. I'm planning the same (free, that is). I also thought a cute little side effect would be to use the vCard standard, so that we could all pass our business cards to each other by putting a URL with XSL-like syntax, like: http://www.somevcardlikename.com/vcard[id=32] on the end of our emails. So rather than passing your actual details, you pass a pointer to your details. Any comments? (And *another* cute spin-off is that people could therefore create their own namespace URI, as discussed in some previous threads.) 
Anyway, the aim is still to produce a site that 'runs itself' so if you have any other features it should have, feel free to mention them. Regards, Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From RDaniel at DATAFUSION.net Thu Feb 18 16:59:12 1999 From: RDaniel at DATAFUSION.net (Ron Daniel) Date: Mon Jun 7 17:09:11 2004 Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fw d) Message-ID: <0D611E39F997D0119F9100A0C931315C2E0FA1@datafusionnt1> David is mostly correct when he says that there are no standards for making URNs from ISBNs, UUIDs, etc. However, there is actually some pretty good indication of what those standards will look like. ISBNS (and a couple of other bibliographic identifiers) as URNs are described in the RFC 2288, "Using Existing Bibliographic Identifiers as Uniform Resource Names" http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2288.txt Caveats: that is an informational RFC, not a standards-track one, although there is a desire to standardize the URN encoding of such identifiers. The question surrounding the ISBNs is whether the namespace should be 'isbn' or if ISO might want to use something else. (Truth in advertising disclosure - I'm a co-author of that document). Of more interest to the discussion in this group is the prospect of using UUIDs (or GUIDs) as identifiers. 
(UUIDs and GUIDs are described in the now-expired Internet-draft http://search.ietf.org/internet-drafts/draft-leach-uuids-guids-01.txt ) A uuid: URI scheme is proposed in another expired draft: http://search.ietf.org/internet-drafts/draft-kindel-uuid-uri-00.txt Although I can't turn it up right now, I've seen a fair number of urn:guid:ugly-guid-string-here style identifiers. Those seem unlikely to change. The real bottleneck in URN schemes is in the registry for URN namespaces. I've not been keeping up with that, but the last time I checked they were specifying the info that had to be provided when registering a namespace. The assumption was that IANA would run the registry similar to the media types registry. Regards, Ron Daniel > -----Original Message----- > From: David Megginson [SMTP:david@megginson.com] > Sent: Wednesday, February 17, 1999 1:44 PM > To: XML Dev > Subject: Re: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) > Proposal) (fwd) > > John Cowan writes: > > > > urn:inet:megginson.com:ns:foo > > > > > > cannot (because there's no standard for what 'inet' means). > > > > Cannot and should not be, because there's no assurance that > > someone not David may not have megginson.com in the future, > > and might make conflicting URN assignments. > > > > Spam.org has had four owners that I know of. > > But there's also no standard for building URNs on uuids, ISBNs, or > anything else -- there's just no standard, period. > > > All the best, > > > David > > -- > David Megginson david@megginson.com > http://www.megginson.com/ > > xml-dev: A list for W3C XML Developers. 
From tma at chiron.pt Thu Feb 18 17:16:51 1999
From: tma at chiron.pt (Tiago Moitinho de Almeida)
Date: Mon Jun 7 17:09:11 2004
Subject: XML DTD vs. XML-Data
Message-ID: <002601be5b62$014786e0$a64517c3@chiron.pt>

I am developing an XML-based application for editing and validating formal metadata documents. The metadata documents follow a structure which is represented by a DTD. The application is DTD-driven: it should work with any valid DTD, so it does not depend on a specific DTD.

1. Two important issues about the validation are:
- data type checking (ex: the content of some tags should have a valid date format)
- validity domains (ex: for the content of some elements I should only accept "blue", "red" or "green").

As far as I understand, XML-Data provides a solution for the 1st issue, and I can work around it in a DTD using a #FIXED attribute for those element declarations. However, I don't see how XML-Data or an XML DTD provides the information needed for the 2nd issue.

2. Two questions about XML DTDs and XML-Data:
- Can I use a DTD to do everything that XML-Data does?
- Using XML-Data, can I somehow get rid of XML DTDs?

Thanks. 
Tiago Moitinho, Lisbon, Portugal tma@chiron.pt
From jmg at trivida.com Thu Feb 18 17:55:50 1999
From: jmg at trivida.com (Jeff Greif)
Date: Mon Jun 7 17:09:11 2004
Subject: XML DTD vs. XML-Data
Message-ID: <010d01be5b67$a3eb52a0$a24630d1@greif.trivida.com>

1. You can do enumeration-based validation using attributes in your DTD, e.g. by declaring an attribute with an enumerated type. You might be able to use one of the schema proposals (XSchema, SOX, DCD, which probably supersedes XML-Data), but from my limited skimming of their current drafts I think they're all weak in this area. The RDF proposal certainly has the power to handle all kinds of data typing, but RDF is pretty complicated.

2. Different schema proposals had different goals with respect to convertibility back and forth to DTDs. Ron Bourret's excellent slide show (with extensive annotations) discusses these issues. See the presentation at: HTML: http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/index.htm PowerPoint: http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/XMLSchemas.ppt Jeff

-----Original Message----- From: Tiago Moitinho de Almeida To: XML Dev Date: Thursday, February 18, 1999 9:22 AM Subject: XML DTD vs. XML-Data ... >1. >Two important issues about the validation are: >- data type checking (ex: the content of tags should have a valid date >format) >- validity domains (ex: for the content of I should only >accept "blue", "red" or "green"). 
> >As far as I understand, XML-Data provides solution for the 1st issue, and I can >work around it in a DTD using a #FIXED attribute for those element >declarations. > >However, I don't see how XML-data or XML DTD provide information to meet the 2nd >issue. > > >2. >2 questions about XML DTD's and XML-Data: >- Can I use DTD to do everything that XML-Data does? >- Using XML-Data, can I somehow get rid of XML DTD's? >
From donpark at quake.net Thu Feb 18 20:09:07 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:11 2004
Subject: URIs as IDs in ModSAX (was Re: ModSAX (SAX 1.1) Proposal) (fwd)
Message-ID: <008601be5b7a$4e264060$2ee044c6@arcot-main>

Whether my opinion is based on ignorance or optimism, I have no problem with using URIs for feature names. As long as we make some effort to protect ourselves, I think URIs will serve our needs. Common feature names can be maintained at one site, along with some reasonable rules about how to ensure uniqueness. As far as standards for URIs go, it is my belief that if there is no standard then there is no great need for the standard. Please don't forget to discount by 25% to account for my usual rate of exaggeration. Let's move on. Best, Don Park Docuverse xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Thu Feb 18 21:08:15 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:09:11 2004 Subject: Office 2000 and XML Message-ID: <36CC7F71.C5544218@prescod.net> Can anyone point me to a Microsoft statement about the level of support that Office 2000 will have for XML? I think I have a good idea of what it will support (based on betas), but I would like to have something official to point to. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Intel has a big bull's-eye on its forehead because everyone is gunning for it. But they have to be as nimble and aggressive as they were when they were a small company," Mr. Howe said. "It's easy to be nimble when you're a $5-billion company, it's a whole other thing when you're a $40-billion company." - http://www.globeandmail.ca/gam/ROB/19990217/RPENT.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jmcdonou at library.berkeley.edu Thu Feb 18 22:40:15 1999 From: jmcdonou at library.berkeley.edu (Jerome McDonough) Date: Mon Jun 7 17:09:11 2004 Subject: Office 2000 and XML In-Reply-To: <36CC7F71.C5544218@prescod.net> Message-ID: <3.0.5.32.19990218143430.00c657f0@library.berkeley.edu> At 03:00 PM 2/18/1999 -0600, Paul Prescod wrote: >Can anyone point me to a Microsoft statement about the level of support >that Office 2000 will have for XML? I think I have a good idea of what it >will support (based on betas), but I would like to have something official >to point to. >-- There's a Microsoft white paper on Office 2000 and HTML which includes a discussion of their use of XML in Office 2000. It's available at http://www.microsoft.com/office/2000/Office/Documents/applhtml.doc. Jerome McDonough -- jmcdonou@library.Berkeley.EDU | (......) Library Systems Office, 386 Doe, U.C. Berkeley | \ * * / Berkeley, CA 94720-6000 (510) 642-5168 | \ <> / "Well, it looks easy enough...." | \ -- / SGNORMPF!!! -- From the Famous Last Words file | |||| xml-dev: A list for W3C XML Developers. 
From clark.evans at manhattanproject.com Thu Feb 18 23:55:28 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:09:11 2004 Subject: XML Mail References: <3.0.6.32.19990204061222.00f1de60@scripting.com> <3.0.6.32.19990204172738.00bfbec0@daemsg01> <4EB4181A.99BB7C0@darmstadt.gmd.de> Message-ID: <36CCA778.3C6C6955@manhattanproject.com> Are there any parsers that will accept 'mail' and generate corresponding XML? I would imagine putting this in /etc/aliases : xml-files: "!/opt/xmlmail/bin/rewrite ... " The rewrite program would: 0) Transform reserved characters > < & into &gt; &lt; &amp; and handle other trivial conversions such as this. 1) Transform the 'headers' into XML structure. 2) Leave valid XML/HTML alone if possible. 3) Add a and for non-xml non-html mail. etc. Anything like this monster? Thanks tons! Clark From david at megginson.com Fri Feb 19 01:07:48 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:11 2004 Subject: XML Mail In-Reply-To: <36CCA778.3C6C6955@manhattanproject.com> References: <3.0.6.32.19990204061222.00f1de60@scripting.com> <3.0.6.32.19990204172738.00bfbec0@daemsg01> <4EB4181A.99BB7C0@darmstadt.gmd.de> <36CCA778.3C6C6955@manhattanproject.com> Message-ID: <14028.47174.186916.359779@localhost.localdomain> Clark Evans writes: > Are there any parsers that will accept 'mail' and > generate corresponding XML? > The rewrite program would: > > 0) Transform reserved characters > < & into > &gt; &lt; &amp; and handle other trivial > conversions such as this. Of course. > 1) Transform the 'headers' into XML structure. > sender="clark.evans@manhattanproject.com" > to.1="xml-dev@ic.ac.uk" > etc. > > Sounds reasonable. > 2) Leave valid XML/HTML alone if possible. Wrong -- or, to put it differently, it should leave content with text/html and text/xml alone, but it should not try to recognise markup in text/plain. > 3) Add a and for non-xml > non-html mail. I don't think that adding ... is a good idea -- if the body of the message is text/plain, then it should be treated as a blob of text. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From david at megginson.com Fri Feb 19 01:13:47 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:11 2004 Subject: Announcement: XML Infoset Requirements Published Message-ID: <14028.47599.269952.835288@localhost.localdomain> I am happy to announce that the W3C's XML Information Set Working Group has published its Requirements and Design Principles document at the following location: http://www.w3.org/TR/NOTE-xml-infoset-req As specified in the document, please send comments to www-xml-infoset-comments@w3.org, which is a publicly-archived mailing list. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/
From tbray at textuality.com Fri Feb 19 01:26:52 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:11 2004 Subject: A question for developers Message-ID: <3.0.32.19990218172639.00a535a0@pop.intergate.bc.ca> In section 1.2, the XML spec says Characters with multiple possible representations in ISO/IEC 10646 (e.g. characters with both precomposed and base+diacritic forms) match only if they have the same representation in both strings. At user option, processors may normalize such characters to some canonical form. Does any real XML software actually do such normalization? Thanks in advance. -Tim From clark.evans at manhattanproject.com Fri Feb 19 02:13:50 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:09:11 2004 Subject: XML Mail References: <3.0.6.32.19990204061222.00f1de60@scripting.com> <3.0.6.32.19990204172738.00bfbec0@daemsg01> <4EB4181A.99BB7C0@darmstadt.gmd.de> <36CCA778.3C6C6955@manhattanproject.com> <36CCBDEC.4FA3E64A@binevolve.com> Message-ID: <36CCC7EB.974AA70F@manhattanproject.com> David Megginson wrote: > > 2) Leave valid XML/HTML alone if possible. 
> > Wrong -- or, to put it differently, it should leave content with > text/html and text/xml alone, but it should not try to recognise > markup in text/plain. In theory I agree; however, practical circumstances dictate otherwise. :) I have been showing associates (end users) how to mark up their e-mail using XML. I hope to classify e-mail according to the markup contained within, and use it to update a database with such information. Furthermore, I have found so far that my end users are not "resisting" XML as much as I would have thought. As long as I stay practical, and explain clearly that the tags are used to tell the computer program how to deal with their information, everything goes well. Thus, I intend to use the e-mail to populate a database, based upon its contents. Anyway, I would like to proceed with this experiment to see how it works in reality. Eventually I picture a structure-aware e-mailer that replaces traditional forms-based processing with a more stream-oriented approach. The editor would let users validate their mail against multiple DTDs before they send it. I see this as a logical evolution from centralized forms-based processing to a more flexible, distributed information system. For now, I can tell my end users not to use < > and & unless they are doing markup, but my biggest problem so far is the darn e-mailers which use > to mark e-mail that is forwarded... I'd love to change the character to | Anyway, hope this makes sense. Comments? > > 3) Add a and for non-xml > > non-html mail. > > I don't think that adding ... is a good idea -- if the body of > the message is text/plain, then it should be treated as a blob of > text. Perhaps you are right.. although I like what Parand did: Parand Tony Darugar wrote: > > The rewrite program would: > > > > 0) Transform reserved characters > < & into > > &gt; &lt; &amp; and handle other trivial > > conversions such as this. > > Does this. For the main message, I wrap it in a > big CDATA[] and leave the reserved stuff unchanged. > > > 1) Transform the 'headers' into XML structure. > > Does this also. See example output below. > > > > > > Hope everyone likes it. We had to make it under 10k which restricted us > quite a bit. > ]]> > > ------------------------------------------------------------ This is very nice. > I promised someone I'd package it and put it somewhere > on the web months ago, but I've procrastinated. I guess I > should really do that. Wonderful. Please do. Best, Clark From clark.evans at manhattanproject.com Fri Feb 19 03:13:05 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:09:11 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: <36CCD5C1.33353EA6@manhattanproject.com> > Editors: David Megginson(david@megginson.com) Thank you David. It looks like you did a great job making the document clear. I only have one (long) comment that I have made several times on xml-dev. Sorry for those who have heard my soap box before. :) Clark Evans > Abstract: > > This document lists the design principles and requirements for > the XML Information Set, a meta-model for XML documents Stating that XML is a *document* standard effectively precludes this XML standard from a very useful application as stream markup. This is a very subtle bias which, in my opinion, will severely damage the XML standard if it continues. Now.. I *do* like the word "infoset" because it is much more illustrative of what I see the goal as being: A way to represent document subsets or stream fragments. This view of an "infoset" will reap huge rewards, because it removes the implicit prerequisite that the entire document be present before the infoset has value. Suggestion: Strike "document" and replace with "document subset". 
Where "subset" includes not only proper subsets, but also the possibility of the infoset representing the entire document. In this way, you reserve the much more powerful ability to focus on the XML as a stream and break it into manageable chunks based upon the needs of the processing tool. Picture a stack-based mechanism, where the "smallest" document fragment which can satisfy the needs of the transformation or process in question is kept in a multi-pass storage, allowing the remainder of the information to be handled using a single-pass mechanism. It is the balance between the two styles that generates power. Too much one way or the other will lead to inefficient systems. > The XML Information Set will be purely descriptive: it will > identify a common set of abstract XML information without > mandating a single type of processing behaviour or a specific > API for XML-based software. Good. So it will *not* require the entire document to be available in a multi-pass storage mechanism? > 2. Design Principles > > 1.The XML Information Set shall provide an abstract model > for describing the logical structure of a well-formed XML > 1.0 document (note that all valid XML 1.0 documents are > also, by definition, well-formed). Does this provide an abstract model for describing the logical structure of a well-formed document SUBSET ? > 5.The XML Information Set shall be designed to be > interoperable with the W3C's DOM Level 1 > Recommendation [DOM] and, as far as possible, with the > XPointer Working Draft [XPointer], and with the XSL > Working Draft [XSL]. Why not SAX? Clearly the event-driven nature of an XML stream is important. Will the standard support "push" event-driven systems as well as "pull" object-oriented systems? > 3.The XML Information Set shall contain sufficient > information for the creation of a well-formed XML > document. Or stream fragment? or sub-document? 
I'm only harping because not recognizing the other way to do things will severely limit the usefulness of the resulting product. > 4.The XML Information Set shall contain sufficient > information to define equivalence for XML documents > based on their logical structure. This I look forward to seeing. Isomorphism could be very powerful. I hope that a multi-pass mechanism is not required for this feature. Or, if it is, that the multi-pass requirement is limited to certain cases of isomorphic forms. > > 4. References > > DOM > W3C (World Wide Web Consortium). Document Object > Model (DOM) Level 1 Specification Recommendation. > Version 1.0. [Cambridge, MA]. > http://www.w3.org/TR/REC-DOM-Level-1 > XML > W3C (World Wide Web Consortium). Extensible Markup > Language (XML) Recommendation. Version 1.0. > [Cambridge, MA]. http://www.w3.org/TR/REC-xml > XSL > W3C (World Wide Web Consortium). Extensible Stylesheet > Language (XSL) Working Draft. Version 1.0. [Cambridge, > MA]. http://www.w3.org/TR/WD-xsl > XPointer > W3C (World Wide Web Consortium). XML Pointer > Language (XPointer) Working Draft. [Cambridge, MA]. > http://www.w3.org/TR/WD-xptr You forgot SAX and SAXON. I find it rude not to take into account this non-W3C standard. It severely undervalues David's huge contribution to the XML community. Clark Evans
From mrc at allette.com.au Fri Feb 19 04:13:52 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:09:11 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <36CCD5C1.33353EA6@manhattanproject.com> Message-ID: <36CCE4D6.BAD5A8F5@allette.com.au> Clark Evans wrote: > > This document lists the design principles and requirements for > > the XML Information Set, a meta-model for XML documents > > Stating that XML is a *document* standard effectively > precludes this XML standard from a very useful > application as stream markup. This is a very subtle > bias which, in my opinion, will severely damage the XML > standard if it continues. I think you might be applying a meaning to that phrase that it doesn't deserve - it doesn't call XML a document standard, it uses the term "XML document", with document defined in the XML recommendation as: "A data object is an XML document if it is well-formed, as defined in this specification. A well-formed XML document may in addition be valid if it meets certain further constraints." This allows you to use the phrases "XML data object" and an "XML document" interchangeably. This isn't incongruous with stream markup - you just need to consider the stream as an XML document. :-) Seriously though, you probably wouldn't have the same concerns about "XML data object" - if you prefer that term, you should use it. Really, this is more a case of becoming comfortable with the evolving notion (and convenient terminology) of what a document is. 
-- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein From martind at netfolder.com Fri Feb 19 04:21:51 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:09:12 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <36CCE4D6.BAD5A8F5@allette.com.au> Message-ID: Hi, I am late in the process (just new in the list) but where can I find the document you are talking about. thanks Didier PH Martin mailto:martind@netfolder.com http://www.netfolder.com From mrc at allette.com.au Fri Feb 19 04:28:30 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:09:12 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: Message-ID: <36CCE843.8B0F29AF@allette.com.au> Didier PH Martin wrote: > I am late in the process (just new in the list) but where can I find the > document you are talking about. http://www.w3.org/TR/NOTE-xml-infoset-req -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein From jborden at mediaone.net Fri Feb 19 04:59:31 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:09:12 2004 Subject: FW: XML Mail Message-ID: <000101be5bc3$e9e1c3f0$d3228018@jabr.ne.mediaone.net> Clark Evans wrote: > > > Are there any parsers that will accept 'mail' and > generate corresponding XML? XMTP is an XML <-> MIME mapping which transforms SMTP messages into XML. 
The mapping is described at http://jabr.ne.mediaone.net/documents/xmtp.htm If you send a message to test-xmtp@jabr.ne.mediaone.net it will autorespond, inserting the XMLized message (headers and body) in the response body. This works with binary attachments as well. If you send a message to XMTP-BOARD@jabr.ne.mediaone.net you can see it via XML/XSL by browsing: http://jabr.ne.mediaone.net/xmtp/listxmtp.asp?User=xmtp-board (this uses IE5b2 XSL). Valid HTML can't be left alone because it is not required to be well-formed XML; hence HTML is escaped. XML is left alone if the content-type is text/xml. Enjoy, Jonathan Borden http://jabr.ne.mediaone.net > > I would imagine putting this in /etc/aliases : > > xml-files: "!/opt/xmlmail/bin/rewrite ... " > > > The rewrite program would: > > 0) Transform reserved characters > < & into > &gt; &lt; &amp; and handle other trivial > conversions such as this. > > 1) Transform the 'headers' into XML structure. > sender="clark.evans@manhattanproject.com" > to.1="xml-dev@ic.ac.uk" > etc. > > > > 2) Leave valid XML/HTML alone if possible. > > 3) Add a and for non-xml > non-html mail. > > etc. > > Anything like this monster? > > Thanks tons! > > Clark > From clark.evans at manhattanproject.com Fri Feb 19 05:13:11 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:09:12 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <36CCD5C1.33353EA6@manhattanproject.com> <36CCE4D6.BAD5A8F5@allette.com.au> Message-ID: <36CCF1EB.CD64B705@manhattanproject.com> Marcus Carr wrote: > I think you might be applying a meaning to that > phrase that it doesn't deserve - it doesn't call XML > a document standard, it uses the term "XML document", > with document defined in the XML recommendation as: > > "A data object is an XML document if it is well-formed, > as defined in this specification. A well-formed XML document > may in addition be valid if it meets certain further constraints." > > This allows you to use the phrases "XML data object" and > an "XML document" interchangeably. > > This isn't incongruous with stream markup - you just need to > consider the stream as an XML document. Seriously though, you > probably wouldn't have the same concerns about "XML data object" ... My concerns would be even greater. This conjures up in my mind a Java or C++ object where the complete stream has to be loaded in memory (or some other random-access medium) before it can be used. Yes, I know you can have a multi-threaded implementation so that you can start using the data object before it finishes reading, etc. 
However, given the object model it is *reasonable* for the naive programmer to ask for something at the _end_ of the stream. This will cause the call to block until the stream ends. If "data stream" processing was treated with *equal* importance by the W3C committees, then they would see, in many cases, that this complementary approach is at least as good as, or in some cases far superior to, a "data object" approach. Constantly viewing XML as a standard for the description of "data objects" and not "data streams" is a subtle and important bias. It is taking object-orientation too far and discarding parallel stream processing and its related technologies like SAX and SAXON. :) Clark From mrc at allette.com.au Fri Feb 19 05:55:27 1999 From: mrc at allette.com.au (Marcus Carr) Date: Mon Jun 7 17:09:12 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <36CCD5C1.33353EA6@manhattanproject.com> <36CCE4D6.BAD5A8F5@allette.com.au> <36CCF1EB.CD64B705@manhattanproject.com> Message-ID: <36CCFCB0.17808188@allette.com.au> Clark Evans wrote: > My concerns would be even greater. This conjures up in my > mind a Java or C++ object where the complete stream > has to be loaded in memory (or some other random-access > medium) before it can be used. Parser theory isn't my bag so I may be wildly off base, but aren't many XML processors going to require, say, a well-formedness check of the whole document before allowing a call? 
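A minimal sketch bearing on that question, in Python rather than anything used in this thread: an event-driven (SAX-style) parser can hand over events for the start of a stream before it has seen the end, and it reports a well-formedness error only when it reaches the offending point. The names `Recorder` and `feed_chunks` are illustrative assumptions, and the string chunks stand in for data arriving over a connection.

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Collects element start/end events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


def feed_chunks(chunks):
    """Feed text chunks to an incremental SAX parser (hypothetical helper).

    Returns the events seen so far and the parse error, if any.
    """
    handler = Recorder()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    error = None
    try:
        for chunk in chunks:
            parser.feed(chunk)   # events for complete tokens reach the handler
        parser.close()           # signals end of input
    except xml.sax.SAXParseException as exc:
        error = exc              # raised where the stream stops being well-formed
    return handler.events, error


if __name__ == "__main__":
    # The second chunk closes <mail> while <body> is still open.
    events, error = feed_chunks(["<mail><from>clark</from>", "<body>text</mail>"])
    print(events)
    print("error:", error)
```

In this sketch the handler has already received the events for the well-formed prefix by the time the mismatched end-tag is reported, so an application built this way has to decide for itself what to do with work done before the error.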
> If "data stream" processing was treated with *equal* > importance by the W3C committees, then they would see, > in many cases, that this complementary approach is at > least as good as, or in some cases far superior to, > a "data object" approach. You would almost certainly know better than I, but it seems that the phrase "XML document" can still be appropriate for both streams and objects, as long as you consider it to be appropriate. (Otherwise, it is totally inappropriate.) Calling an "XML document" a "data stream" doesn't affect the efficiency with which you can access it. Also, your distinction may well be valid, but I don't believe it would be practical to employ terminology that differs philosophically from that in the XML recommendation. -- Regards, Marcus Carr email: mrc@allette.com.au ___________________________________________________________________ Allette Systems (Australia) www: http://www.allette.com.au ___________________________________________________________________ "Everything should be made as simple as possible, but not simpler." - Einstein From cowan at locke.ccil.org Fri Feb 19 07:11:46 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:12 2004 Subject: FW: XML Mail In-Reply-To: <000101be5bc3$e9e1c3f0$d3228018@jabr.ne.mediaone.net> from "Borden, Jonathan" at Feb 18, 99 11:54:21 pm Message-ID: <199902190805.DAA22416@locke.ccil.org> Borden, Jonathan scripsit: > XMTP is an XML <-> MIME mapping which transforms SMTP messages into XML. 
> The mapping is described at http://jabr.ne.mediaone.net/documents/xmtp.htm Very interesting stuff. However, the DTD will not do, and indeed this is an application that *shouldn't* really have a DTD, IMHO. Your DTD treats "ANY" as a general-purpose wildcard in content models, for example: . That won't work. ANY is treated as a wildcard only when it forms the entire content model, e.g. . The use you make of it is syntactically correct, but ANY will be interpreted as an element name, not what you want! In any event, your application is not suitable for a DTD, for this reason: Your converter generates an element type for every different kind of SMTP header line. Since the set of header lines is unbounded, so is the set of element types, and DTDs cannot cope with that situation even if ANY is used, because ANY only allows any *declared* element type to appear in the content model. So don't use a DTD and stick to a prose explanation. > Valid HTML can't be left alone because this is not required to be well > formed XML hence HTML is escaped. XML is left alone if the > content-type=text/xml A very sound point. However, it might be neater to use a CDATA section instead of methodical &amp; and &lt; escaping, as someone else mentioned. There is certainly no reason to do &apos; and &quot; escaping in character data: that is only necessary in attributes, which you don't use. -- John Cowan cowan@ccil.org e'osai ko sarji la lojban.
From jborden at mediaone.net Fri Feb 19 12:05:11 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:09:12 2004 Subject: FW: XML Mail In-Reply-To: <199902190805.DAA22416@locke.ccil.org> Message-ID: <000301be5bff$5e9ba7c0$d3228018@jabr.ne.mediaone.net> John Cowan wrote: > > > Borden, Jonathan scripsit: > > > XMTP is an XML <-> MIME mapping which transforms SMTP > messages into XML. > > The mapping is described at > http://jabr.ne.mediaone.net/documents/xmtp.htm > > Very interesting stuff. However, the DTD will not do, and indeed this > is an application that *shouldn't* really have a DTD, IMHO. > > Your DTD treats "ANY" as a general-purpose wildcard in content models, > for example: . That won't work. > ANY is treated as a wildcard only when it forms the entire content model, > e.g. . The use you make of it is syntactically > correct, but ANY will be interpreted as an element name, not what you > want! The DTD is now history. There is also no reason not to CDATA all non-XML bodies instead of escaping. Thanks for the feedback. Jonathan Borden http://jabr.ne.mediaone.net
From david at megginson.com Fri Feb 19 13:30:49 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:12 2004 Subject: FW: XML Mail In-Reply-To: <000301be5bff$5e9ba7c0$d3228018@jabr.ne.mediaone.net> References: <199902190805.DAA22416@locke.ccil.org> <000301be5bff$5e9ba7c0$d3228018@jabr.ne.mediaone.net> Message-ID: <14029.26279.282943.89816@localhost.localdomain> Borden, Jonathan writes: > The DTD is now history. There is also no reason not to CDATA all > non-XML bodies instead of escaping. What if the non-XML body contains the sequence "]]>"? All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From Matthew.Sergeant at eml.ericsson.se Fri Feb 19 13:51:23 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:09:12 2004 Subject: FW: XML Mail Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1621@eukbant101.ericsson.se> > -----Original Message----- > From: David Megginson [SMTP:david@megginson.com] > > Borden, Jonathan writes: > > > The DTD is now history. There is also no reason not to CDATA all > > non-XML bodies instead of escaping. > > What if the non-XML body contains the sequence "]]>"? 
> > $body =~ s/\]\]>/]]>]]>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Feb 19 18:10:46 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:12 2004 Subject: FW: XML Mail References: <199902190805.DAA22416@locke.ccil.org> <000301be5bff$5e9ba7c0$d3228018@jabr.ne.mediaone.net> <14029.26279.282943.89816@localhost.localdomain> Message-ID: <36CDA90F.4D363359@locke.ccil.org> David Megginson wrote: > What if the non-XML body contains the sequence "]]>"? Map it to "]]]]>". Ugly but it works. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ken at bitsko.slc.ut.us Fri Feb 19 18:23:15 1999 From: ken at bitsko.slc.ut.us (Ken MacLeod) Date: Mon Jun 7 17:09:12 2004 Subject: Announcement: XML Infoset Requirements Published In-Reply-To: David Megginson's message of Thu, 18 Feb 1999 20:12:17 -0500 (EST) References: <14028.47599.269952.835288@localhost.localdomain> Message-ID: David Megginson writes: > I am happy to announce that the W3C's XML Information Set Working > Group has published its Requirements and Design Principles document at > the following location: > > http://www.w3.org/TR/NOTE-xml-infoset-req > > As specified in the document, please send comments to > www-xml-infoset-comments@w3.org, which is a publicly-archived mailing > list. It seems that ``Information Sets'' perform a similar role to SGML and HyTime's ``Property Sets''? Since I'm sure the WG is aware of that, is there an intentional distinction being made by using a different name? or is it just a more generic term? -- Ken MacLeod ken@bitsko.slc.ut.us xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rhanson at blast.net Fri Feb 19 18:26:06 1999 From: rhanson at blast.net (Robert Hanson) Date: Mon Jun 7 17:09:12 2004 Subject: FW: XML Mail Message-ID: <00f301be5c35$20627ca0$0fb919ce@Bertha> Am I wrong in saying that the answer would be that the symbol ">" should not show up in the non-XML text anyway? Wouldn't it be converted to an entity reference? Like this... "]]>" in the text body would become "]]>" Robert Hanson ----- Original Message ----- From: John Cowan To: XML Dev Sent: Friday, February 19, 1999 1:10 PM Subject: Re: FW: XML Mail >David Megginson wrote: > >> What if the non-XML body contains the sequence "]]>"? > >Map it to "]]]]>". Ugly but it works. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Fri Feb 19 18:44:07 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:09:12 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <36CCF1EB.CD64B705@manhattanproject.com> Message-ID: <000201be5c37$806df8c0$5118a8c0@kuantech1.quokka.com> I completely agree with Clark. As someone working with real-time XML streams, I think this is very important. 
In particular, the whole notion of "document" needs to be thought through very carefully in the context of 1999, rather than the context of 1990 when SGML was developed. If I may grow philosophical for a moment, I believe that XML is at a crossroads. That crossroads can be defined by examining the term "markup". I believe that XML is actually moving away from being "markup" oriented. First of all, one can easily imagine an XML document where all leaf-level elements are EMPTY, and contain all their semantics within attributes. In that case, there is nothing to be "marked up". Furthermore, when you apply XML to things like database record interchange, it really isn't a text-oriented environment anymore. I believe that XML points more towards type systems than markup. If you look at a programming language, it generally supports 2 things (I am being very poetic and not rigorous here): defining and instantiating data types, and defining and instantiating operations on data. XML supports the first. It provides a mechanism to create and exchange instances of data types between external systems that will provide the operations on those data. The realization that DTDs are inadequate, and that a more robust schema specification language is needed, points in the same direction. If you approach XML as a type system, the concept of document loses its first-class status (or at least should, in my opinion). It is interesting that the concept of document (even physical document as file) has crept into programming languages, and has caused problems there as well. The C language include directive is a physical rather than a logical mechanism. When you try to build a database-driven incremental build system, includes become problematic. I would like to encourage the XML community to 1) pay attention to the lessons of 30 years of development in the arena of programming and type languages, and 2) not get bogged down by the historical baggage of the M in XML.
Jeff Sussna -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Clark Evans Sent: Thursday, February 18, 1999 9:09 PM To: Marcus Carr Cc: xml-dev@ic.ac.uk Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999 Marcus Carr wrote: > I think you might be applying a meaning to that > phrase that it doesn't deserve - it doesn't call XML > a document standard, it uses the term "XML document", > with document defined in the XML recommendation as: > > "A data object is an XML document if it is well-formed, > as defined in this specification. A well-formed XML document > may in addition be valid if it meets certain further constraints." > > This allows you to use the phrases "XML data object" and > an "XML document" interchangeably. > > This isn't incongruous with stream markup - you just need to > consider the stream as an XML document. Seriously though, you > probably wouldn't have the same concerns about "XML data object" ... My concerns would be even greater. This conjures up in my mind a Java or C++ object where the complete stream has to be loaded in memory (or some other random-access medium) before it can be used. Yes, I know you can have a multi-threaded implementation so that you can start using the data object before it finishes reading, etc. However, given the object model it is *reasonable* for the naive programmer to ask for something at the _end_ of the stream. This will cause the call to block until the stream ends. If "data stream" processing was treated with *equal* importance by the W3C committees, then they would see, in many cases, that this complementary approach is at least as good as, or in some cases far superior to, a "data object" approach. Constantly viewing XML as a standard for the description of "data objects" and not "data streams" is a subtle, and important bias.
It is taking object-orientation too far and discarding parallel stream processing, and its related technologies like SAX and SAXON. :) Clark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dcbiker at goldray.com Fri Feb 19 19:04:19 1999 From: dcbiker at goldray.com (Harold Goldstein) Date: Mon Jun 7 17:09:13 2004 Subject: xsl tutorials Message-ID: <4.1.19990219140707.00995e70@mail.i95.net> Can anyone recommend an XSL tutorial that reflects the state of XSL as implemented in IE5? ------------------------------------------------------------ Harold Goldstein - dcbiker@goldray.com ~~ ,__o ~^^ See the Goldpages: http://goldray.com/ ~_-\ < , ~ o \ Web Development/Internet Training (*)/ (*) ~ / { \ Save The Apes: http://biosynergy.org/bushmeat/ / { o} BIGFACE!?#%! http://bigface.com/ -- you will be released xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Feb 19 19:24:48 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:13 2004 Subject: FW: XML Mail References: <00f301be5c35$20627ca0$0fb919ce@Bertha> Message-ID: <36CDBA46.B13FB101@locke.ccil.org> Robert Hanson wrote: > Am I wrong in saying that the answer would be that the symbol ">" should not > show up in the non-XML text anyway? Wouldn't it be converted to an entity > reference? Like this... > > "]]>" in the text body would become "]]>" The point was that conversions like ">" becomes ">" aren't necessary if you use CDATA sections, only the one bad case of "]]>" which is illegal in plain XML character data as well as terminating the CDATA section, and so must become something like "]]]]>" to end the CDATA section plus "". There are other ways to do it, of course. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
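A minimal sketch of the "]]>" workaround discussed in this thread, written in Python rather than the Perl used earlier. The text is wrapped in a CDATA section, and every "]]>" is split so that "]]" ends one section and ">" begins the next. The name to_cdata is an assumption made for illustration and is not part of XMTP; characters that are illegal in XML altogether (most control characters) are not handled here.

```python
import xml.etree.ElementTree as ET


def to_cdata(text: str) -> str:
    """Wrap text in CDATA sections (hypothetical helper, not from the thread)."""
    # "]]>" would end the section early, so end the section after "]]"
    # and start a fresh one whose first character is ">".
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


if __name__ == "__main__":
    body = "if (a[b[0]]> 1) { ... }"
    doc = "<body>" + to_cdata(body) + "</body>"
    # A conforming parser should hand back the original text unchanged.
    assert ET.fromstring(doc).text == body
```

The parser reports the two adjacent CDATA sections as ordinary character data, so the receiving application sees the original body with no trace of the split.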
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Feb 19 19:28:00 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:13 2004 Subject: XML Information Set Requirements and XP In-Reply-To: <000201be5c37$806df8c0$5118a8c0@kuantech1.quokka.com> References: <36CCF1EB.CD64B705@manhattanproject.com> Message-ID: <199902191926.OAA02471@hesketh.net> At 10:41 AM 2/19/99 -0800, Jeffrey E. Sussna wrote: >I completely agree with Clark. As someone working with real-time XML >streams, I think this is very important. In particular, the whole >notion of "document" needs to be thought through very carefully in >the context of 1999, rather than the context of 1990 when SGML was >developed. One sign of how much the context has changed is this proposal for using XML as the foundation for a network protocol: XP, the Extensible Protocol. http://www.ietf.org/internet-drafts/draft-harding-extensible-protocol-00.txt Other proposals, like XML-RPC, have gone similar places, but this is really _wide_ open. Interesting food for thought. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Orr at Design-Intelligence.com Fri Feb 19 19:34:21 1999 From: Michael.Orr at Design-Intelligence.com (Michael.Orr@Design-Intelligence.com) Date: Mon Jun 7 17:09:13 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: > -----Original Message----- > From: Clark Evans [mailto:clark.evans@manhattanproject.com] > Sent: Thursday, February 18, 1999 9:09 PM > Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999 > (snip) > Constantly viewing XML as a standard for the description > of "data objects" and not "data streams" is a subtle, and > important bias. It is taking object-orientation too far > and discarding parallel stream processing, and it's related > technologies like SAX and SAXON. I'd like to understand this better... Naive question 1: Ignoring the realities of engaging with the recommendation process, just focusing on understanding the nature of your concerns, what types of requirements, modes of thought, etc. would become more prominent if the objects-over-streams bias were suddenly removed? For instance, is one aspect of your thinking a desire to chunk an "XML stream" so that validation can be performed incrementally, while still wishing the XML rec to govern the overall well-formedness (?) of the stream as a whole? Or is this totally off base? I'm groping to understand the point of view you're promoting here... Naive question 2: Is there anything to be learned from the DTDs that have been defined to date for applications that raise stream-ish issues to one extent or another? 
I'm thinking of things like ICE for content syndication, CDF for push channels, CBL and cXML for e-commerce, etc. Not trying to make any particular point here (and I don't know these specs in depth) -- it would just seem they must have engaged some of the questions you're raising and might therefore throw light on the capabilities, drawbacks, and coping paradigms available with current approaches. Just looking for a better understanding of the issues, Mike ---------------------------------------- Michael Orr, CTO, VP R&D Design Intelligence Inc, Seattle WA USA http://www.design-intelligence.com pager:888-688-4609 fax:206-343-7750 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Feb 19 19:36:50 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:13 2004 Subject: Announcement: XML Infoset Requirements Published References: <14028.47599.269952.835288@localhost.localdomain> Message-ID: <36CDBC9A.26665AD5@locke.ccil.org> Ken MacLeod wrote: > It seems that ``Information Sets'' perform a similar role to SGML and > HyTime's ``Property Sets''? Since I'm sure the WG is aware of that, > is there an intentional distinction being made by using a different > name? or is it just a more generic term? "Property set" has specific baggage; "information set" does not. But conceptually they are meant to be similar. Speaking for myself, not the Infoset WG -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... 
(Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Feb 19 19:43:34 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:13 2004 Subject: ANNOUNCE: IBTWSH draft 5 Message-ID: <36CDBE5F.677FD516@locke.ccil.org> Draft 5 of the Itsy Bitsy Teeny Weeny Simple Hypertext DTD is now out at http://www.ccil.org/~cowan/XML/ibtwsh.dtd . There are two main new features: 1) The references to the XML-ized entity sets are no longer commented out, so you'll need them to do validation against IBTWSH. You can retrieve them at: http://www.ccil.org/~cowan/XML/XMLlat1.ent http://www.ccil.org/~cowan/XML/XMLsymbol.ent http://www.ccil.org/~cowan/XML/XMLspecial.ent 2) I have added the elements HTML, HEAD, TITLE, STYLE, and BODY, so that IBTWSH can be used to describe complete documents as well as document parts. Appropriate content models and attribute lists are supplied. If HTML is used, then HEAD, TITLE, and BODY are mandatory. You can suppress the declaration of these elements by setting a parameter entity, for backward compatibility. Note: There is an incompatibility between IBTWSH documents and HTML 4.0 documents, in that under IBTWSH, but not HTML, character data can appear directly within the BODY element without an intervening block element. This may be fixed eventually. Most browsers (as opposed to validators) don't care. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... 
(Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Feb 19 19:52:53 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:13 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <000201be5c37$806df8c0$5118a8c0@kuantech1.quokka.com> Message-ID: <36CDC0F3.58C4D4AE@locke.ccil.org> Jeffrey E. Sussna wrote: > I would like to encourage the XML community to [...] not get bogged > down by the historical baggage of the M in XML. You mean "Mosher"? :-) (For the historically challenged, the "GML" in "SGML" originally stood for Goldfarb, Mosher, and Lorie.) Actually, the "data" vs. "document" dichotomy has been there since the beginning. Goldfarb himself has said: # [M]y chief motivation was information retrieval, not # typesetting. On a more serious note, however, the notion of an "information set" has to do with what information is present in the document/data object, rather than how it is retrieved. The purpose of the Infoset WG is to define what an XML processor must, should, and may tell an application, not to specify the means it uses for doing so. A tree-based API like DOM informs its client of the child->parent relationship by an explicit API call. A stream-based interface like SAX does the same implicitly by the order of "startElement" and "endElement" events. Both provide the information. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... 
(Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From oren at capella.co.il Fri Feb 19 19:59:55 1999 From: oren at capella.co.il (Oren Ben-Kiki) Date: Mon Jun 7 17:09:13 2004 Subject: Fw: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: <03ab01be5c41$b66478a0$5402a8c0@oren.capella.co.il> Michael.Orr@Design-Intelligence.com wrote: >I'd like to understand this better... > >Naive question 1: Ignoring the realities of engaging with the >recommendation process, just focusing on understanding the nature of >your concerns, what types of requirements, modes of thought, etc. would >become more prominent if the objects-over-streams bias were suddenly >removed? Butting into the conversation :-) I can give one example. XSL is defined so that a reasonable implementation needs to store the input document in an "object" - some random access structure, while the output can be emitted as a stream. I don't know if the designers thought of it quite this way, of course. A different set of design choices would have led to an opposite implementation - handling the input document as a stream. This would require that parts of the output document be built as "objects", but not necessary all of it. Given all the talk about the relationship between XSL and XQL, and the real possibility of using an XSL stylesheet to extract a small amount of information from a large (virtual?) input document, this might prove to be an expensive design decision. Of course, it has other merits - XSL stylesheets are somewhat clearer as a result. It would be interesting to see how XQL would address this issue. 
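To make the stream side of this concrete, here is a small Python sketch (not from the original post) of a SAX handler that recovers the child->parent relationship purely from the order of startElement and endElement events, keeping only a stack of open elements instead of a whole tree. The class name ParentTracker and the helper child_parent_pairs are assumptions made for illustration.

```python
import xml.sax


class ParentTracker(xml.sax.ContentHandler):
    """Collect (element, parent) pairs from event order alone (illustrative)."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of the elements that are currently open
        self.pairs = []   # (element name, parent name or None), document order

    def startElement(self, name, attrs):
        # The innermost open element, if any, is the parent of the new one.
        parent = self.stack[-1] if self.stack else None
        self.pairs.append((name, parent))
        self.stack.append(name)

    def endElement(self, name):
        # Closing an element makes its own parent the innermost open element.
        self.stack.pop()


def child_parent_pairs(xml_bytes: bytes):
    """Parse a byte string and return the collected (element, parent) pairs."""
    handler = ParentTracker()
    xml.sax.parseString(xml_bytes, handler)
    return handler.pairs
```

Memory use grows with nesting depth rather than document size, which is the trade-off at issue when an input document is large and only a small part of it is wanted.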
Share & Enjoy, Oren Ben-Kiki xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Fri Feb 19 20:09:57 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:09:13 2004 Subject: Confusion about conditional sections Message-ID: <8725671D.006EA1D6.00@d53mta03h.boulder.ibm.com> Ok, I'm confused about a conditional section issue that some of my colleagues were discussing... Once you see a conditional ignore section, can you effectively just scan for parts of the text inside there without actually doing regular parsing? Is there a reason that this cannot be done? The logic is basically this (and assumes that we've already entered the body of an ignore section): while (true) { depth = 0; if (skipped char '<') { if (skipped char '!') and (skipped char '[') depth++; } else if (next char is '>') { if (skipped char ']') and (skipped char '>') depth--; if (!depth) return; } else if (skipped char is not valid XML char) { emit error } } Here 'skipped char' means that it was skipped over in the content if it was the target character. I can't help but think that this logic would fail to deal with a number of issues, but I can't think of any right off hand. What is missing from this picture? Also, does the specification of a conditional section basically imply that you cannot have a ']]>' character anywhere in an ignored section, even if its in a literal? So something like this: text of my entity"> ]]> would fail according to the spec because the ]]> character is not allowed inside an ignored conditional section, even if in a place where it is otherwise legal such as in a literal value. 
Is this correct? The above logic is kind of dependent upon this being true I would think, since otherwise it could be fooled. If this is true it would seem to be awfully weird that changing INCLUDE to IGNORE would cause a correct document to break in this way. The spec says that you must parse even the ignored section, but it doesn't say to what extent. The logic above does 'parse' the text in that it looks at every character in there. But it's attempting to do a very low-calorie and fast parse based on knowledge of what can be in a conditional section. Since there is no identifying name at the end of a conditional, to ensure that it's correctly aligned, doesn't the above logic correctly maintain all the required state? It would, though, seem not to catch something like this: ]]> Since it actually does not look at what follows the Message-ID: <36CDCFB5.84A650E@eng.sun.com> roddey@us.ibm.com wrote: > > Ok, I'm confused about a conditional section issue that some of my > colleagues were discussing... > > Once you see a conditional ignore section, can you effectively just scan > for parts of the text inside there without actually doing > regular parsing? Is there a reason that this cannot be done? I'd say you MUST do this. Consider this sort of structure, which I found in a version of the XML Docbook DTD: ]]> ]]> That is, if you tried to parse the contents of the second conditional section and something like the first one hadn't been parsed, you would be in undeserved trouble. - Dave xml-dev: A list for W3C XML Developers.
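The flat scan asked about above can be sketched in Python under one reading of productions 63-65 of the XML 1.0 Recommendation: inside an IGNORE section only the delimiters "<![" and "]]>" matter, and they nest. The function name skip_ignore_section and the choice to raise ValueError are assumptions made for illustration; the check for legal XML characters from the pseudocode is left out.

```python
def skip_ignore_section(dtd: str, pos: int) -> int:
    """Return the index just past the "]]>" that closes an IGNORE section.

    pos must point at the first character of the section's contents, that
    is, just after the "[" that follows the IGNORE keyword.
    """
    depth = 1  # we are already inside one conditional section
    while pos < len(dtd):
        if dtd.startswith("<![", pos):
            depth += 1          # a nested conditional section opens
            pos += 3
        elif dtd.startswith("]]>", pos):
            depth -= 1          # the innermost open section closes
            pos += 3
            if depth == 0:
                return pos
        else:
            pos += 1            # any other character is simply skipped
    raise ValueError("unterminated IGNORE section")


if __name__ == "__main__":
    nested = " a <![ INCLUDE [ b ]]> c ]]> rest"
    assert nested[skip_ignore_section(nested, 0):] == " rest"
```

Because the scan never looks inside literals, a "]]>" within a quoted entity value ends the section early, which is exactly the behaviour the question is worried about.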
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From nate at valleytel.net Fri Feb 19 21:12:32 1999 From: nate at valleytel.net (Nathan Kurz) Date: Mon Jun 7 17:09:13 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: from "Michael.Orr@Design-Intelligence.com" at Feb 19, 1999 11:33:19 AM Message-ID: <199902192112.PAA00726@trinkpad.valleytel.net> > Clark Evans writes: > > Constantly viewing XML as a standard for the description > > of "data objects" and not "data streams" is a subtle, and > > important bias. It is taking object-orientation too far > > and discarding parallel stream processing, and it's related > > technologies like SAX and SAXON. Jumping in on Clark's side, there does seem to be a strong bias for using XML solely as a source for recreating pre-existing "data objects". This is a fine use, but seems unnecessarily limiting. Michael Orr writes: > For instance, is one aspect of your thinking a desire to chunk an > "XML stream" so that validation can be performed incrementally, > while still wishing the XML rec to govern the overall > well-formedness (?) of the stream as a whole? Or is this totally off > base? I'm groping to understand the point of view you're promoting > here... To me at least, validation isn't that important. So long as the parts I choose to parse are well-formed, I don't care if the whole stream is well-formed. And if the stream is continuous (for example, an XML stock ticker) even the concept of a well-formed stream seems tenuous. Objects seem to carry with them a requirement for full and faithful representation. 
Your first task is always to reconstitute the entire 'document' object to its pre-XML state (ie, parse the entire document and build a complete object model). This doesn't need to be the case, of course, but it seems to be the default. With a stream, one feels more free to pick and choose, keeping only the parts that are relevant to the task at hand. And better, you can work with those parts as soon as you have them, as there is often no need to wait for an entire object to be created. For some reason, working with a half-parsed object seems wrong. Working with half a stream seems perfectly natural. All semantics, I suppose, but important ones. --nate Nathan Kurz nate@valleytel.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Feb 19 21:53:03 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:13 2004 Subject: Confusion about conditional sections References: <8725671D.006EA1D6.00@d53mta03h.boulder.ibm.com> Message-ID: <36CDDD26.77529605@locke.ccil.org> roddey@us.ibm.com wrote: > Once you see a conditional ignore section, can you effectively just scan > for parts of the text inside there without actually doing > regular parsing? Is there a reason that this cannot be done? That's definitely what the spec says, based on productions 63-65. Whether it *should* say that is a question. > Also, does the specification of a conditional section basically imply that > you cannot have a ']]>' character [sequence] anywhere in an ignored section, > even if it's in a literal? > > So something like this: > > text of my entity"> > ]]> So it seems. 
Such an entity, BTW, can't be referenced from character data, since "]]>" is illegal there (production 14), but can be referenced from attribute values. As a result, we now have the idiosyncratic property that MyEntity can be declared as above outside a conditional section, or in an INCLUDE conditional section, whereas the declaration cannot appear as above inside an IGNORE conditional section. This point seems to me to belong to the XML Syntax WG, so I have copied this response to xml-editor@w3.org. > Since it actually does not look at what follows the theoretically supposed to either be INCLUDE or IGNORE or CDATA, right? CDATA is impossible within the DTD, and INCLUDE/IGNORE are impossible outside the DTD, so one cannot be inside the other. It's interesting though that you can "comment out" non-XML declarations, or indeed almost any sort of junk, using an IGNORE section: random floating text even looks like instance text ]]> -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Fri Feb 19 22:07:08 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:13 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <199902192112.PAA00726@trinkpad.valleytel.net> Message-ID: <36CDE070.3EB1A411@locke.ccil.org> Nathan Kurz wrote: > And if the stream is continuous (for example, an XML > stock ticker) even the concept of a well-formed stream seems tenuous. It's not clear that XML supports infinitely long streams (where the end-tag of the document element is *never* reached). -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Fri Feb 19 22:29:59 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:09:13 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <36CDE070.3EB1A411@locke.ccil.org> Message-ID: <000d01be5c57$101b7e60$5118a8c0@kuantech1.quokka.com> I think this issue needs to be addressed. 
It may be the case that the stream contains, not one, but many documents, where each information "packet" is a document. Or perhaps the afore-mentioned notion of "document fragment" is introduced, and each packet is a fragment. But it will definitely be necessary to support this kind of processing in some manner. -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of John Cowan Sent: Friday, February 19, 1999 2:07 PM To: XML Dev Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999 Nathan Kurz wrote: > And if the stream is continuous (for example, an XML > stock ticker) even the concept of a well-formed stream seems tenuous. It's not clear that XML supports infinitely long streams (where the end-tag of the document element is *never* reached). -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Fri Feb 19 22:40:55 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:09:13 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <000d01be5c57$101b7e60$5118a8c0@kuantech1.quokka.com> Message-ID: <36CDE60A.962958B@prescod.net> "Jeffrey E. Sussna" wrote: > > I think this issue needs to be addressed. It may be the case that > the stream contains, not one, but many documents, where each > information "packet" is a document. Or perhaps the afore-mentioned > notion of "document fragment" is introduced, and each packet is a > fragment. But it will definitely be necessary to support this kind of > processing in some manner. The role of the information set requirements committee is to make explicit the implicit data model described in vague terms in the XML specification. Their responsibility is not to point XML in a bold new direction. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Intel has a big bull's-eye on its forehead because everyone is gunning for it. But they have to be as nimble and aggressive as they were when they were a small company," Mr. Howe said. "It's easy to be nimble when you're a $5-billion company, it's a whole other thing when you're a $40-billion company." - http://www.globeandmail.ca/gam/ROB/19990217/RPENT.html xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Fri Feb 19 22:49:14 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:09:14 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <36CDE60A.962958B@prescod.net> Message-ID: <000e01be5c59$b87877a0$5118a8c0@kuantech1.quokka.com> That's fine. But it may be the case that the process of evaluating the information set requirements has uncovered an area where XML, either as a syntax or as an information set, is not adequate to handle a reasonable usage of it. Jeff -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Paul Prescod Sent: Friday, February 19, 1999 2:31 PM To: xml-dev Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999 "Jeffrey E. Sussna" wrote: > > I think this issue needs to be addressed. It may be the case that > the stream contains, not one, but many documents, where each > information "packet" is a document. Or perhaps the afore-mentioned > notion of "document fragment" is introduced, and each packet is a > fragment. But it will definitely be necessary to support this kind of > processing in some manner. The role of the information set requirements committee is to make explicit the implicit data model described in vague terms in the XML specification. Their responsibility is not to point XML in a bold new direction. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Intel has a big bull's-eye on its forehead because everyone is gunning for it. 
But they have to be as nimble and aggressive as they were when they were a small company," Mr. Howe said. "It's easy to be nimble when you're a $5-billion company, it's a whole other thing when you're a $40-billion company." - http://www.globeandmail.ca/gam/ROB/19990217/RPENT.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rschoening at unforgettable.com Fri Feb 19 22:57:07 1999 From: rschoening at unforgettable.com (Rob Schoening) Date: Mon Jun 7 17:09:14 2004 Subject: From XML to Latin Was: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <000d01be5c57$101b7e60$5118a8c0@kuantech1.quokka.com> References: <000d01be5c57$101b7e60$5118a8c0@kuantech1.quokka.com> Message-ID: <0003443fa22f31dc_mailit@mail.iname.com> >I think this issue needs to be addressed. It may be the case that the stream >contains, not one, but many documents, where each information "packet" is a >document. Or perhaps the afore-mentioned notion of "document fragment" is >introduced, and each packet is a fragment. But it will definitely be >necessary to support this kind of processing in some manner. It definitely needs to be addressed. XML would do well to take a long hard look at natural language. 
That is, what are the aspects of natural language that allow us to deal with it in both spoken and written form in a variety of media (on paper, on screen, on tape, on air, etc...)? As Jeffrey suggested, we're getting stuck on the markup when markup is only a small part of the picture. If that happens we're going to end up with Latin! It is formal and precise, but not so terribly useful for communication. These days that's a crippling liability. Does the W3C have a set of use-cases for XML? I'd be very interested to see them. Rob > >-----Original Message----- >From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of >John Cowan >Sent: Friday, February 19, 1999 2:07 PM >To: XML Dev >Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999 > > >Nathan Kurz wrote: > >> And if the stream is continuous (for example, an XML >> stock ticker) even the concept of a well-formed stream seems tenuous. > >It's not clear that XML supports infinitely long streams (where the >end-tag of the document element is *never* reached). > >-- >John Cowan http://www.ccil.org/~cowan cowan@ccil.org > You tollerday donsk? N. You tolkatiff scowegian? Nn. > You spigotty anglease? Nnn. You phonio saxo? Nnnn. > Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN >981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > >xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN >981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Fri Feb 19 23:08:28 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:09:14 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <000d01be5c57$101b7e60$5118a8c0@kuantech1.quokka.com> References: <36CDE070.3EB1A411@locke.ccil.org> Message-ID: <3.0.5.32.19990219145833.00c214e0@corp> At 02:27 PM 2/19/99 -0800, Jeffrey E. Sussna wrote: >I think this issue needs to be addressed. It may be the case that >the stream contains, not one, but many documents, where each >information "packet" is a document. Or perhaps the afore-mentioned >notion of "document fragment" is introduced, and each packet is a >fragment. But it will definitely be necessary to support this kind >of processing in some manner. "will definitely"? Any stream has to start, and ought to be able to end gracefully. Why not start and stop once in a while for resynchronization? A series of document-packets (packuments?) should work fine. On the other hand, I'm not sure that XML must be used to solve every problem. It's beyond me what we gain by incompatibly re-implementing RPC, SQL, Java class files, or whatever in XML. Let's solve some new problems. 
If we don't let go of the XML hammer once in a while, everything starts to look like a thumb. wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Fri Feb 19 23:16:38 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:14 2004 Subject: Exceptions or not (Was: RE: ModSAX (SAX 1.1) Proposal) References: <01BE5A9B.2D376B70.jarle.stabell@dokpro.uio.no> Message-ID: <36CDF078.8C287EE6@Eng.Sun.COM> Jarle Stabell wrote: > > > True. I love exceptions and find that they greatly improves the robustness > of applications, but the reason I'm not convinced about whether it is good > to be forced to specify what will be thrown is that this in many cases seem > to require psychic powers of the designer Nah, just a moderately mature design, proven in some real systems. I use the rule of thumb that three different (!) layers must use an interface before it's realistic to call it "stable". > (or that the "real" exceptions > must be catched and converted into an "acceptable" one, which looses > information). Converting to an "acceptable" one can encapsulate the original: SAXException does this, as does InvocationTargetException. A stack of "this error caused that one caused that one ..." is often much more helpful when diagnosing a problem than a root cause. Converting often actually _adds_ information ... like why the error couldn't be recovered. Keep in mind that a normal behavior for exceptions is catching and recovering! 
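A hedged sketch of the "this error caused that one" chain described above. It is not code from this thread: ChainedErrors, ConfigLoadException, readRaw, loadConfig and describe are hypothetical names chosen for illustration, and it relies on the standard Throwable cause mechanism of current Java rather than on SAXException specifically.

```java
import java.io.IOException;

public class ChainedErrors {
    /** Hypothetical application-level exception; not part of SAX. */
    public static class ConfigLoadException extends Exception {
        public ConfigLoadException(String message, Throwable cause) {
            super(message, cause); // keeps the original error as the cause
        }
    }

    /** Stand-in for a low-level layer that fails. */
    public static void readRaw(String name) throws IOException {
        throw new IOException("cannot open " + name);
    }

    /** Middle layer: converts the error and adds why recovery was not possible. */
    public static void loadConfig(String name) throws ConfigLoadException {
        try {
            readRaw(name);
        } catch (IOException e) {
            throw new ConfigLoadException("no fallback available for " + name, e);
        }
    }

    /** Walks the cause chain from the outermost error down to the root cause. */
    public static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (sb.length() > 0) {
                sb.append(" <- ");
            }
            sb.append(c.getClass().getSimpleName()).append(": ").append(c.getMessage());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        try {
            loadConfig("app.xml");
        } catch (ConfigLoadException e) {
            // prints: ConfigLoadException: no fallback available for app.xml <- IOException: cannot open app.xml
            System.out.println(describe(e));
        }
    }
}
```

The wrapper loses nothing here: the root cause is still reachable through getCause(), and the outer message records what the middle layer knew when it gave up.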
- Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Fri Feb 19 23:36:04 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:09:14 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <3.0.5.32.19990219145833.00c214e0@corp> Message-ID: <000f01be5c60$482a4c60$5118a8c0@kuantech1.quokka.com> Sorry; I didn't mean to say it would definitely be necessary to support "endless" streams, just streams. Why is this relevant to Information Set Requirements? Perhaps it's not. But when you raise the level of abstraction, you open the door to asking questions such as "What is a document?" Given that the requirements in question tightly tie themselves to XML 1.0, it may all be moot, and this may all just be grist for the proverbial XML 2.0 mill. And by the way, I agree that XML shouldn't be seen as the solution to all problems. There are, however, a large set of problems to which it naturally wants to lend itself. In any case, let me give an example of how one might want to apply XML to streaming data. Imagine the following XML streaming data "packet": 25 N You can imagine an application that receives and processes these packets one at a time. At the same time, however, the application spools the packets to a file for offline analysis: ... ... ... So, what's the document? Now, given the emphasis on "well-formed", by definition a sub-element in a well-formed XML document is itself well-formed, so maybe you can cheat. 
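One way to read the scenario above, offered as a hedged sketch rather than anything from the original message: treat every packet as a complete document when it arrives, and let the spool file be a larger document whose root simply contains the packets. The element names (reading, speed, dir, log) and the class PacketSpool are made up for illustration, and the sketch assumes packets carry no XML declaration or DOCTYPE, so that they can be concatenated inside a wrapper element.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PacketSpool {
    /** Parses one string as a complete XML document; throws if it is not well-formed. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Checks each packet as its own document, then wraps the whole run
     * in a hypothetical <log> root so the spool is one well-formed document.
     */
    public static String spool(List<String> packets) throws Exception {
        StringBuilder log = new StringBuilder("<log>");
        for (String packet : packets) {
            parse(packet);      // per-packet processing would happen here
            log.append(packet); // packet becomes a sub-element of the spool
        }
        return log.append("</log>").toString();
    }

    public static void main(String[] args) throws Exception {
        List<String> packets = Arrays.asList(
            "<reading><speed>25</speed><dir>N</dir></reading>",
            "<reading><speed>30</speed><dir>NE</dir></reading>");
        Document doc = parse(spool(packets));
        System.out.println(doc.getDocumentElement().getTagName());            // log
        System.out.println(doc.getElementsByTagName("reading").getLength()); // 2
    }
}
```

Under these assumptions both answers to "what's the document?" hold at once: each packet parses alone, and the spooled file parses as a whole.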
Jeff -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Walter Underwood Sent: Friday, February 19, 1999 2:59 PM To: Jeffrey E. Sussna; 'John Cowan'; 'XML Dev' Subject: RE: XML Information Set Requirements, W3C Note 18-February-1999 At 02:27 PM 2/19/99 -0800, Jeffrey E. Sussna wrote: >I think this issue needs to be addressed. It may be the case that >the stream contains, not one, but many documents, where each >information "packet" is a document. Or perhaps the afore-mentioned >notion of "document fragment" is introduced, and each packet is a >fragment. But it will definitely be necessary to support this kind >of processing in some manner. "will definitely"? Any stream has to start, and ought to be able to end gracefully. Why not start and stop once in a while for resynchronization? A series of document-packets (packuments?) should work fine. On the other hand, I'm not sure that XML must be used to solve every problem. It's beyond me what we gain by incompatibly re-implementing RPC, SQL, Java class files, or whatever in XML. Let's solve some new problems. If we don't let go of the XML hammer once in a while, everything starts to look like a thumb. wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Orr at Design-Intelligence.com Sat Feb 20 00:02:01 1999 From: Michael.Orr at Design-Intelligence.com (Michael.Orr@Design-Intelligence.com) Date: Mon Jun 7 17:09:14 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: > -----Original Message----- > From: Nathan Kurz [mailto:nate@valleytel.net] > Sent: Friday, February 19, 1999 1:13 PM > Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999 > (snip) > > Michael Orr writes: > > For instance, is one aspect of your thinking a desire to chunk an > > "XML stream" so that validation can be performed incrementally, > > while still wishing the XML rec to govern the overall > > well-formedness (?) of the stream as a whole? Or is this totally off > > base? I'm groping to understand the point of view you're promoting > > here... > > To me at least, validation isn't that important. So long as the parts > I choose to parse are well-formed, I don't care if the whole stream is > well-formed. And if the stream is continuous (for example, an XML > stock ticker) even the concept of a well-formed stream seems tenuous. If you're not interested in the whole stream (or bigger than parse-sized chunks) being well-formed or valid, then by definition you don't have a document-too-big-to-construct-the-info-set problem. That's why Clark's question suggested to me an interest in scenarios where DTDs or schemas would be used to express the organization of the stream itself, whatever that might mean in an application context. 
Again, I'm not suggesting that the concerns re stream-oriented applications aren't real, I'm just trying to get a conceptual grip on them. Thanks, Mike ---------------------------------------- Michael Orr, CTO, VP R&D Design Intelligence Inc, Seattle WA USA http://www.design-intelligence.com pager:888-688-4609 fax:206-343-7750 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matt at veosystems.com Sat Feb 20 00:28:15 1999 From: matt at veosystems.com (matt@veosystems.com) Date: Mon Jun 7 17:09:14 2004 Subject: XML Product Manager opening at CommerceOne Message-ID: <19990220002758.30678.qmail@veosystems.com> A non-text attachment was scrubbed... Name: not available Type: text Size: 1410 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990220/b4ad5cfc/attachment.bat From colds at nwlink.com Sat Feb 20 01:31:17 1999 From: colds at nwlink.com (Chris Olds) Date: Mon Jun 7 17:09:14 2004 Subject: Confusion about conditional sections Message-ID: <012e01be5c70$8b17fc10$dc59fcc6@salsa.walldata.com> Section 3.4 says "If the keyword of the conditional section is IGNORE, then the contents of the conditional section are not logically part of the DTD." All productions 63-65 say is that all of the "" (sic) in an ignored section have to nest properly, and any other characters have to match 'char'. As to the entity problem, a general entity containing the string "]]>" is not well-formed (whether it is used or not), since 4.3.2 says "An internal general parsed entity is well-formed if its replacement text matches the production labeled content". 
Unfortunately, the next line says "All internal parameter entities are well-formed by definition.", but there is no production that actually defines this constraint. The result of this is that the following example appears to be legal if the value of '%IncludeMe;' is either 'INCLUDE' or 'IGNORE': "> ]]> Note that in place of GE definitions, I used PE definitions, which don't have to match 'content'. Whether or not you can use these entities as part of a DTD is unclear to me. I wouldn't advise it. Bottom line: productions 63-65 explicitly permit the scanning loop you showed. /cco xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Marc.McDonald at Design-Intelligence.com Sat Feb 20 02:02:42 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:09:14 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: I agree with Sussna, by thinking out the concept of a document more fully a number of interesting ideas present themselves. In considering a document to be a stream or information set, it allows a distributive organization over a network. Instead of requiring the entire 'document' to be transferred en-masse as a file, it can be done piece-wise over a stream. Consider this just-in-time manufacturing of the 'document'. Naturally, you can think of cases where only part of the entire document is needed. Subsetting of the document tree is one of the features of XSL. Unifying these 2 ideas provides a new use for a DTD. 
It is not only a means to describe the valid structure of a document, but now can advertise the information available. A site can be described as capable of providing information sets in a set of structures defined by DTDs (or their replacement). A consuming application could request information by a pattern or query which would return the desired subset of information. What could be accomplished is a unified solution to problems addressed and/or recognized in SAX, XSL, queries, DOM, and fragments. It also provides a model for a data server as an XML 'document' constructor. In terms of architecture, it removes bottlenecks. Converting to a file model is expensive if the information is large and it can be used piecemeal on the other side. It is a worst-case solution. A demand-based stream model will create entire documents only if required by the ultimate consumer of the information and otherwise incrementally provide elements. Marc B McDonald, Principal Software Scientist Design Intelligence Inc, Seattle WA http://www.design-intelligence.com ---------- From: Jeffrey E. Sussna [SMTP:jes@kuantech.com] Sent: Friday, February 19, 1999 10:42 AM To: 'Clark Evans'; 'Marcus Carr' Cc: xml-dev@ic.ac.uk Subject: RE: XML Information Set Requirements, W3C Note 18-February-1999 I completely agree with Clark. As someone working with real-time XML streams, I think this is very important. In particular, the whole notion of "document" needs to be thought through very carefully in the context of 1999, rather than the context of 1990 when SGML was developed. If I may grow philosophical for a moment, I believe that XML is at a crossroads. That crossroads can be defined by examining the term "markup". I believe that XML is actually moving away from being "markup" oriented. First of all, one can easily imagine an XML document where all leaf-level elements are EMPTY, and contain all their semantics within attributes. In that case, there is nothing to be "marked up". 
Furthermore, when you apply XML to things like database record interchange, it really isn't a text-oriented environment anymore. I believe that XML points more towards type systems than markup. If you look at a programming language, it generally supports 2 things (I am being very poetic and not rigorous here): defining and instantiating data types, and defining and instantiating operations on data. XML supports the first. It provides a mechanism to create and exchange instances of data types between external systems that will provide the operations on those data. The realization that DTD's are inadequate, and that a more robust schema specification language is needed, points in the same direction. If you approach XML as a type system, the concept of document loses its first-class status (or at least should, in my opinion). It is interesting that the concept of document (even physical document as file) has crept into programming languages, and has caused problems there as well. The C language include directive is a physical rather than a logical mechanism. When you try to build a database-driven incremental build system, includes become problematic. I would like to encourage the XML community to 1) pay attention to the lessons of 30 years of development in the arena programming and type languages, and 2) not get bogged down by the historical baggage of the M in XML. Jeff Sussna -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Clark Evans Sent: Thursday, February 18, 1999 9:09 PM To: Marcus Carr Cc: xml-dev@ic.ac.uk Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999 Marcus Carr wrote: > I think you might be applying a meaning to that > phrase that it doesn't deserve - it doesn't call XML > a document standard, it uses the term "XML document", > with document defined in the XML recommendation as: > > "A data object is an XML document if it is well-formed, > as defined in this specification. 
A well-formed XML document > may in addition be valid if it meets certain further constraints." > > This allows you to use the phrases "XML data object" and > an "XML document" interchangeably. > > This isn't incongruous with stream markup - you just need to > consider the stream as an XML document. Seriously though, you > probably wouldn't have the same concerns about "XML data object" ... My concerns would be even greater. This conjures up in my mind a Java or C++ object where the complete stream has to be loaded in memory (or some other random-access medium) before it can be used. Yes, I know you can have a multi-threaded implementation so that you can start using the data object before it finishes reading, etc. However, given the object model it is *reasonable* for the naive programmer to ask for something at the _end_ of the stream. This will cause the call to block until the stream ends. If "data stream" processing was treated with *equal* importance by the W3C committees, then they would see, in many cases, that this complementary approach is at least as good as, or in some cases far superior to, a "data object" approach. Constantly viewing XML as a standard for the description of "data objects" and not "data streams" is a subtle and important bias. It is taking object-orientation too far and discarding parallel stream processing, and its related technologies like SAX and SAXON. :) Clark xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sat Feb 20 07:48:49 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:09:14 2004 Subject: ModSAX (SAX 1.1) Proposal References: <3994C79D0211D211A99F00805FE6DEE249BF8D@exchny15.corp.smb.com> <14026.57951.26146.946129@localhost.localdomain> Message-ID: <36CCCEAC.612D7AD1@jclark.com> I don't see the point of this. It doesn't seem to buy me anything over what I can already do using normal programming language features:

  interface PingParser extends Parser {
    void setPingHandler(PingHandler handler);
  }

  interface PingHandler {
    void ping();
  }

  void registerPingHandler(Parser parser, PingHandler handler) {
    try {
      ((PingParser)parser).setPingHandler(handler);
    }
    catch (ClassCastException e) {
      // ...
    }
  }

I can do the same thing in COM with QueryInterface, or in C++ with RTTI. The ModHandler class seems particularly useless. It just creates a completely unnecessary dependency between handler classes and the SAX package. You could use Object just as well. 
None of this seems to solve the real problem, which is actually defining the handlers that are needed to provide the functionality missing from SAX 1.0 (like the handlers for comments, namespaces etc that were in the previous draft). David Megginson wrote: > > Ingargiola, Tito writes: > > > Why is interface ModHandler empty? Presumably, (an implementation > > of) ModParser is going to need to call methods on its handlers as > > it goes about its business . Will it somehow know that for > > ModHandlers which implement, say, namespace processing, that it > > should call a particular method (I won't even attempt to suggest > > what that method might be :-)? > > Maybe it will help if I walk through a silly example. Here's the > interface: > > public interface PingHandler extends org.xml.sax.ModHandler > { > public abstract void ping (); > } > > Here's how I register it with a ModParser: > > try { > parser.setHandler("com.megginson.handlers.ping", pingHandler); > } catch (SAXException e) { > System.err.println("Parser does not support Ping handlers"); > } > > Here's part of my PingParser class: > > private PingHandler pingHandler; > > public void setHandler (String handlerID, ModHandler handler) > throws SAXNotSupportedException > { > if (handlerID.equals("com.megginson.handlers.ping")) { > pingHandler = (PingHandler)handler; > } else { > throw new SAXNotSupportedException("Unknown handler type: " > + handlerID); > } > } > > In other words, if the class recognises the handlerID, then it will > know how to cast it; if it does not, then it should throw an > exception. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Sat Feb 20 07:56:35 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: <014501be5ca6$9c02ac60$0300000a@othniel.cygnus.uwa.edu.au> -----Original Message----- From: Jeffrey E. Sussna >I would like to encourage the XML community to 1) pay attention to the lessons of 30 >years of development in the arena programming and type languages, and 2) not get >bogged down by the historical baggage of the M in XML. XML came about to solve issues in structured document interchange on the web. Perhaps largely because the language syntax was developed more quickly than linking or stylesheets, people started using XML for a broader range of problems. If XML meets the needs of people working on those broader range of problems, then they can go ahead and use XML, but I would be extremely disappointed (as I'm sure many people who contributed to the original development of XML would be) if XML shifted away from being a solution for structured document interchange. If XML "moves away from being markup oriented" as you suggest is already happening, then XML will no longer be a solution to the very problems it was designed to solve. Long live XML for publishing! James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Full-day XML Tutorial @ WWW8 : http://www8.org/ Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Sat Feb 20 08:32:25 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:15 2004 Subject: Blockland (was Re: XML Information Set Requirements, W3C Note 18-February-1999) Message-ID: <003d01be5cab$d2ada170$4ef96d8c@NT.JELLIFFE.COM.AU> (responding to Marc McM and Jeff S.) From: Marc.McDonald@Design-Intelligence.com >In considering a document to be a stream or information set, it allows >a distributive organization over a network. Instead of requiring the >entire 'document' to be transferred en-masse as a file, it can be done >piece-wise over a stream. Consider this just-in-time manufacturing of >the 'document'. Err, isn't this what entities are about? An entity could be a file or a "piece" on a stream. I guess you mean entity "prefetch" or "fetch on declaration" rather than "fetch on reference" for external entities. The XML SIG briefly considered whether this should be part of XML (I remember raising it) and I think the consensus was that entity management was an implementation issue to be dealt with by PIs or some other layer; XML does not mandate a specific entity-fetch policy, though, certainly, the expectation is that entities are fetched on reference. Fetching on declaration would be a bad default policy, because many entities are just link ends, and will never be traversed. The Xlink "auto|user" option will bring a nice possibility for smarter entity fetching: I would like for XLink to also include a priority indicator (e.g., a number) on links to indicate the fetching priority. 
So a company logo can arrive first, the content second, and the advertising last, for example; or so that data is only fetched after the script to run the data has been fetched. > Naturally, you can think of cases where only part of the entire >document is needed. Subsetting of the document tree is one of the >features of XSL. XML deliberately culled three features from SGML to allow this: * a special kind of ENTITY attribute called CONREF; an element with that could either have content directly specified, or it could point to some other entity. * a special kind of ENTITY attribute called SUBDOC, which meant that the entity referred to was a document with its own DTD and local ID namespace. * data attributes are attributes on entity declarations: you could use them, conceivably, to specify the prefetching attributes for the entity's resource. Actually, you could also use PIs for this, and even (yuck) special elements at the head of your document (to simulate the data attributes). HyTime and SMIL also could be used to support prefetching policy. >Unifying these 2 ideas provides a new use for a DTD. It is not only a >means to describe the valid structure of a document, but now can >advertise the information available. A site can be described as >capable of providing information sets in a set of structures defined >by DTDs (or their replacement). A consuming application could request >information by a pattern or query which would return the desired >subset of information. This is more like what RDF is attempting: to provide a way to describe a resource, so that applications can determine whether the schema being used is one that they understand. This is what para 2 of http://www.w3.org/TR/PR-rdf-syntax/#intro seems to suggest. >In terms of architecture, it removes bottlenecks. Converting to a file >model is expensive if the information is large and it can be used >piecemeal on the other side. It is a worst-case solution. 
A >demand-based stream model will create entire documents only if >required by the ultimate consumer of the information and otherwise >incrementally provide elements. I think in your mind is the idea that there is only one big fat document; hence "entire document". If elements are provided "incrementally", each of them is a document. Together they are used for a "publication", not for an "entire document". Jeff Sussna wrote: >If you approach XML as a type system, the concept of document loses >its first-class status (or at least should, in my opinion). XML is not a type system. A document is a graph of elements, data, comments and PIs with * an ID namespace * optionally some element type declarations * optionally some entity declarations and notation declarations * optionally namespace declarations which allow local type names to be qualified by a URI In other words, the document is the block mechanism for metadata and namespaces for a subtree of the entire hyper-document. XML is a labelling notation, not a type system. If the document loses its first-class status, which of these things should be gotten rid of? Do you want arbitrary scoping of IDs, element type declarations, entity declarations, notation declarations and namespaces? If so, you need some block mechanism to allow these. If not, what are you proposing: universal scope? All typing to be performed out-of-band (i.e., external M)? > It is >interesting that the concept of document (even physical document as >file) has crept into programming languages, and has caused problems >there as well. The C language include directive is a physical rather >than a logical mechanism. When you try to build a database-driven >incremental build system, includes become problematic. Ah, so your point is relational databases don't support XML entities. Actually, relational databases support XML entities but not XML elements. 
I hope we are not going to lurch into some discussion of the benefits of relational models rather than network models...please please please leave that for some other mailing list. >I would like to encourage the XML community to 1) pay attention to the >lessons of 30 years of development in the arena programming and type >languages, and 2) not get bogged down by the historical baggage of the >M in XML. Has history shown us that blocks/functions/classes/modules/packages are bad things? On the contrary, the lesson of the last 30 years of development is that it is vital to large systems to be able to package things neatly: I would say that XML needs to enhance the possibilities of what a document is, not get bogged down by this historical baggage of relational databases. Indeed, history has shown us that when people try to avoid the M, they make monolithic, proprietary, binary systems that are hard to maintain or distribute, and which don't allow incremental enhancements or data annotation, or which they suddenly find leave out major parts (notably, internationalization) or which can only be used by gurus and those with specialist tools. Look at the graveyard of compound document systems. However, if you are also saying that there is enormous scope for reconciling declarative programming and X*L, then I certainly agree with you. I certainly expect to see some kind of prolog-(i.e., the logic programming language)-in-XML system sometime (perhaps RDF provides part of this), for example. Rick Jelliffe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Sat Feb 20 13:30:01 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:15 2004 Subject: ModSAX (SAX 1.1) Proposal Message-ID: <001601be5cd4$6f4704e0$c9a8a8c0@thing2> From: James Clark >The ModHandler class seems particularly useless. It just creates a >completely unnecessary dependency between handler classes and the SAX >package. You could use Object just as well. Rather strongly stated, but this was my initial reaction as well. And I have not been able to think of a counter argument. Bill From simonstl at simonstl.com Sat Feb 20 13:47:07 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <014501be5ca6$9c02ac60$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <199902201346.IAA17027@hesketh.net> Jeffrey E. Sussna wrote: >I would like to encourage the XML community to >1) pay attention to the lessons of 30 years of development in the arena >programming and type languages, and >2) not get bogged down by the historical baggage of the M in XML. 
And then Paul Prescod wrote: >The role of the information set requirements committee is to make explicit >the implicit data model described in vague terms in the XML specification. >Their responsibility is not to point XML in a bold new direction. And then James Tauber wrote: >XML came about to solve issues in structured document interchange on the >web. Perhaps largely because the language syntax was developed more quickly >than linking or stylesheets, people started using XML for a broader range of >problems. > >If XML meets the needs of people working on those broader range of problems, >then they can go ahead and use XML, but I would be extremely disappointed >(as I'm sure many people who contributed to the original development of XML >would be) if XML shifted away from being a solution for structured document >interchange. > >If XML "moves away from being markup oriented" as you suggest is already >happening, then XML will no longer be a solution to the very problems it was >designed to solve. > >Long live XML for publishing! It seems that some of XML's original denizens aren't too happy about proposals for making XML useful in a broader set of fields than document publishing and interchange. Paul pours cold water on having the Infoset group ponder anything new, and James says he'll be disappointed because XML is no longer focused on its original problem set. Why so glum? This is hardly a bomb-throwing anarchist's proposal. XML has proven to have a lot of appeal to people outside of the original document-oriented audience (which includes myself). The delays in tools for managing, creating, linking, and presenting XML documents have left XML without very much to do for documents - presenting unlinked documents in beta viewers isn't especially exciting, and so far XML hasn't made much of a dent on its original claim to be 'SGML for the Web'. 
Instead, XML has moved into a lot of new fields, driven by the needs of programmers who finally have a very standardized syntax for communicating all kinds of information (okay, fine, not binary data) in a way that's easy to debug (dump to file, and you might actually find something!) and (wow!) works across different environments and platforms. Throw in ease of transformation (with XSL or whatever) and it's an amazing package. If you take the stance that XML is about documents and only about documents, it may seem that these proposals for XML are in competition with document-oriented applications. They may even drain resources needed for the further development of document-oriented applications, whether or not they are 'officially' in competition. (Much the way I see XSL in regard to CSS, actually.) The document-based 'hammer' is getting used on screws, from this point of view. On the other hand, these folks are interested in XML, ready to make it work, and in some cases, creating real implementations. Astronomical Instrument Markup Language (AIML) uses XML in support of command and control of remote astronomical devices, building user interfaces and sending commands based on information in XML documents. Coins and MDSAX can pretty much construct a Java program for you using XML (and not just a single vocabulary) to knit together a group of objects. There's a lot of impressive work out there that isn't in the realm of traditional 'documents'. As far as streaming is concerned, it seems like the hardest thing in its way is the prolog and the requirement of a root element. Establish the prolog information at the start of the stream, figure out a way to end a stream, and go. That Extensible Protocol (XP) thing I mentioned earlier might be an easier way to deal with these issues than changing XML itself - it's a draft at the IETF that seems to address many of the streaming concerns, without having to explode documents. 
See http://www.ietf.org/internet-drafts/draft-harding-extensible-protocol-00.txt for details. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com From ricko at allette.com.au Sat Feb 20 14:19:35 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: <006401be5cdc$546f1880$4ef96d8c@NT.JELLIFFE.COM.AU> From: Simon St.Laurent > Paul pours cold water on having the Infoset >group ponder anything new, and James says he'll be disappointed because XML >is no longer focused on its original problem set. No, Paul just said that the Info Set group are not charged with doing anything new, just with documenting what exists. And James said he doesn't want anything that may impede XML's usefulness, not that new uses are not welcome. (pauses to spit out some squid from the supposed "coconut bun"...yuch) Indeed, I am sure that ISO SC34 would be very interested in well-reasoned extensions/simplifications/alterations of SGML which have industry support, test implementations, and which move or optimise SGML for other classes of distribution media. Now that the spec is out, everyone who needs something else will try to change it; but it may be better to use the particular tradeoffs of the other medium to create an entirely new (SGML-based, XML-influenced) language or notation (rather like that XML-in-ASN.1 that some of the telecomms people have proposed). 
>If you take the stance that XML is about documents and only about >documents, it may seem that these proposals for XML are in competition with >document-oriented applications. Do you mean "documents" (packages of structured information) or "electronic publishing"? I think you mean the latter. XML-like SGML has been used for years in applications other than publishing: someone told me that Xerox have used an XML-like syntax to deliver copier diagnostics to repairmens' PDAs for almost 10 years now. (Can anyone confirm this?) HyTime was encouraged in part because of CIA interest in languages for orchestrating satellite movements, I have been told, too. Indeed, HyTime grew out of a desire to formally analyse performances of music. Music performances, satellite movements, diagnostic data: these are not "publications" but they all can be "documents". Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Michael.Kay at icl.com Sat Feb 20 14:34:51 1999 From: Michael.Kay at icl.com (Michael.Kay@icl.com) Date: Mon Jun 7 17:09:15 2004 Subject: Announcement: XML Infoset Requirements Published Message-ID: <93CB64052F94D211BC5D0010A80013310EB318@wwmessd3.bra01.icl.co.uk> > > I am happy to announce that the W3C's XML Information Set Working > Group has published its Requirements and Design Principles document at > the following location: > > http://www.w3.org/TR/NOTE-xml-infoset-req > I'm a suspicious sort of bloke: what inference can I make from the absence of any reference to Namespaces in these requirements? Mike Kay xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Sat Feb 20 14:38:46 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: <003701be5cde$c62410a0$0300000a@othniel.cygnus.uwa.edu.au> >It seems that some of XML's original denizens aren't too happy about >proposals for making XML useful in a broader set of fields than document >publishing and interchange. Paul pours cold water on having the Infoset >group ponder anything new, and James says he'll be disappointed because XML >is no longer focused on its original problem set. Let me clarify: I have no problem with XML being used beyond its original problem set. What I have a problem with is the notion that XML should forget trying to solve its original problem set. Jeffrey Sussna was suggesting that XML is moving and should move away from "markup". I'm simply saying there are some people who still want the M in XML. Sure, use XML for other things too. The wide range of applications people are finding for XML excites me. But there are people that want to use XML for markup and they should not be forgotten. And pre-empting those that may label me some sentimental SGML old-timer, can I point out I was not even in high-school when SGML became an ISO standard. I was struck by your clause "XML is no longer focused on its original problem set". Are you saying the W3C has changed its view of what XML is for? 
>The delays in tools >for managing, creating, linking, and presenting XML documents have left XML >without very much to do for documents - presenting unlinked documents in >beta viewers isn't especially exciting, and so far XML hasn't made much of >a dent on its original claim to be 'SGML for the Web'. But this does not mean we should abandon that aim. That's my whole point. If at the end of the day (and we are not there yet, remember) XML solves all sorts of problems in application interoperation, object serialisation and so on but does not solve the problems of document interchange, then XML has failed. James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Sat Feb 20 15:14:23 1999 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <006401be5cdc$546f1880$4ef96d8c@NT.JELLIFFE.COM.AU> Message-ID: <36CED0A8.1664@hiwaay.net> Rick Jelliffe wrote: > > Do you mean "documents" (packages of structured information) or > "electronic publishing"? I think you mean the latter. I think they are hung up in the venerable "what is a document" argument. The answer is, whatever you want it to be as long as you stick to the markup standard. That's not hard. > XML-like SGML has been used for years in applications other than > publishing: someone told me that Xerox have used an XML-like syntax to > deliver copier diagnostics to repairmens' PDAs for almost 10 years now. > (Can anyone confirm this?) I can confirm they used an SGML-like tagging language for a print system. 
It was famous for excluding attributes in the design. It was deployed at USA MICOM. It was considered "that thing in the corner" because in not sticking to the standard it presented its owners with island-of-automation problems. Markup was adopted to integrate documentation production and distribution processes and to enable information lifecycles that were longer than the lifecycles of the host systems. There have been, over the years, many attempts to adopt a procedural/programming design in the context of markup. That is another long story. > HyTime was encouraged in part because of CIA interest in languages for > orchestrating satellite movements, I have been told, too. Indeed, HyTime > grew out of a desire to formally analyse performances of music. That is close. HyTime had its origins in the desire to create a music description language. That necessitated a timing model. It was postulated by several interested observers (from CIA, CALS vendors, and a fellow from God's Brain (inside joke)) that a generalized timing model which included synchronization could be applied to managing very large and distributed enterprise processes which included NASA launches, process/control design for manufacturing (simulation), and so on. Indeed, for the general case, this was true. HyTime had several areas of interest including defining the general classes of hyperlinking and addressing. Timing models for real time systems are difficult to generalize because of issues like "continuous vs discrete", what to do about race conditions, event fanout, etc. The VRML community took these same issues up. It is very thorny across a network with unpredictable delivery (eg, the WWW) for distributed simulations. To summarize, the HyTime models could probably be used for documenting historical performances, but might not work well for real-time control. > Music performances, satellite movements, diagnostic data: these are not > "publications" but they all can be "documents". 
It may be that the reverse is the case. Given what markup does best, it may be that it is indeed a publication. That begs the question of "what is a document" but that question never gets answered outside the parameters of what the standard defines. At the point at which you take the M out of XML, you are defining a different metalanguage for a different set of requirements, therefore, a different standard. This is not the charter of the XML Information Set WG. They have a constrained task which the HyTime efforts have proven is hard enough without more rabbit trails. Think long and hard. There are several interdependent efforts moving in parallel and other language efforts outside the core XML standards which depend on these (eg, X3D). If rabbit trails impact these deliverables, then the decisions by the W3C and W3D consortia to close the WGs to members and invited experts are justified. len From simonstl at simonstl.com Sat Feb 20 15:17:25 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <003701be5cde$c62410a0$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <199902201515.KAA17755@hesketh.net> At 10:38 PM 2/20/99 +0800, James Tauber wrote: >>It seems that some of XML's original denizens aren't too happy about >>proposals for making XML useful in a broader set of fields than document >>publishing and interchange. 
Paul pours cold water on having the Infoset >>group ponder anything new, and James says he'll be disappointed because XML >>is no longer focused on its original problem set. > >Let me clarify: I have no problem with XML being used beyond its original >problem set. What I have a problem with is the notion that XML should forget >trying to solve its original problem set. I would like very much for XML to solve its original problem set; I don't see that, however, as a reason not to pursue other possibilities. It sounded in your earlier message as if you saw modifications of XML to accommodate these as a distraction from the 'core' task of XML for document representation. >Jeffrey Sussna was suggesting that XML is moving and should move away from >"markup". I'm simply saying there are some people who still want the M in >XML. Sure, use XML for other things too. The wide range of applications >people are finding for XML excites me. But there are people that want to use >XML for markup and they should not be forgotten. Should not be forgotten, yes, but privileged? I don't know about that. >I was struck by your clause "XML is no longer focused on its original >problem set". Are you saying the W3C has changed its view of what XML is >for? No, the W3C hasn't (publicly), but the world is doing so every day. XML implementations in the document space have been slow to arrive and underpowered when they do. The current state of browsers and editors speaks to this painfully and vividly. On the other hand, implementations focused on interchange and other data-focused solutions seem to be growing quite happily. While the delays in the document space may be caused by the complexity of the problems involved, I'd like to see the projects outside of that space get some attention as well. 
>>The delays in tools >>for managing, creating, linking, and presenting XML documents have left XML >>without very much to do for documents - presenting unlinked documents in >>beta viewers isn't especially exciting, and so far XML hasn't made much of >>a dent on its original claim to be 'SGML for the Web'. > >But this does not mean we should abandon that aim. That's my whole point. I don't think I've proposed anything that would require abandoning that aim; I don't think Jeffrey Sussna proposed anything of that sort either. It might require considering accomodating other possibilities more seriously, but I don't think abandonment of documents and the Web is on the table. Describing documents as a particular set of other components is definitely on the table (if I read Jeffrey's original proposal correctly), but that is hardly a barrier to using XML to create and manage documents. >If at the end of the day (and we are not there yet, remember) XML solves all >sorts of problems in application interoperation, object serialisation and so >on but does not solve the problems of document interchange, then XML has >failed. Well, we've had plenty of arguments about what constitutes failure on this list. I suspect that supporting development in other areas of XML implementation will lead to more tools and more interest in using XML for document management, but I could be wrong. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Sat Feb 20 15:52:52 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: <007301be5ce9$5acc33e0$4ef96d8c@NT.JELLIFFE.COM.AU> From: len bullard > >> XML-like SGML has been used for years in applications other than >> publishing: someone told me that Xerox have used an XML-like syntax to >> deliver copier diagnostics to repairmens' PDAs for almost 10 years now. >> (Can anyone confirm this?) > >I can confirm they used an SGML-like tagging language for a print >system. It was famous for excluding attributes in the design I know of another *major* printing company who told its customers never to use attributes (they took a bit longer to program). People have tried everything. But can anyone confirm about the diagnostics? Rick xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Sat Feb 20 15:55:47 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <199902201346.IAA17027@hesketh.net> Message-ID: <36CED6DF.45D8951A@prescod.net> "Simon St.Laurent" wrote: > > It seems that some of XML's original denizens aren't too happy about > proposals for making XML useful in a broader set of fields than document > publishing and interchange. Paul pours cold water on having the Infoset > group ponder anything new, and James says he'll be disappointed because XML > is no longer focused on its original problem set. > > Why so glum? I'm not glum. It just is not the mandate of the infoset group to invent new purposes for XML. The infoset group is exactly like a supreme court interpreting -- but not changing -- the constitution, which in this case is the XML specification. The terminology used in the XML specification is "document". Therefore that should be the terminology used by the infoset people. As far as a "document focus" being limiting: XML's current popularity in all sorts of fields indicates that that is not the case. If you take a database and encode it for transmission over a wire then you have a document. If you encode a message from one computer to another then you also have a document. I don't see this view as in any way limiting XML's problem domain. -- Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "Intel has a big bull's-eye on its forehead because everyone is gunning for it. 
But they have to be as nimble and aggressive as they were when they were a small company," Mr. Howe said. "It's easy to be nimble when you're a $5-billion company, it's a whole other thing when you're a $40-billion company." - http://www.globeandmail.ca/gam/ROB/19990217/RPENT.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Sat Feb 20 16:33:06 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: <3.0.32.19990220083057.00b9f2e0@pop.intergate.bc.ca> At 08:49 AM 2/20/99 -0500, Simon St.Laurent wrote: >It seems that some of XML's original denizens aren't too happy about >proposals for making XML useful in a broader set of fields than document >publishing and interchange. Paul pours cold water on having the Infoset >group ponder anything new, and James says he'll be disappointed because XML >is no longer focused on its original problem set. I think that it's massively cool that XML is turning out to be useful in so many areas. I also think that in particular, the problem area people here have been kicking around - how to deal with an endless stream of XML chunks - is one that really needs attention, and I know for a fact that other development groups here and there are grappling with it. There's another problem that may be related and may be distinct; that's the packaging problem - how do you package up a bunch of related chunks of XML (document + stylesheet + schema, or web of hyperlinked documents, or small set of documents comprising a transaction) for delivery? 
Is MIME the answer? Is this the same problem as the infinite-stream problem? I strongly encourage the denizens of this list to invest some time in attacking these problems, and if there are places where standards need to be written, go write them. Where I'd side with Paul Prescod is that XML currently has a big hole in that the spec doesn't say what the "official" contents of an XML document are, and someone needs to write this down. This can be done straightforwardly and I think it needs to be done in the Infoset work. It shouldn't be held back by attempting to solve a whole bunch of other (real, important) problems. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Sat Feb 20 16:41:57 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:15 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 Message-ID: <001a01be5cef$3f94ad40$c9a8a8c0@thing2> From: Paul Prescod >I'm not glum. It just is not the mandate of the infoset group to invent >new purposes for XML. The infoset group is exactly like a supreme court >interpreting -- but not changing -- the constitution, which in this case >is the XML specification. The terminology used in the XML specification is >"document". Therefore that should be the terminology used by the infoset >people. > >As far as a "document focus" being limiting: XML's current popularity in >all sorts of fields indicates that that is not the case. If you take a >database and encode it for transmission over a wire then you have a >document. If you encode a message from one computer to another then you >also have a document. 
I don't see this view as in any way limiting XML's >problem domain. The issue here isn't a matter of inventing new purposes for XML. The issue is more a matter of recognizing what is happening. API like SAX and SAXON appear to have broad applicability beyond applications where everything needs to be read into memory. It isn't just streams, but very large documents too. The W3C's DOM and XSL are far too expensive (and even unnecessary!) for the majority of XML's applications. The real advantage of considering streams is that you are also accommodating very large documents as well. This is an important consideration for the majority of the XML community as it exists today. Things like DOM and XSL and Coins all have their place. But SAX and SAXON and MDSAX are probably more important. You will observe that SAXON 4.0 makes it easy to use DOM as needed, providing good integration between DOM and non-DOM processing. The same is happening with MDSAX/Coins. I think the request here is that the W3C simply give some consideration for large document and stream processing. Not doing so could create real problems for the entire industry. Bill xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Sat Feb 20 17:36:13 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:15 2004 Subject: Documents and Document Fragments (Was RE: XML Information Set Req uirements, W3C Note 18-February-1999) Message-ID: Walter Underwood wrote: > At 02:27 PM 2/19/99 -0800, Jeffrey E. Sussna wrote: > >I think this issue needs to be addressed. 
It may be the case that > >the stream contains, not one, but many documents, where each > >information "packet" is a document. Or perhaps the afore-mentioned > >notion of "document fragment" is introduced, and each packet is a > >fragment. > > [snip] > A series of document-packets (packuments?) > should work fine. I'm not sure whether introducing new terminology - fragments, packuments, etc. - clarifies anything. As I read XML 1.0 there is nothing wrong with interpreting an XML 'file' or 'stream' as being made up of a number of XML documents. Many of the discussions that have taken place on this list have been a little confusing due to the physical and logical notions of a document being merged. To explain - from 2.8, we know that: Hello, world! is a well formed document, and so is this: Hello, world! So, you could say that:
Para 1 Para 2 Para 3
Para 1 Para 2
Para 1 Para 2 Para 3 Para 4
is one document for an issue of a magazine, but it also 'contains' three more documents - one for each article in that issue. A closing element is therefore effectively the end of a document - even if that document may be inside another document (in the *logical* sense in which the word is used in the spec.) I don't think we therefore need the notion of a 'document fragment', because in XML 1.0 terms a fragment *is* a document. Whether this approach is of any use to you obviously depends on what you are doing. In our case we have stored all the data that makes up the articles and issues of a magazine in an object-type database, and then built interfaces onto it that allow any node and its children to be exported as XML, as if they were a document. This means that the notion of a document that we normally have (the physical one) is no good, since all 'documents' are dynamic and can start at any point in the tree. More than that they could be the result of queries which combine nodes from separate areas (say all articles about India, no matter what issue they appear in) or they could be a subset of children from a node (all articles in a certain issue that are by one author). So, this interpretation of a document is crucial in situations of dynamic XML export. As Marc says: > What could be accomplished is a unified solution to problems addressed > and/or recognized in SAX, XSL, queries, DOM, and fragments. It also > provides a model for a data server as an XML 'document' constructor. We now treat our web servers logically as 'XML servers', with either one massive document on or thousands of smaller ones, whichever way you want to slice it. (BTW, DTDs can be dynamically created too, if you're worrying that this presentation only deals with well-formed documents.) Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 
39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Sat Feb 20 17:36:15 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:16 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) Message-ID: On streaming ... I completely agree with - and we have implemented - Marc's notion: > What could be accomplished is a unified solution to problems addressed > and/or recognized in SAX, XSL, queries, DOM, and fragments. It also > provides a model for a data server as an XML 'document' constructor. We now treat our web servers logically as 'XML servers', with either one massive document on or thousands of smaller ones, whichever way you want to slice it. The delivery of those documents in a 'stream' is marked by the opening and closing elements. Anything before the opening element is 'prolog', but is not necessary; it may give the recipient additional information as to what to expect, such as XML version number or DTD. And anything after the final element is not part of the document, so the stream can be 'closed' when the final element is received. I don't think, therefore, that things are as complicated as Simon implies: > As far as streaming is concerned, it seems like the hardest thing in its > way is the prolog and the requirement of a root element. Establish the > prolog information at the start of the stream, figure out a way to end a > stream, and go. 
We already have a way to end the stream - with the closing element. And the prolog is just the prolog for the document. I think part of the problem here is when people try to map the stream itself to a document. You end up with an extra layer of document that is not really part of your data and confuses things. Take Marc's example of regular transmissions of: 25 N What other information does your server need? You have the start and end of stream info with the element tags. You could make it more sophisticated by sending the DTD along too, but otherwise we have everything we need to delineate. BUT ... it would be odd programming practice to then wrap these individual documents in a bigger document that represents the stream, because you are no longer representing your data, you're representing the carrier. (That doesn't preclude storing it for later use wrapped in a containing element, but we are talking about the input stream here.) Which is why I have to disagree with the following comments: John Cowan wrote: > Nathan Kurz wrote: > > And if the stream is continuous (for example, an XML > > stock ticker) even the concept of a well-formed stream seems tenuous. > > It's not clear that XML supports infinitely long streams (where the > end-tag of the document element is *never* reached). Firstly on the level of XML, since it *is* clear that XML does *not* support infinitely long streams. An element is not an element without its closing tag, and a document is an element. But secondly, why would you do this anyway? If you have a series of stock prices being passed down a wire, why do you then want to prefix them by an opening element that says 'this is a stream of stock prices'. It tells us about the medium, not the data - we already know that each packet is a stock price. 
It's a bit like going: MSFT 1000 IBM 1000 You have put into your data information about the data's carrier - 'this is a carrier for stock prices' - which you'd kinda hope the receiving application knew already! If you further think through real-world examples, then this 'open a stream for the rest of the day' method becomes even worse. Take the UK Stock Exchange data. They have seven or eight data sources that pump out data all day long. One has the bid and offer prices as they're changed by market-makers, another would be the volumes of trades, another would be news headlines, and so on. If we say 'here is a document of news headlines' in the morning, and don't send the closing element 'till after tea, then we can't put anything on that wire other than news headlines (and really you shouldn't process anything until you receive that closing element, but I know that's what people are requesting they can do). However, if you treat the wire as 'stateless', you simply send each document as a self-contained entity - news, quotes, trades and so on, as well as types not yet invented. Now, there is nothing wrong with having stream information in the stream itself. Of course you'd want to be able to have: MSFT 1000 MSFT 1002 In other words, the 'stream' contains stream-control data - regular timestamps to synchronise clocks, status information if the stream is to close for maintenance and so on - but that is not part of some super-mega-stock price document. In the past, streams of data like this were encoded with all sorts of checksums and so on, to ensure the accuracy of transmission. But there was little that could be done about the accuracy of the data in its internal relationships. This had to be encoded into all receiving applications, and became very difficult to maintain. Now, however, we can send a DTD down with the data which gives an indication of what the data is meant to look like. 
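The "stateless wire" idea above, where every packet is a self-contained document and no enclosing stream element exists, can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the newline framing (one document per line) and the element names quote, symbol, price and headline are hypothetical, since the thread does not say how documents are delimited on the wire or what they are called.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical receiver: treats each line on the wire as its own XML document.
final class StatelessWire {
    private StatelessWire() {}

    /** Parses every non-blank line as a separate document; returns the root element names. */
    static List<String> rootNames(Reader wire) {
        List<String> names = new ArrayList<String>();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            BufferedReader lines = new BufferedReader(wire);
            String line;
            while ((line = lines.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // blank lines carry no document
                }
                // Each document stands alone: no wrapper element describes the carrier.
                Document doc = builder.parse(new InputSource(new StringReader(line)));
                names.add(doc.getDocumentElement().getTagName());
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("bad document on the wire", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed sample traffic: quotes and news share one wire.
        String wire = "<quote><symbol>MSFT</symbol><price>1000</price></quote>\n"
                    + "<headline>Markets open</headline>\n"
                    + "<quote><symbol>IBM</symbol><price>1000</price></quote>\n";
        System.out.println(rootNames(new StringReader(wire)));
    }
}
```

Because nothing is held open between documents, a new document type can appear on the same wire without the receiver having to wait for a closing tag.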
On failure the recipient knows exactly which node has failed, and could even re-request just that node. (In the past you'd need the whole packet.) Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From martind at netfolder.com Sat Feb 20 18:27:02 1999 From: martind at netfolder.com (Didier PH Martin) Date: Mon Jun 7 17:09:16 2004 Subject: Documents and Document Fragments (Was RE: XML Information Set Req uirements, W3C Note 18-February-1999) In-Reply-To: Message-ID: Hi Mark ..... We now treat our web servers logically as 'XML servers', with either one massive document on or thousands of smaller ones, whichever way you want to slice it. If I understand you well, what you are saying is that a stream could start with the processing instruction the PI indicates that the format following the PI is now a XML format. Then the processor at the other end of the stream would process each begin-end markups as small information units. Is it what you are saying? The concept seems appealing and easy to implement. I guess that the problem resides with the word document and the meaning legacy that this word convey (about 5000 years with the notion of a document as a physical entity with ink on it :-). In our specs we are doing what marketing people call "name extension" use the same word everywhere because it sells. It seems that the word document (again because of all the legacy meaning) convey restricted understanding of what we can do with XML. 
Probably, "information unit" would be more appropriate. I also understand that the W3C has to operate with legacy too. That legacy is called a file, and most of the time we get the implicit equation document = file. I agree that a file can be mentally perceived as closer to a physical (meaning paper) document than a stream, whose physical-world equivalent is more like a river or a road. So I guess that the word "format" would be more versatile and could be applied both to a document (i.e. file) and to a stream. It would convey the meaning that the content is formatted with XML structure. Then a piece of markup could be called an information unit and be perceived as a single unit (obviously). Documents can be constructed from "information units" and streams convey "information units". This would have the advantage of applying to both worlds: a) documents = files and b) streams.

------------------
At one time in history we were calling our transportation vehicle a "horse". Imagine the confusion if we were still calling our car a "horse". What would you call a horse then? Is that what we are doing with the word "document"?
----------------

So Mark, your comment about XML servers is useful from the conceptual point of view as well as from the practical one.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Sat Feb 20 19:59:30 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:09:16 2004 Subject: FW: XML Mail In-Reply-To: <36CDBA46.B13FB101@locke.ccil.org> Message-ID: <001201be5d0a$cff07110$d3228018@jabr.ne.mediaone.net> John Cowan wrote: > > > Robert Hanson wrote: > > > Am I wrong in saying that the answer would be that the symbol > ">" should not > > show up in the non-XML text anyway? Wouldn't it be converted > to an entity > > reference? Like this... > > > > "]]>" in the text body would become "]]>" This is of course how the text "]]>" would be escaped in a text block. The issue of dealing with "]]>" arises when we are in a CDATA block. In this case "]]>" remains CDATA, so John's suggestion about creating two CDATA sections is a good one. > > The point was that conversions like ">" becomes ">" aren't > necessary if you use CDATA sections, only the one bad case of > "]]>" which is illegal in plain XML character data as well as > terminating the CDATA section, and so must become something like > "]]]]>" to end the > CDATA section plus " plus ">". There are other ways to do it, of course. > > -- What I have done in XMTP is to CDATA all text based non-xml bodies (i.e. mime:BODY) but escape the text nodes of mime headers. e.g. Message-ID: becomes: <sample@somewhere> as opposed to: <[CDATA[]]> just because, at the moment, this results in an average slightly larger document (granted just a few bytes) and also because of vague stylistic leanings. The best reason to <[CDATA[ the bodies is that it makes looking at enclosed HTML ever so much easier. 
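The two-CDATA-section workaround discussed above can be sketched in a few lines of Java. This is a minimal illustration, not XMTP's actual code; the class and method names are hypothetical. The only case needing treatment is the sequence "]]>", which is split so that "]]" closes one section and ">" opens the next.

```java
// Hypothetical helper, not taken from XMTP: wraps arbitrary text in CDATA.
final class CdataEscape {
    private CdataEscape() {}

    /** Returns text wrapped in one or more CDATA sections. */
    static String wrap(String text) {
        // "]]>" would terminate the section early, so end the current
        // section after "]]" and start a new one just before ">".
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        System.out.println(wrap("plain text"));
        System.out.println(wrap("if (a[b[0]]>c) {}"));
    }
}
```

A parser concatenating the adjacent sections recovers the original text, so no entity escaping of "<" or "&" is needed inside the body.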
Is there a good reason to prefer one over the other for headers?

Jonathan Borden
http://jabr.ne.mediaone.net

From richard at goon.stg.brown.edu Sat Feb 20 23:57:27 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz III)
Date: Mon Jun 7 17:09:16 2004
Subject: Confusion about conditional sections
Message-ID: <199902202357.SAA17250@goon.stg.brown.edu>

> Once you see a conditional ignore section, can you effectively...
> just scan for parts of the text in side there without
> doing regular parsing?

As several others have pointed out, productions 61-65 don't quite say this. You still have to be sure that conditional sections are nesting correctly, even inside of ignored sections. Just to re-quote the prose section that explains this point:

> Note that for reliable parsing, the contents of even ignored
> conditional sections must be read in order to detect nested
> conditional sections and ensure that the end of the outermost
> (ignored) conditional section is properly detected.

In practical use, I'm not sure how useful this behavior is, because one of the main uses of conditional sections, for many shops, is to divert SGML and XML portions of a DTD into their own conditional sections. You want to be able to simply cut and paste, and not worry about whether quoted strings inside markup or comments contain a ]]> sequence.

If the reason the standard was worded the way it was is to make processing simple, then I can only shrug. XML is looking as though it is going to become a monster.
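The rule quoted from the spec (count every nested "<![" and "]]>" even inside an ignored section) can be sketched as a small Java scanner. This is an illustrative sketch only; the class and method names are hypothetical and it handles nothing but the delimiter counting.

```java
// Hypothetical scanner illustrating the nesting rule for ignored conditional sections.
final class IgnoreScanner {
    private IgnoreScanner() {}

    /**
     * Given DTD text and the index just past the "[" that opens an IGNORE
     * section, returns the index just past the matching "]]>", or -1 if
     * the section is never closed.
     */
    static int endOfIgnoreSection(String dtd, int from) {
        int depth = 1; // we are already inside one section
        int i = from;
        while (i < dtd.length()) {
            if (dtd.startsWith("<![", i)) {
                depth++; // nested conditional section opens
                i += 3;
            } else if (dtd.startsWith("]]>", i)) {
                depth--; // some section closes
                i += 3;
                if (depth == 0) {
                    return i;
                }
            } else {
                i++;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String nested = "<![IGNORE[ a <![INCLUDE[ b ]]> c ]]> tail";
        System.out.println(nested.substring(endOfIgnoreSection(nested, 10)));

        // The scanner does not know about comments or quoted strings,
        // so a "]]>" inside a comment ends the section early.
        String comment = "<![IGNORE[ <!-- ]]> --> ]]>";
        System.out.println(comment.substring(endOfIgnoreSection(comment, 10)));
    }
}
```

The second call shows the cut-and-paste hazard described above: a scanner that only counts delimiters treats the "]]>" inside the comment as the end of the ignored section.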
Asking parsers to pay attention to strings and comments inside of ignored conditional sections is not going to make it much more difficult to grok than it already is, and it may eliminate a source of problems. ================= test.xml ================== ]> à ================= test.dtd ================== ]]> sequences in either of these - contexts. --> -->"> "> "> ]]]> ]]> Richard Goerwitz Scholarly Technology Group xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From donpark at quake.net Sun Feb 21 00:37:01 1999 From: donpark at quake.net (Don Park) Date: Mon Jun 7 17:09:16 2004 Subject: ModSAX (SAX 1.1) Proposal Message-ID: <019201be5d32$15b69db0$2ee044c6@arcot-main> >The ModHandler class seems particularily useless. It just creates a >completely unnecessary dependency between handler classes and the SAX >package. You could use Object just as well. Two reasons: 1. Using java.lang.Object will make ModSAX Java-dependent. 2. Using ModHandler interface adds some compile-time type-check. Only penalty is increasing the number of interfaces a class implements which does affect the performance during runtime. Very little penalty in most cases. Most Java VM implementations search the interface list back to front so that most often used interface (i.e. org.w3c.dom.Node) should be the last interface in the 'implements' list. Best, Don Park Docuverse xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sun Feb 21 01:32:23 1999 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 17:09:16 2004 Subject: ModSAX (SAX 1.1) Proposal References: <019201be5d32$15b69db0$2ee044c6@arcot-main> Message-ID: <36CF5AB7.A3722BDD@jclark.com> Don Park wrote: > > >The ModHandler class seems particularily useless. It just creates a > >completely unnecessary dependency between handler classes and the SAX > >package. You could use Object just as well. > > Two reasons: > > 1. Using java.lang.Object will make ModSAX Java-dependent. But SAX already uses java.lang.String. How will avoiding java.lang.Object reduce the degree of Java dependency? When translating into other languages, you already have to translate java.lang.String into the appropriate type (eg BSTR for COM)? You can do the same for java.lang.Object (eg use IUnknown for COM). > 2. Using ModHandler interface adds some compile-time type-check. Using ModHandler doesn't provide type-safety. Suppose I do: parser.setHandler("org.xml.sax.namespace", nsHandler); Now the handler org.xml.sax.namespace will need to be of some specific type, org.xml.sax.NamespaceHandler, say. What needs to be checked is that nsHandler is of type org.xml.sax.NamespaceHandler. Using ModHandler doesn't do that. In fact there's a much higher degree of type-safety not using ModParser and ModHandler at all. 
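To make the type-safety point concrete, here is a minimal Java sketch. Apart from the names ModHandler and setHandler, which come from the proposal under discussion, everything in it is assumed for illustration (SketchModParser, fireStartNamespace, and the shape of NamespaceHandler and PingHandler); it is not the actual ModSAX API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the proposed ModSAX types; real signatures may differ.
interface ModHandler {}

interface NamespaceHandler extends ModHandler {
    void startNamespace(String prefix, String uri);
}

interface PingHandler extends ModHandler {
    void ping();
}

final class SketchModParser {
    private final Map<String, ModHandler> handlers = new HashMap<String, ModHandler>();

    // The compiler checks only "is a ModHandler", not
    // "is the kind of handler this identifier expects".
    void setHandler(String id, ModHandler handler) {
        handlers.put(id, handler);
    }

    // The mismatch can only be noticed here, at run time.
    boolean fireStartNamespace(String prefix, String uri) {
        ModHandler h = handlers.get("org.xml.sax.namespace");
        if (!(h instanceof NamespaceHandler)) {
            return false;
        }
        ((NamespaceHandler) h).startNamespace(prefix, uri);
        return true;
    }
}

final class ModHandlerDemo {
    public static void main(String[] args) {
        SketchModParser parser = new SketchModParser();
        PingHandler wrong = new PingHandler() {
            public void ping() {}
        };
        // Wrong handler type for this identifier, yet it compiles.
        parser.setHandler("org.xml.sax.namespace", wrong);
        System.out.println(parser.fireStartNamespace("x", "urn:example"));
    }
}
```

The marker interface narrows the argument from "any object" to "any handler", but the pairing of identifier string and handler type is still unchecked.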
Take my example again:

    interface PingParser extends Parser {
        void setPingHandler(PingHandler handler);
    }

    interface PingHandler {
        void ping();
    }

    void registerPingHandler(Parser parser, PingHandler handler) {
        try {
            ((PingParser)parser).setPingHandler(handler);
        }
        catch (ClassCastException e) {
            // it doesn't support Ping
        }
    }

There's no possibility here of passing the wrong type of handler to setPingHandler.

James

From b.laforge at jxml.com Sun Feb 21 02:51:30 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:16 2004
Subject: New MDSAX and Coins release
Message-ID: <002e01be5d44$65f273a0$c9a8a8c0@thing2>

MDSAX1.0 beta 3 is now available at http://www.jxml.com/mdsax/index.html

This release includes:
o Coins implemented over MDSAX
o A lightweight version of program composition which does not require a DOM.
o Attribute support which is a superset of DTD attribute specification, including global attributes and attributes specific to a parent context.

This is Open Source Software: http://www.jxml.com/License.txt (Full source is included in the download.)

Check it out and please let us know what you think. Thank you.

Bill la Forge
mailto:b.laforge@cybercom.net

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Sun Feb 21 03:38:45 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:16 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: ; from Mark Birbeck on Sat, Feb 20, 1999 at 04:08:24PM -0000 References: Message-ID: <19990221143828.A1439@io.mds.rmit.edu.au> On Sat, Feb 20, 1999 at 04:08:24PM -0000, Mark Birbeck wrote: > ... Take the UK Stock Exchange data. They have seven or > eight data sources that pump out data all day long. One has the bid and > offer prices as they're changed by market-makers, another would be the > volumes of trades, another would be news headlines, and so on. If we say > 'here is a document of news headlines' in the morning, and don't send > the closing element 'till after tea, then we can't put anything on that > wire other than news headlines (and really you shouldn't process > anything until you receive that closing element, but I know that's what > people are requesting they can do). I disagree with that last parenthesised remark. Stream-based parsers do and indeed should process data as it arrives. XML browsers _most certainly_ should do so. Not that I disagree with your overall point (I haven't really given it that much thought), but the above is definitely wrong IMO. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From jtauber at jtauber.com Sun Feb 21 04:02:26 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:16 2004
Subject: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID: <001801be5d4f$0d94aec0$0300000a@othniel.cygnus.uwa.edu.au>

>is one document for an issue of a magazine, but it also 'contains' three
>more documents - one for each article in that issue. A closing element
>is therefore effectively the end of a document - even if that document
>may be inside another document (in the *logical* sense in which the word
>is used in the spec.) I don't think we therefore need the notion of a
>'document fragment', because in XML 1.0 terms a fragment *is* a
>document.

We must be careful when using the word "document" because it does have a specific meaning in the XML spec. It is *NOT* true that a document may be inside another document in the logical sense in which the word is used in the spec. An XML *document* (document in the spec sense) is PROLOG+ELEMENT+optional EPILOG. You can't have PROLOGs in the content of elements, therefore you cannot have documents within documents if you mean document in the spec sense. (Note that all XML documents have a prolog, even if it is empty.)

This is exactly why I introduced the notion of an überdocument in a previous thread on this list about my vision for an OS shell that treats a file system as XML. The reason I used the term überdocument is that terms like "document" and "logical" have a particular meaning in XML.
In the usual sense of the word logical, an überdocument is logically but not physically a document. But in the XML sense of the word logical, it is not logically a document at all.

>Whether this approach is of any use to you obviously depends on what you
>are doing. In our case we have stored all the data that makes up the
>articles and issues of a magazine in an object-type database, and then
>built interfaces onto it that allow any node and its children to be
>exported as XML, as if they were a document. This means that the notion
>of a document that we normally have (the physical one) is no good, since
>all 'documents' are dynamic and can start at any point in the tree.

The key is when you say "*as if* they were a document". An element within an XML document can be cut off and promoted to full-blown "document" status, sure, but while it is an element in an XML document it is not an XML document.

I agree that the notion of a physical document is no good. I have actually submitted a poster to WWW8 entitled "Rethinking websites as single documents" which discusses many of these issues. Basically I argue that the model of SGML document + stylesheet -> physical document shouldn't carry over to the web. We should not see XML document + stylesheet -> web page, because the interrelatedness of web pages on a particular site is far greater than the interrelatedness between separate physical documents. Instead I suggest the view XML document + stylesheet -> web site.

Actually I'm being sloppy here because for scalability reasons I suspect larger sites will be represented by an überdocument rather than a single XML document. Of course an überdocument can always be thought of as an XML document (a logical one in the general sense of logical, not the XML spec sense) just as an element can be made into a full-blown XML document.
>We now treat our web servers logically as 'XML servers', with either one
>massive document on or thousands of smaller ones, whichever way you want
>to slice it.

Yep. This is the idea I'm exploring. I'm just using the term "überdocument" for the "one massive document".

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia
Full-day XML Tutorial @ WWW8 : http://www8.org/
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net

From donpark at quake.net Sun Feb 21 12:54:07 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:16 2004
Subject: XML Information Set Requirements, W3C Note 18-February-1999
Message-ID: <006c01be5d77$3dc0ced0$2ee044c6@arcot-main>

>Nathan Kurz wrote:
>
>> And if the stream is continuous (for example, an XML
>> stock ticker) even the concept of a well-formed stream seems tenuous.
>
>It's not clear that XML supports infinitely long streams (where the
>end-tag of the document element is *never* reached).

I remember bringing this issue up a long time ago under a different name: Endless Documents. There was no single solution to this problem that worked for all situations. For certain applications, using well-formed external entities worked. Low-level data multiplexing works. Protocol tunneling sometimes needs to be used to get around legacy problems.
The fact that the DTD had to be declared at the head of a document and could not be changed or augmented afterward poses problems, but I am hoping that the Schema WG will address this soon.

Don Park
Docuverse

From donpark at quake.net Sun Feb 21 12:54:50 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:16 2004
Subject: ModSAX (SAX 1.1) Proposal
Message-ID: <002501be5d74$d3ea26c0$2ee044c6@arcot-main>

>But SAX already uses java.lang.String. How will avoiding

I agree that the use of String marks SAX as being Java-dependent, but the Object issue is different because there is no obvious replacement for Object. One could use LPVOID for C/C++, but what about other languages? Anyway, this is not a clear-cut issue, so I don't think there is much point in going further with it.

>parser.setHandler("org.xml.sax.namespace", nsHandler);
>
>Now the handler org.xml.sax.namespace will need to be of some specific
>type, org.xml.sax.NamespaceHandler, say. What needs to be checked is
>that nsHandler is of type org.xml.sax.NamespaceHandler. Using
>ModHandler doesn't do that.

It does if org.xml.sax.NamespaceHandler implements ModHandler.

    // The handler type opts in by implementing the marker interface.
    public class NamespaceHandler implements ModHandler {}

    ModParser parser;   // obtained from a ModSAX driver
    try {
        parser.setHandler("org.xml.sax.namespace", new NamespaceHandler());
    } catch (Exception ex) {}

Don Park
Docuverse

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Sun Feb 21 13:52:08 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:17 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) Message-ID: Marcelo Cantos wrote: > (I haven't really given it > that much thought), but the above is definitely wrong IMO. Thanks for your contribution Marcelo. I can feel the frontiers of knowledge being pushed back. My problem is I waste so much time thinking things through, and then carefully penning responses to see if I can contribute to the debate, when really I should just toss a coin. I could then reply "I haven't thought about this, but it's wrong (because it's heads) or it's right (because it's tails)." Thanks again Marcelo, I now have more time for roller-blading and my philately interests. Mark xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Sun Feb 21 14:28:41 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:17 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) Message-ID: Couldn't find a coin - so I suppose I should respond: Marcelo Cantos wrote: > On Sat, Feb 20, 1999 at 04:08:24PM -0000, Mark Birbeck wrote: > >then we can't put anything on that > > wire other than news headlines (and really you shouldn't process > > anything until you receive that closing element, but I know > > that's what > > people are requesting they can do). > > I disagree with that last parenthesised remark. Stream-based parsers > do and indeed should process data as it arrives. XML browsers _most > certainly_ should do so. > > Not that I disagree with your overall point (I haven't really given it > that much thought), but the above is definitely wrong IMO. You seem to have missed the point of the discussion. The question is whether it is legitimate to open a stream of XML with some sort of element like: and then spend the rest of the day sending out things like: MSFT 1000 and then at the end of the day, sending: No-one so far in the discussion has argued that this is good XML - except you Marcelo, but you can be excused because you haven't given it much thought - because if you were validating this you should not (CAN NOT!) say the document 'stockPrices' is valid until you receive the closing element. 
And that would mean you couldn't process the intervening prices until you had validated the entire document, and that would mean your data feed would be useless. So, what people are discussing is whether there is any way of keeping within the principles of XML and still having an 'infinite document' or an 'open-ended document' or whatever. In other words, how can we correctly process those intervening 'stockPrice' elements when we haven't yet had the complete document to which they belong. Now, you just say 'stream-based parsers' *should* do this. But if you think that, back it up. Everyone else in this discussion has said why they are for or against such an approach. If it's obvious to you, then please share. My contribution to the discussion - which I *did* give much thought - was to try and argue that it is not very good programming practice anyway, to open a stream for 8 hours. Instead we should remove the containing 'stockPrices' document, and then send lots of 'stockPrice' documents throughout the day. This has many advantages, such as the ability to maintain consistency with current XML approaches, the ability to send multiple 'types' of data along one wire, and the ability to send a DTD with each document, or even an abbreviated DTD if required. In short, my disagreement was with trying to map 'the stream' to 'the document', rather than to 'a carrier of many documents', and I argued that we already have everything we need in XML 1.0 to implement very powerful stream processing. Regards, Mark xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Sun Feb 21 14:56:18 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:17 2004 Subject: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) Message-ID: James Tauber wrote: > We must be careful when using the word "document" because it > does have a > specific meaning in the XML spec. It is *NOT* true that a > document may be > inside another document in the logical sense in which the > word is used in > the spec. That's right ... and I'm not saying that. I put apostrophes around all sorts of key words (like the word contains) to emphasise that I am not trying to be literal. My meaning was that an element within a document can itself be treated as a document - and still fit with the spec, despite what you say next. > An XML *document* (document in the spec sense) is > PROLOG+ELEMENT+optional > EPILOG. > You can't have PROLOGs in the content of elements, therefore > you cannot have > document in documents if you mean document in the spec sense. > (note that all > XML document have a prolog even if it is empty). I was saying that an element is equivalent to a well-formed document (with an empty prolog) and this gives us certain advantages. For example, if I have in my database: object type issue, with issue number set to 67 and ... a container of articles of which ... the first is article 1 with a container full of paragraphs ... 
of which the first says "This is para 1" and the second says "This is para 2" and the third says "this is para 3" and the second is article 2 with a container full of paragraphs ... of which the first says "This is para 1" and the second says "This is para 2" [I spell this out because if I show it with tags, everyone will think I am referring to one *physical* XML document, which I am not.] then I can export a 'proper' XML document from this, such as:
This is para 1 This is para 2
as well as:
This is para 1 This is para 2 This is para 3
This is para 1 This is para 2
or even the 'proper' document:

This is para 2

All of these are well-formed 'documents' in the logical sense but have no relationship to a physical document of any form.

Of course, if all of your documents (logical) are stored as text files (physical), or to put it another way, if there is a one-to-one mapping between your physical and logical XML documents, then none of this is of any use to you; you will have a lot of trouble querying across documents, and no means of creating dynamic documents. On the other hand, if you have no documents, but thousands of nodes of data in a database that you can export and query, then the difference between a logical document and a physical one is key. (Further, you could also generate an inline DTD from your schema as the prolog to each document, if you wanted. Or just point to an external one.)

I pointed out that all this fits with the XML 1.0 notion of a logical document, in order to stress that we don't need some other terms inventing to cope with these concepts. The fact that the three examples I gave above are all subsets of a greater whole, does not in any way affect that they are all still perfectly acceptable XML documents. We don't then need to go back to the original data and say that because we can get many documents from a bigger document, that document must therefore be referred to as an 'uberdocument'; to quote you:

> Yep. This is the idea I'm exploring. I'm just using the term
> "überdocument"
> for the "one massive document".

But it's still a document (logical), just like the other three. And equally, we don't really need to say that because those three documents came from a greater document they must be 'document fragments'. (I say 'don't really', because there are situations such as getting a parser to select part of a *physical* document, when the term 'fragment' might be useful.)
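
The idea of exporting any stored node as its own well-formed document could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the `Node` class, the `exportDocument` method and the element names `issue`, `article` and `para` are all made up for the example, and a real object database would have its own API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: any node of a stored tree can be exported as a document. */
public class NodeExport {

    /** A stored node: an element name, optional text, and child nodes. */
    static final class Node {
        final String name;
        final String text;                       // null for container nodes
        final List<Node> children = new ArrayList<>();

        Node(String name, String text) {
            this.name = name;
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Escape the characters that may not appear literally in character data. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Serialise a node and its subtree as a single element. */
    static String toElement(Node n) {
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(n.name).append('>');
        if (n.text != null) {
            sb.append(escape(n.text));
        }
        for (Node c : n.children) {
            sb.append(toElement(c));
        }
        sb.append("</").append(n.name).append('>');
        return sb.toString();
    }

    /** Export a node as a document: an XML declaration plus one root element. */
    static String exportDocument(Node root) {
        return "<?xml version=\"1.0\"?>" + toElement(root);
    }

    public static void main(String[] args) {
        Node article2 = new Node("article", null)
                .add(new Node("para", "This is para 1"))
                .add(new Node("para", "This is para 2"));
        Node issue = new Node("issue", null)
                .add(new Node("article", null)
                        .add(new Node("para", "This is para 1"))
                        .add(new Node("para", "This is para 2"))
                        .add(new Node("para", "This is para 3")))
                .add(article2);

        System.out.println(exportDocument(issue));                      // the whole issue
        System.out.println(exportDocument(article2));                   // one article
        System.out.println(exportDocument(article2.children.get(1)));   // one paragraph
    }
}
```

Each call produces a complete document whose root element is whichever node was chosen, which is the sense in which the same data yields many documents without any of them being a physical file.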
To conclude, there's nothing wrong with introducing new terms, but I feel that they must clarify something, or point towards something that has not been addressed before. But as far as I can see, all of the concepts we need to cope with the idea of an 'XML document server', etc., *are* present in XML 1.0. Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Sun Feb 21 16:19:30 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:17 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 In-Reply-To: <199902201346.IAA17027@hesketh.net> References: <014501be5ca6$9c02ac60$0300000a@othniel.cygnus.uwa.edu.au> <199902201346.IAA17027@hesketh.net> Message-ID: <14032.9155.820259.610440@localhost.localdomain> Simon St.Laurent writes: > It seems that some of XML's original denizens aren't too happy > about proposals for making XML useful in a broader set of fields > than document publishing and interchange. Paul pours cold water on > having the Infoset group ponder anything new, and James says he'll > be disappointed because XML is no longer focused on its original > problem set. [Here's a suggestion -- whenever the XML 1.0 spec says "document", quietly substitute the word "packet" in your mind and everything will be OK.] 
Personally, I am very excited about the idea of using XML for streams -- that was one of my strongest motivations during the initial SAX design process, and at least personally, I'm trying hard to ensure that the Information Set is as applicable to event-based APIs as it is to tree-based APIs. Or, to put it more bluntly, Megginson Technologies is making more money helping companies with data-interchange (streaming and non-streaming) than it is in the old document-publishing market, and I'm not so selfless that I'd fail to notice a trend like that. Now, having said all of that, I'd like to go on and say that Paul was absolutely correct -- the Infoset WG is not chartered to steer a bold new course for XML; we're chartered to define the underlying information set of an XML 1.0 document, as defined in the associated W3C recommendation. Personally (again), I'd be in no hurry to rewrite XML 1.0 even if the WG were chartered to do so -- companies are reluctant to invest big dollars in unstable technologies, and with all of the higher-level changes, it's essential that XML 1.0 remain as stable as possible for at least a while longer (modulo a few errata, etc.). All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Sun Feb 21 16:19:38 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:17 2004 Subject: Announcement: XML Infoset Requirements Published In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB318@wwmessd3.bra01.icl.co.uk> References: <93CB64052F94D211BC5D0010A80013310EB318@wwmessd3.bra01.icl.co.uk> Message-ID: <14032.9707.259533.856607@localhost.localdomain> Michael.Kay@icl.com writes: > I'm a suspicious sort of bloke: what inference can I make from the > absence of any reference to Namespaces in these requirements? It means that the WG could declare victory without specific namespace support if it wanted to. XML 1.0 is the bottom layer (a la IP), and Namespaces form a higher layer (a la TCP), and the Infoset could end up working at either level; what will actually come out is yet to be decided, and the WG's current thinking on the subject is unfortunately subject to the W3C's member confidentiality rules. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Sun Feb 21 16:19:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:17 2004
Subject: ModSAX (SAX 1.1) Proposal
In-Reply-To: <36CCCEAC.612D7AD1@jclark.com>
References: <3994C79D0211D211A99F00805FE6DEE249BF8D@exchny15.corp.smb.com> <14026.57951.26146.946129@localhost.localdomain> <36CCCEAC.612D7AD1@jclark.com>
Message-ID: <14032.8174.849020.118617@localhost.localdomain>

James Clark writes:

> I don't see the point of this. It doesn't seem to buy me anything over
> what I can already do using normal programming language features:

> interface PingParser extends Parser {
>   void setPingHandler(PingHandler handler);
> }

[snip]

Thank you very much to James for his comments.

James's suggestion complicates the chain-of-responsibility pattern for SAX programming, and that pattern has proven very productive in SAX work so far. For example, consider the following set-up:

1. the application is a client of
2. an attribute-inheritance filter, which is a client of
3. a ping filter, which is a client of
4. the XP ModSAX driver

Only #3 will implement the PingParser interface, so my application will have to know to register its DocumentHandler with (2) but its PingHandler with (3).

Using the ModParser interface, the application could register both of its handlers with (2), the top-level ModParser; (2) would accept the DocumentHandler object directly (as it must), and would pass the PingHandler along the chain to its parent filter; any ModHandlers that made it to the XP ModSAX driver and were not recognised would cause a SAXNotSupportedException.
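
The registration chain described above could be sketched along these lines. It is a rough illustration, not the ModSAX proposal itself: `ModParser`, `ModHandler` and `PingHandler` are declared locally with an assumed shape, the handler IDs `example.ping` and `example.unknown` are invented, and only `org.xml.sax.SAXNotSupportedException` is a real class from the SAX packages shipped with Java.

```java
import org.xml.sax.SAXNotSupportedException;

/** Sketch of handler registration passed along a filter chain (names are stand-ins). */
public class HandlerChain {

    /** Empty marker interface, standing in for the one under discussion. */
    interface ModHandler {}

    /** Hypothetical handler type used only for this illustration. */
    interface PingHandler extends ModHandler {
        void ping();
    }

    /** Assumed shape of the parser interface: one generic registration method. */
    interface ModParser {
        void setHandler(String handlerID, ModHandler handler) throws SAXNotSupportedException;
    }

    /** End of the chain: this driver recognises no extra handlers at all. */
    static final class Driver implements ModParser {
        public void setHandler(String handlerID, ModHandler handler)
                throws SAXNotSupportedException {
            throw new SAXNotSupportedException("Unrecognised handler: " + handlerID);
        }
    }

    /** A filter that understands nothing itself simply forwards to its parent. */
    static class Filter implements ModParser {
        private final ModParser parent;

        Filter(ModParser parent) {
            this.parent = parent;
        }

        public void setHandler(String handlerID, ModHandler handler)
                throws SAXNotSupportedException {
            parent.setHandler(handlerID, handler);   // not ours: pass it along the chain
        }
    }

    /** The ping filter claims the (made-up) "example.ping" handler ID. */
    static final class PingFilter extends Filter {
        PingHandler pingHandler;

        PingFilter(ModParser parent) {
            super(parent);
        }

        @Override
        public void setHandler(String handlerID, ModHandler handler)
                throws SAXNotSupportedException {
            if ("example.ping".equals(handlerID) && handler instanceof PingHandler) {
                pingHandler = (PingHandler) handler;
            } else {
                super.setHandler(handlerID, handler);
            }
        }
    }

    public static void main(String[] args) throws SAXNotSupportedException {
        PingFilter pingFilter = new PingFilter(new Driver());
        ModParser top = new Filter(pingFilter);      // stands in for the attribute filter

        // The application talks only to the top of the chain.
        PingHandler handler = () -> System.out.println("ping");
        top.setHandler("example.ping", handler);
        pingFilter.pingHandler.ping();

        try {
            top.setHandler("example.unknown", new ModHandler() {});
        } catch (SAXNotSupportedException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The application never needs to know which link in the chain handles which ID; an unclaimed ID surfaces as an exception from the far end.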
> The ModHandler class seems particularly useless. It just creates
> a completely unnecessary dependency between handler classes and the
> SAX package. You could use Object just as well.

This is admittedly a matter of taste, but empty interfaces are common in the Java world as a statement of intent, and they do have the advantage of slightly stronger type checking. I far prefer

    void setHandler (String handlerID, ModHandler handler)
      throws SAXNotSupportedException;

to

    void setHandler (String handlerID, Object handler)
      throws SAXNotSupportedException;

because stupid bugs are more likely to be caught at compilation time rather than runtime, and that's always a win for creating robust code.

> None of this seems to solve the real problem which is actually
> defining the handlers that are needed to provide the functionality
> missing from SAX 1.0 (like the handlers for comments, namespaces
> etc that were in the previous draft).

This design provides the infrastructure for adding those handlers -- once we have a standard way of adding new handlers to SAX without rewriting the interface, we can define some implementations of ModHandler for lexical information, etc., exactly as in the previous draft.

With this new infrastructure, we can also now specifically request (non-)validation, (no) namespace processing, external entity (non-)expansion, etc., and we can do capability discovery in a more robust way. A year or two from now, we won't need to issue SAX 1.2 to handle, say, data-typing.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From paul at prescod.net Sun Feb 21 19:41:00 1999 From: paul at prescod.net (Paul Prescod) Date: Mon Jun 7 17:09:17 2004 Subject: XML Information Set Requirements, W3C Note 18-February-1999 References: <001a01be5cef$3f94ad40$c9a8a8c0@thing2> Message-ID: <36D05C94.C772B698@prescod.net> Bill la Forge wrote: > > From: Paul Prescod > >I'm not glum. It just is not the mandate of the infoset group to invent > >new purposes for XML. The infoset group is exactly like a supreme court > >interpreting -- but not changing -- the constitution, which in this case > >is the XML specification. The terminology used in the XML specification is > >"document". Therefore that should be the terminology used by the infoset > >people. > > > >As far as a "document focus" being limiting: XML's current popularity in > >all sorts of fields indicates that that is not the case. If you take a > >database and encode it for transmission over a wire then you have a > >document. If you encode a message from one computer to another then you > >also have a document. I don't see this view as in any way limiting XML's > >problem domain. > > The issue here isn't a matter of inventing new purposes for XML. > The issue is more a matter of recognizing what is happening. I don't know what you are referring to in my post above. The point of my message was that there is no dichotomy between "document processing" and "stream processing." A document is a stream. You can also have a stream of documents. Therefore the terminology of XML does not need to change to support stream processing. It already supports it! 
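
As a small illustration of "a stream of documents", the sketch below reads a character stream that carries many small XML documents and hands each one to a SAX parser as soon as it is complete. The framing is purely an assumption made for the example (one document per line), as are the class name `DocumentStream` and the element names in `main`; nothing in XML 1.0 prescribes how documents are delimited on a wire.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch: treat a stream as a carrier of many small documents (one per line, by assumption). */
public class DocumentStream {

    /** Parse every non-blank line as its own document; return the root element names seen. */
    static List<String> readDocuments(Reader stream)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        List<String> roots = new ArrayList<>();
        BufferedReader lines = new BufferedReader(stream);
        String line;
        while ((line = lines.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue;                        // blank lines between documents are skipped
            }
            final int[] depth = {0};
            // A fresh parser per document keeps each parse fully independent.
            factory.newSAXParser().parse(new InputSource(new StringReader(line)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String local,
                                                 String qName, Attributes atts) {
                            if (depth[0]++ == 0) {
                                roots.add(qName);    // first start-tag is the document element
                            }
                        }

                        @Override
                        public void endElement(String uri, String local, String qName) {
                            depth[0]--;
                        }
                    });
        }
        return roots;
    }

    public static void main(String[] args) throws Exception {
        // Two different kinds of document travelling over the same stream.
        String wire = "<stockPrice><symbol>MSFT</symbol><price>1000</price></stockPrice>\n"
                + "<headline>Markets open</headline>\n";
        System.out.println(readDocuments(new StringReader(wire)));
    }
}
```

Because every unit on the wire is a complete document, each can be checked for well-formedness and processed on arrival, and the stream itself never has to end.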
Some of the features of XML could support streaming better, but that isn't the infoset group's job. Even in thinking of XML 2.0 there is no dichotomy: any feature that makes XML documents more modular (e.g. local namespaces) also makes streaming easier and vice versa. They are two sides of the same coin (if you'll excuse the pun).

> APIs like SAX and SAXON appear to have broad applicability beyond
> applications where everything needs to be read into memory. It isn't just
> streams, but very large documents too. The W3C's DOM and XSL are far too
> expensive (and even unnecessary!) for the majority of XML's applications.

I don't know what this has to do with my post above either, but I'll respond anyhow. What is an "XML application" and how does one count them? Would 100,000,000 web pages each count as an "XML application"?

> The real advantage of considering streams is that you are also accommodating
> very large documents as well. This is an important consideration for the majority
> of the XML community as it exists today.

There are two basic paths you can take: figure out how to handle extremely large documents, or figure out how to combine small documents into hyperdocuments or "uber-documents." My experience is that the latter requires your systems to be more complex but also allows them to do more complex things. Nevertheless I am totally in favor of supporting the former also. (Did something in my post above indicate otherwise?) Stream-based document processing can be simpler than hyperdocument processing, and I'm in favor of simplicity when it can be achieved.

> I think the request here is that the W3C simply give some consideration for
> large document and stream processing. Not doing so could create real
> problems for the entire industry.

I don't believe that the W3C has forgotten about stream processing. One of the more controversial parts of the XML namespace specification is intimately tied to stream processing (local namespaces).
I think that I can safely say that when XML was being developed streaming uses were as high in the minds of the working group as tree-based uses. Stream based processing has always been more common in the SGML world than tree processing. Okay then, why are the DOM and XSL tree based? Well, the web infrastructure favors small documents inherently. Large streams must be broken up on the server side for performance reasons. Bandwidth, not RAM, is the limiting factor in Web user interfaces. Paul Prescod - ISOGEN Consulting Engineer speaking for only himself http://itrc.uwaterloo.ca/~papresco "In general, as syntactic description becomes deeper, what appear to be semantic questions fall increasingly within its scope; and it is not entirely obvious whether or where one can draw a natural bound between grammar and 'logical grammar'." - Noam Chomsky, 1963 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rschoening at unforgettable.com Sun Feb 21 20:31:02 1999 From: rschoening at unforgettable.com (Rob Schoening) Date: Mon Jun 7 17:09:17 2004 Subject: Announcement: XML Infoset Requirements Published In-Reply-To: <14032.9707.259533.856607@localhost.localdomain> References: <14032.9707.259533.856607@localhost.localdomain> Message-ID: <0003446518ff95cb_mailit@mail.iname.com> >Michael.Kay@icl.com writes: > > > I'm a suspicious sort of bloke: what inference can I make from the > > absence of any reference to Namespaces in these requirements? > >It means that the WG could declare victory without specific namespace >support if it wanted to. 
XML 1.0 is the bottom layer (a la IP), and
>Namespaces form a higher layer (a la TCP), and the Infoset could end
>up working at either level;

This is the crux of the problem. Putting the markup spec at the bottom of a stack, as it were, automatically gives it a more significant status than the higher-level specs. Have the working groups stopped to reflect on the significance of this assumption? This is a really big deal.

This has all the makings of a written language that is too difficult to speak because the idiosyncrasies of the written grammar gum up the "higher" levels.

This is why no one speaks Latin. This is why it was so difficult for programmers to communicate with each other without switching to another language altogether (like UML).

This layered approach to XML is burying its potential.

Rob

From ricko at allette.com.au Sun Feb 21 20:31:20 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:09:17 2004
Subject: Announcement: XML to DDML Converter available
Message-ID: <000801be5dd7$bd0a8e50$62f96d8c@NT.JELLIFFE.COM.AU>

I have put OmniMark scripts to convert from XML markup declarations to DDML under www.ascc.net/xml/en/utf-8/resource_index.html

I have also converted and put at the same site: the XBEL DTD, the DDML DTD, the MathML DTD, our (TEI) Lite and Loose DTD.

The scripts run with OmniMark Light Edition, which is available as a freebie from www.omnimark.com

Please let me know of any errors.

Also under that page, the Wefo Shrzi Chinese sample page is now available using a conservative CSS stylesheet.
Rick Jelliffe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Jon.Bosak at eng.Sun.COM Sun Feb 21 22:52:45 1999 From: Jon.Bosak at eng.Sun.COM (Jon Bosak) Date: Mon Jun 7 17:09:17 2004 Subject: XTech '99 starts in two weeks Message-ID: <199902212157.NAA03255@boethius.eng.sun.com> Final reminder: XTech '99, the third annual West Coast XML conference, starts in two weeks at the San Jose Convention Center. This is the place to learn about the latest developments in XML and related technologies from the people and companies at the center of the XML revolution. XTech itself is sponsored by Sun Microsystems and IBM; the associated interoperability expo, XIO, is sponsored by Microsoft. See http://www.gca.org/conf/xtech99/xtecindx.htm for details and registration. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From david at megginson.com Sun Feb 21 23:02:18 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:17 2004
Subject: Announcement: XML Infoset Requirements Published
In-Reply-To: <0003446518ff95cb_mailit@mail.iname.com>
References: <14032.9707.259533.856607@localhost.localdomain> <0003446518ff95cb_mailit@mail.iname.com>
Message-ID: <14032.26223.790080.443871@localhost.localdomain>

Rob Schoening writes:

[on layering namespaces over XML]

> This is the crux of the problem. Putting the markup spec at the
> bottom of a stack, as it were, automatically gives it a more
> significant status than the higher-level specs. Have the working
> groups stopped to reflect on the significance of this assumption?
> This is a really big deal.

Not necessarily -- from the XML perspective, XML is at the bottom of the stack; from the RDF perspective the RDF data model is at the bottom of the stack and XML is just one possible serial representation of RDF graphs; for the DOM and XSL, XML is one possible source of information to be presented through an API or rendered/converted. Even namespaces could be abstracted to a set of conventions that are not directly tied to XML syntax, though no one has bothered to do that yet, just as TCP does not (in principle) require IP.

> This has all the makings of a written language that is too
> difficult to speak because the idiosyncrasies of the written grammar
> gum up the "higher" levels.

If you are allergic to university lectures on the history of language, STOP READING NOW!
Written language has certain predictable characteristics -- in both English today and in Latin two millennia ago, for example, the written language has tended to be conservative both in its grammar and in its vocabulary, and its canonical usage has tended to serve as a membership function for certain professions and social classes. I know of no case of written grammar 'gumming up' the higher levels, though because of universal education, written English has had considerably more influence on spoken English than written Latin ever had on spoken Latin.

> This is why no one speaks Latin.

Wow! That really busts up my understanding of historical phonology. Actually, if you look at the map, you'll find that the modern forms of Latin are the predominant language(s) in much of south-western Europe, most of the Western hemisphere (with the exception of the anglophone parts of Canada and the U.S.), and scattered parts of Eastern Europe, Africa, and Asia. Latin has changed over two millennia, but no more than proto-Germanic changed over the same period: from the simple perspective of geographical spread, they're both enormous success stories.

> This is why it was so difficult for programmers to communicate with
> each other without switching to another language altogether (like
> UML).

What does that have to do with layering?

> This layered approach to XML is burying its potential.

I'm having a lot of trouble following the argument -- perhaps a simple summary of the main points would help. Do you believe that layering is a problem because it's hard to keep track of or manage the different layers, or do you believe that XML should not be the base layer?

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From marcelo at mds.rmit.edu.au Sun Feb 21 23:26:53 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:17 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: ; from Mark Birbeck on Sun, Feb 21, 1999 at 02:01:05PM -0000 References: Message-ID: <19990222102612.B18930@io.mds.rmit.edu.au> On Sun, Feb 21, 1999 at 02:01:05PM -0000, Mark Birbeck wrote: > Marcelo Cantos wrote: > > (I haven't really given it > > that much thought), but the above is definitely wrong IMO. > > Thanks for your contribution Marcelo. I can feel the frontiers of > knowledge being pushed back. > > My problem is I waste so much time thinking things through, and then > carefully penning responses to see if I can contribute to the debate, > when really I should just toss a coin. I could then reply "I haven't > thought about this, but it's wrong (because it's heads) or it's right > (because it's tails)." You snipped too much and didn't really think at all about what I wrote. Go back to my original post, read the _entire_ paragraph of which you quoted part, above, and get back to me on whether you still think I am in the habit of tossing coins. Cheers, Marcelo -- http://www.simdb.com/~marcelo/ xml-dev: A list for W3C XML Developers. 

From simonstl at simonstl.com Mon Feb 22 02:20:57 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:09:17 2004
Subject: layering specs
In-Reply-To: <0003446518ff95cb_mailit@mail.iname.com>
References: <14032.9707.259533.856607@localhost.localdomain> <14032.9707.259533.856607@localhost.localdomain>
Message-ID: <199902220010.TAA17594@hesketh.net>

Rob Schoening writes:
>This has all the makings of a written language that is too difficult to
>speak because the idiosyncrasies of the written grammar gum up the "higher"
>levels.
>
>This is why no one speaks Latin. This is why it was so difficult for
>programmers to communicate with each other without switching to another
>language altogether (like UML).
>
>This layered approach to XML is burying its potential.

I hope you just mean this particular (mis)layering, not layered approaches in general.

I'm hoping to finish up my essay "Toward A Layered Model for XML" this week - the rough draft is still at http://www.simonstl.com/articles/layering/layered.htm. I've had lots of hits on it, but very few comments back. If anyone thinks layering specs and layering processing is a good or bad idea, please let me know. (Off list if it's only appropriate to the essay, on list if you feel it connects to important XML development issues with a wide audience.)

Never really tried to speak Latin, though my brother did for one fun summer in Rome...

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

xml-dev: A list for W3C XML Developers.

From jtauber at jtauber.com Mon Feb 22 02:22:11 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:18 2004
Subject: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID: <004801be5e09$7efbcb60$0300000a@othniel.cygnus.uwa.edu.au>

Mark and I agree on and are both excited about the value of the concepts we are discussing: namely promoting an element in a document to the status of document element in its very own document.

I'm also interested in the reverse: demoting a document element to the status of a normal element in a larger document.

Our disagreement seems to stem from the fact that I don't believe you have an "XML document" until you serialise as well-formed XML text. That's my understanding of the XML 1.0 REC.

Mark uses the terms "physical" and "logical" XML documents where, by the former I think he means what I think of as an XML document in the sense of the XML 1.0 REC, i.e. serialised text. By the latter I think he means a more abstract representation of the type being developed by the XML Infoset WG.

In fairness to Mark, the terms "physical" and "logical" are used in this way in the Infoset Requirements. However, I would argue that the term "XML document", at least as used in the XML 1.0 REC, is only ever "physical". There is an equivalent logical representation but that is yet to be standardised by the Infoset WG.

Most people probably think I am just being pedantic.

I am.

I'm also trying to follow the spec :-)

Mark Birbeck:
>I put apostrophes around all
>sorts of key words (like the word contains) to emphasise that I am not
>trying to be literal. My meaning was that an element within a document
>can itself be treated as a document - and still fit with the spec,
>despite what you say next.

Oh I agree with you. But only because you say "can itself be *treated* as a document".

[...]

>I was saying that an element is equivalent to a well-formed document
>(with an empty prolog) and this gives us certain advantages.

Again I agree, because you say "is *equivalent* to". My only point was that we (and I don't mean you in particular - sorry if I appeared to single you out) need to be careful with the term "XML document" because it means something quite specific in the XML 1.0 REC.

You are absolutely right that any element in a well-formed document can be treated as a well-formed document (assuming no entity references, etc) but while it is an element within a well-formed document it is not itself an XML document.

[example of objects in your database]

>[I spell this out because if I show it with tags, everyone will think I
>am referring to one *physical* XML document, which I am not.]

Well, you aren't referring to an XML document at all in the XML 1.0 REC sense until you serialise it (or part of it) as an XML document.

>then I can export a 'proper' XML document from this, such as:

[part of database object serialised as XML]

>as well as:

[all of database object serialised as XML]

>or even the 'proper' document:

[a single element serialised from the database]

>All of these are well-formed 'documents' in the logical sense but have
>no relationship to a physical document of any form.

I think my problem may not be so much the use of "document" as the use of "logical" and "physical". You are using (quite correctly in the general sense of the terms) "physical" to mean XML "text" and "logical" to mean the abstract data (perhaps in some database).
But the XML spec doesn't talk like this. An object representation of an XML document is not an XML document according to the spec. For something to be XML (unparsed entities excepted) it must be represented as text with markup and character data. To stress again, I agree with everything you are saying, just pointing out that use of certain terms in the spec is more specific. > Of course, if all of >your documents (logical) are stored as text files (physical), or to put >it another way, if there is a one-to-one mapping between your physical >and logical XML documents, then none of this is of any use to you; XML documents = text files. What you are calling "logical XML documents" aren't "XML documents" in the sense of the XML 1.0 REC. I'm not arguing about the value of what you are talking about doing. I think it's the way to go. I am just trying to be careful with the terminology. >On the other hand, if you have no documents, >but thousands of nodes of data in a database that you can export and >query, then the difference between a logical document and a physical one >is key. (Further, you could also generate an inline DTD from your schema >as the prolog to each document, if you wanted. Or just point to an >external one.) Yep. All exciting stuff. Keep a logical document or documents in a database and export parts of documents or aggregates of documents as XML. >I pointed out that all this fits with the XML 1.0 notion of a logical >document, in order to stress that we don't need some other terms >inventing to cope with these concepts. Does the XML 1.0 REC really have a notion of a logical document? It has a notion of text (what you are calling a physical document) having a logical structure. It is the main point of the XML Infoset to introduce the notion of a logical document. >The fact that the three examples >I gave above are all subsets of a greater whole, does not in any way >affect that they are all still perfectly acceptable XML documents. 

They have the potential to be, if serialised as such.

> We
>don't then need to go back to the original data and say that because we
>can get many documents from a bigger document, that document must
>therefore be referred to as an 'uberdocument'

Hang on. I'm not suggesting anyone *has* to use my word! :-)

I coined the term überdocument originally to mean a hierarchy of XML documents that are treated as if they were a single XML document. To actually be serialised as a single XML document, one would have to handle localised declarations and name clashes (namespaces to the rescue!).

I made up a new word because I essentially wanted to say "these things aren't documents in the sense in which one would normally think of them. An überdocument is an over-arching document representing an entire collection of documents".

>> Yep. This is the idea I'm exploring. I'm just using the term
>> "überdocument"
>> for the "one massive document".
>
>But it's still a document (logical), just like the other three.

Yes. It's a logical document in the sense you've been using the term. It has the potential to be serialised as a single XML document.

> And
>equally, we don't really need to say that because those three documents
>came from a greater document they must be 'document fragments'. (I say
>'don't really', because there are situations such as getting a parser to
>select part of a *physical* document, when the term 'fragment' might be
>useful.)

And likewise I think there are situations where you want to get a parser to treat an XML document (physical document in the sense you've been using the term) as part of a larger document: the überdocument.

>To conclude, there's nothing wrong with introducing new terms, but I
>feel that they must clarify something, or point towards something that
>has not been addressed before. But as far as I can see, all of the
>concepts we need to cope with the idea of an 'XML document server',
>etc., *are* present in XML 1.0.

Not in the XML 1.0 REC.
That's why the Infoset work is being done.

We actually agree on the concepts and their value. I am just being pedantic about using words as they are meant in the XML 1.0 REC.

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia
Full-day XML Tutorial @ WWW8 : http://www8.org/
Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net

xml-dev: A list for W3C XML Developers.

From rschoening at unforgettable.com Mon Feb 22 02:23:35 1999
From: rschoening at unforgettable.com (Rob Schoening)
Date: Mon Jun 7 17:09:18 2004
Subject: Announcement: XML Infoset Requirements Published
In-Reply-To: <14032.26223.790080.443871@localhost.localdomain>
References: <14032.26223.790080.443871@localhost.localdomain>
Message-ID: <0003446aa9c421a3_mailit@mail.iname.com>

> > This layered approach to XML is burying its potential.
>
>I'm having a lot of trouble following the argument -- perhaps a simple
>summary of the main points would help. Do you believe that layering
>is a problem because it's hard to keep track of or manage the
>different layers, or do you believe that XML should not be the base
>layer?

Some of both. I worry that the stack model is deceptive in that it masks the degree of complexity that each spec adds to the entire system.

1) Putting the XML spec at the bottom presupposes that the markup is the essential characteristic of XML.
I would argue that the essential characteristic of XML--what gives it so much potential--is the consistent data model that is shared between the XML spec, DOM, SAX, RDF, etc. I don't see how the XML spec proper is somehow "more essential" to the system than anything else. Granted, it is much easier to formalize a language than an abstract data model, but I see that as a poor objective justification for elevating its status.

The issues at hand are very pragmatic. The success or failure of XML will not turn on the academic aspects of the language spec. It will turn on the degree to which the system as a whole is able to tame the complexity (and thus costs!!) of various information systems. If history is any indication, that complexity (and cost) will be manifest in the software required to process and manage the data. To that end, it seems naive to assume that decisions can be made at the lowest levels without regard to their impact at the higher levels.

2) OSI-style protocol stacks work (or don't work) because the number of interfaces is directly proportional to the number of layers. Ideally seven layers yields six interfaces. XML isn't so neat. This is manifest in the discontent over namespaces. If the namespaces interface was with the XML spec only, I doubt that there would be so much complaining. But ultimately if we're to keep the XML camp together, all of these technologies need to be kept in sync. This is HARD! If N is the number of XML related specs, and every spec can interact with every other, the number of related interfaces will be N(N-1)/2. This is going to get really complex, really fast.

More than anything, we need a *name* for the system that is comprised of all of the specs discussed on this list. Java is a good example. The Java language spec, the virtual machine spec, and the APIs, and the various implementations all fall under the banner "Java". This subtle organization has been instrumental in its success.
Microsoft tried to claim that Java was just the Java language spec, but Sun has arranged things so that it is clear to the objective observer that this is not the case. More importantly, the Java banner has ensured that Java ISVs have gone in a consistent direction. You might not agree with what Sun is doing, but they're doing a hell of a good job doing it. XML is not heading down the same path.

I'd like to see the following:

1) A rallying point for XML related technologies. The psychology of a name is significant. If this list was called XYZ-dev, I think that we'd all be better off. I know a lot of people will think this is frivolous, but I hope there is some thought put into the issue. I'm afraid that XML is going to become, in Bill Gates' words, *just* a markup language. (Please, don't respond by telling me that XML is just a markup language because that is my point.)

2) Some formal analysis of the overall system. At the very least, this could be a set of hypothetical use-cases for the various XML technologies. There is quite a lot of "XML is for ____" commentary on this list these days. Arguing these points in detail is bound to go nowhere. The fact of the matter is that people are going to use XML for all kinds of different things. Of course it does not make sense for all of these potential uses to become guiding principles for the XML initiative. However, without some guidance, these differences will begin to manifest themselves in parochial initiatives (ABC spec, DEF spec, etc) that will be mutually incompatible and ultimately fracture the overall effort.

As you can see, these are *not* technical concerns. I think that it is imperative that we realize that the success and failure of technology rarely turns on objectively technological issues. Focusing on the shortcomings of XML namespaces misses the salient issues entirely. The litany of woes will continue unless something is changed.
Last month it was namespaces, now it is the infoset, next month it is going to be something else. Let's go after the root of these problems.

Rob

xml-dev: A list for W3C XML Developers.

From jtauber at jtauber.com Mon Feb 22 02:24:58 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:18 2004
Subject: XML Information Set Requirements, W3C Note 18-February-1999
Message-ID: <002a01be5dfb$54b560e0$0300000a@othniel.cygnus.uwa.edu.au>

Don Park:
>The fact that DTD had to be declared at the head of a document and could not
>be changed or augmented afterward poses problems but I am hoping that the
>Schema WG will address this soon.

A couple of weeks ago I suggested that one thing I'd like to see tried out is localised declaration, perhaps by an attribute value giving the URI of an external parameter entity to use. Because the scope of such a declaration would be the element, it doesn't address the situation where you want to say "from here on in use these declarations". Any ideas?

James

xml-dev: A list for W3C XML Developers.

From jtauber at jtauber.com Mon Feb 22 02:25:05 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:18 2004
Subject: Latin (was Re: Announcement: XML Infoset Requirements Published)
Message-ID: <002b01be5dfb$558f9440$0300000a@othniel.cygnus.uwa.edu.au>

>This has all the makings of a written language that is too difficult to
>speak because the idiosyncrasies of the written grammar gum up the "higher"
>levels.
>
>This is why no one speaks Latin.

Not wanting to get into too much of a discussion on natural language, but the reason no one speaks Latin is because it is now called French. It has nothing to do with idiosyncrasies of the written grammar making it too difficult to speak. The Romans had no problem with it. Latin just evolved. Same with any other language.

James :-)

xml-dev: A list for W3C XML Developers.

From sunny at comax.com.tw Mon Feb 22 02:29:40 1999
From: sunny at comax.com.tw (Sunny Sun)
Date: Mon Jun 7 17:09:18 2004
Subject: test
Message-ID: <000b01be5e0a$964ac900$b7cb3ed2@multipost2000.comax.com.tw>

test

xml-dev: A list for W3C XML Developers.

From ralph at fsc.fujitsu.com Mon Feb 22 03:03:00 1999
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:09:18 2004
Subject: XML Information Set Requirements, W3C Note 18-February-1999
In-Reply-To: <199902201346.IAA17027@hesketh.net>
References: <014501be5ca6$9c02ac60$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <3.0.5.32.19990221185833.00a8ac70@pophost.fsc.fujitsu.com>

At 08:49 AM 2/20/99 -0500, Simon St.Laurent wrote:
>The delays in tools
>for managing, creating, linking, and presenting XML documents have left XML
>without very much to do for documents - presenting unlinked documents in
>beta viewers isn't especially exciting, and so far XML hasn't made much of
>a dent on its original claim to be 'SGML for the Web'.

Some of us have been acutely aware of this issue for quite a while. My recent post on xlxp-dev, which I quote relevant parts of below, was intended as a reminder that some people are still very much at work in this area.

A question to ask though, is why haven't we seen more of the tools required to bring SGML to the Web? To answer it, reflect on the parade of "standards" and "specifications" developed over the past 10 years to view or print a document. Let's see:

- ISO began work on DSSSL back in the late '80s/early '90s.
- DSSSL was taking too long, so the U.S. DoD developed "Formatted Output Specification Instances" (FOSIs), as a "stop gap."
- FOSIs are supported by ArborText. But FOSIs are pretty complex, so SoftQuad developed a stylesheet specification of its own when it came out with Panorama.
- ISO finally made DSSSL a standard in 1996.
- We (Fujitsu) brought out HyBrick, which supports DSSSL (or DSSSL-Online to be more exact), at the end of that same year.
- A year later we were being asked, "when are you going to support XSL?" (Answer: When it stops moving and we think we need to.)

And how are documents actually distributed? Even by the W3C working groups on XML? In HTML ... sometimes with CSS stylesheets. Any wonder then that we haven't seen more of "SGML/XML on the Web"?

Now for a more optimistic note, the following is from my post of last week on xlxp-dev:

**************************************************************************

I've put some files with XLink/XPointer declarations in them up on the HyBrick Web site at http://www.fsc.fujitsu.com/hybrick/. These files are intended to be accessed over the Web. In the original announcement of HyBrick V0.80, I downplayed HyBrick's Web capabilities. That's because this version - the one that's currently available - does not support proxy servers. So file retrieval over the Web using HyBrick is at the moment problematic. If your network access environment allows you to though, you can see XLink and XPointer at work over the Web by downloading HyBrick and pointing it at:

http://www.fsc.fujitsu.com/hybrick/hubdoc-1.xml

HyBrick doesn't currently have a download progress indicator. If something is happening though, you'll see the status message "Parsing and FOT-Building" in the window frame.

To repeat, HyBrick does not currently support proxy servers, so this may not work from your location. Work is now going on to address this issue and to make HyBrick technology more available in general.

*****************************************************************************

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Mon Feb 22 03:26:27 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:09:18 2004 Subject: Streams, protocols, documents and fragments was: RE: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: <004801be5e09$7efbcb60$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: <001b01be5e12$66f1d6a0$d3228018@jabr.ne.mediaone.net> James Tauber wrote: > > Mark and I agree on and are both excited about the value of the > concepts we > are discussing: namely promoting an element in a document to the status of > document element in its very own document. > > I'm also interested in the reverse: demoting a document element to the > status of a normal element in a larger document. > > Our disagreement seems to stem from the fact that I don't believe you have > an "XML document" until you serialise as well-formed XML text. That's my > understanding of the XML 1.0 REC. > > Mark uses the terms "physical" and "logical" XML documents where, by the > former I think he means what I think of as an XML document in the sense of > the XML 1.0 REC, i.e. serialised text. By the latter I think he > means a more > abstract representation of the type being developed by the XML Infoset WG. > > In fairness to Mark, the terms "physical" and "logical" are used > in this way > in the Infoset Requirements. However, I would argue that the term "XML > document", at least as used in the XML 1.0 REC, is only ever "physical". > There is an equivalent logical representation but that is yet to be > standardised by the Infoset WG. 
>
> Most people probably think I am just being pedantic.
>
> I am.
>
> I'm also trying to follow the spec :-)
>
>

I think this whole discussion is getting muddled because terminology of different domains is being interchanged. Some definitions:

document is defined as in the XML spec. Documents are well formed. When a document fragment is isolated from its parent document, it becomes a standalone document.

A document may contain a prolog; a document fragment may not. A document may contain a !DOCTYPE definition (DTD); a document fragment may not. Hence all document fragments are legal documents but not all documents are legal document fragments.

stream is ambiguous but generally refers to a series of bits or bytes or characters. In general, a stream behaves similarly to a socket.

protocol is layered above a network transport, or socket, and defines a mutually agreed upon mechanism to exchange messages and other data.

So what does this have to do with XML? The canonical example of streamed XML is the stock ticker. Assuming each stock quote is transmitted in a document, the HTTP protocol can employ a particular URL, e.g. http://wherever/quotes/next, to return the next quote as a single document. Suppose we wish to transmit 100 quotes as distinct documents: this does not work with HTTP, which returns a single MIME message response for each request. The solutions would be to 1) employ multipart messages, 2) wrap the quotes in a single document, or 3) use another protocol.

Suppose we use raw sockets? Nothing to prevent sending one document after another down the socket. The end of one document and the start of another are unambiguous assuming the documents are well-formed.

So, the problem here is not one with XML, rather the protocol used to transmit documents, HTTP and SMTP send one MIME message per PDU, streaming protocols can be defined which transmit multiple documents.

Jonathan Borden
http://jabr.ne.mediaone.net

xml-dev: A list for W3C XML Developers.
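
[A minimal sketch of the raw-socket case Jonathan describes: since each document is well-formed, a receiver can find where one document ends and the next begins by tracking element depth. The name `split_documents` and the sample quote markup are illustrative assumptions, not anything from the thread. The scanner also assumes no prolog, comments, CDATA sections or processing instructions, and no `<` or `>` inside attribute values; a real receiver would hand the bytes to a proper XML parser.]

```python
def split_documents(text):
    """Yield each complete top-level element found in `text`.

    Hypothetical helper: `text` stands for the characters read so far from
    a socket carrying one well-formed document after another.
    """
    depth = 0      # number of currently open elements
    start = None   # offset where the current document began
    pos = 0
    while True:
        lt = text.find("<", pos)
        if lt == -1:
            return                 # no more tags in the buffer
        gt = text.find(">", lt)
        if gt == -1:
            return                 # incomplete tag: wait for more data
        tag = text[lt:gt + 1]
        if start is None:
            start = lt             # first tag of a new document
        if tag.startswith("</"):
            depth -= 1             # end-tag closes an element
        elif not tag.endswith("/>"):
            depth += 1             # start-tag opens one; empty tags leave depth alone
        if depth == 0:
            yield text[start:gt + 1]   # document element closed: one whole document
            start = None
        pos = gt + 1


stream = ('<quote symbol="ABC">10.5</quote>\n'
          '<quote symbol="XYZ"><price>3.2</price></quote><heartbeat/>')
documents = list(split_documents(stream))
```

[Here `documents` holds three separate strings, one per document. A trailing, unfinished document is simply not yielded, which matches the point that the boundary is only unambiguous once the document element has closed.]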

From donpark at quake.net Mon Feb 22 03:48:07 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:18 2004
Subject: XML Information Set Requirements, W3C Note 18-February-1999
Message-ID: <001a01be5e15$aa9b6f30$2ee044c6@arcot-main>

>A couple of weeks ago I suggested that one thing I'd like to see tried out
>is localised declaration, perhaps by an attribute value giving the URI of an
>external parameter entity to use. Because the scope of such a declaration
>would be the element, it doesn't address the situation where you want to say
>"from here on in use these declarations".

Why not have "from here on use these declarations" as the default behavior and then introduce 'push' and 'pop' constructs? 'push' would save current declaration settings and the 'pop' would just restore to the saved settings.

Don

xml-dev: A list for W3C XML Developers.
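
[A rough sketch of the push/pop idea in Don's message, modelled as plain data rather than as XML syntax, since no concrete syntax was proposed. The class `DeclarationScope`, its method names, and the declaration names used below are all hypothetical.]

```python
class DeclarationScope:
    """Hypothetical model of 'from here on' declarations with push/pop."""

    def __init__(self):
        self.current = {}   # declarations in force from here on
        self._saved = []    # stack of saved declaration settings

    def declare(self, name, value):
        # A declaration stays in force until it is redeclared or popped away.
        self.current[name] = value

    def push(self):
        # Save a copy of the current settings.
        self._saved.append(dict(self.current))

    def pop(self):
        # Restore the most recently saved settings.
        self.current = self._saved.pop()


scope = DeclarationScope()
scope.declare("title", "declarations-A")
scope.push()
scope.declare("title", "declarations-B")   # overrides only until the pop
inner = scope.current["title"]
scope.pop()
outer = scope.current["title"]
```

[After the `pop`, `outer` is back to the value declared before the `push`, while `inner` saw the override. Copying the settings on `push` is one simple design choice; it trades memory for not having to track individual changes.]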

From Mark.Birbeck at iedigital.net Mon Feb 22 03:55:07 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:09:18 2004
Subject: Streams, protocols, documents and fragments was: RE: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID:

Jonathan Borden wrote:

[after using stock ticker example]
> So, the problem here is not one with XML, rather the protocol used to
> transmit documents, HTTP and SMTP send one MIME message per PDU, streaming
> protocols can be defined which transmit multiple documents.

Succinctly put. Less sure later on though. I'm OK here:

> document is defined as in the XML spec.
> documents are well formed.

Yes. Important to stress because it makes the 'never-closing stream' a non-starter.

> when a document fragment is isolated from its parent document, it
> becomes a standalone document.
>
> a document may contain a prolog. a document fragment may not. a document may
> contain a !DOCTYPE definition (DTD), a document fragment may not. Hence all
> document fragments are legal documents but not all documents are legal
> document fragments.

The term 'document fragment' is not in XML 1.0, and my point was that we don't need new terminology - uberdocuments, document fragments, and so on - to understand these concepts.

All you have said is that an XML document can have a prolog ... or not. If you give me a well-formed 'XML document' I have no way of knowing where that came from.
It could be a standalone text file, or it could be a node from a larger XML document, but where it came from isn't going to help it; it will stand or fall on its own merit - i.e., is it well-formed?

So why confuse things with all these different notions?

Regards,

Mark

xml-dev: A list for W3C XML Developers.

From clark.evans at manhattanproject.com Mon Feb 22 06:29:20 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:09:18 2004
Subject: XML Information Set Requirements, W3C Note 18-February-1999
References: <001a01be5cef$3f94ad40$c9a8a8c0@thing2> <36D05C94.C772B698@prescod.net>
Message-ID: <36D0F857.94C4A8AB@manhattanproject.com>

Paul Prescod wrote:
| I don't believe that the W3C has forgotten about stream processing. One of
| the more controversial parts of the XML namespace specification is
| intimately tied to stream processing (local namespaces). I think that I
| can safely say that when XML was being developed streaming uses were as
| high in the minds of the working group as tree-based uses. Stream based
| processing has always been more common in the SGML world than tree
| processing. Okay then, why are the DOM and XSL tree based? Well, the web
| infrastructure favors small documents inherently. Large streams must be
| broken up on the server side for performance reasons. Bandwidth, not RAM,
| is the limiting factor in Web user interfaces.

This clears a great deal up for me. Thank you. I didn't see the direct relationship between usage of XML for stream processing and local name spaces. Thus, a similar controversy would arise if one proposed local architectures?
*evil grin*

Jeff Sussna wrote:
| If you approach XML as a type system, the concept of document loses
| its first-class status (or at least should, in my opinion).

I think I agree with this. Please correct me, but with Property Sets, each node has a link directly to the 'document root'. I see this as something which deserves consideration (among other things) when the Information Set is defined. Perhaps it can look like this:

    class BaseInfoSetItem {
        // stuff common to both stream and object representations
    };

    class TreeInfoSetNode : public BaseInfoSetItem {
    public:
        // extra stuff that you get for free when the representation
        // is an in-memory graph, database wrapper, or some other
        // complete object with random access.
        TreeInfoSetNode *DocumentRoot();
    };

    class EventInfoSetStack : public BaseInfoSetItem {
        // extra stuff relevant when you have an event based, stack
        // representation of the information in question. This would
        // be the same as the 'visitor' interface for the TreeInfoSet?
    };

Hmm. Just meandering...

Rick Jelliffe replied to Jeff's note:
| XML is not a type system. A document is a graph of elements, data,
| comments and PIs with
|   * an ID namespace
|   * optionally some element type declarations
|   * optionally some entity declarations and notation declarations
|   * optionally namespace declarations which allow local type names to
|     be qualified by a URI
|
| In other words, the document is the block mechanism for metadata
| and namespaces for a subtree of the entire hyper-document.

Hmm. Perhaps this would be a good starting place for the 'definition' of a document. Thus, would it be fair to say that a document is analogous to a database transaction? If so, then my question becomes: How can I express nested blocks?

| XML is a labelling notation, not a type system.

I'm not sure I get the distinction. When you label something aren't you in effect classifying it, i.e., giving it a type, and, isn't a label required to identify type?
| If the document loses its first-class status, which of these things
| should be gotten rid of? Do you want arbitrary scoping of IDs, element
| type declarations, entity declarations, notation declarations and
| namespaces? If so, you need some block mechanism to allow these.

Hmm. Well, I see a stream based system having a stack. Thus, each beginning something puts the element on the stack, and each pops the stack. Thus, I see and as my block mechanisms. Is this too naive?

Don Park wrote:
| Why not have "from here on use these declarations" as the
| default behavior and then introduce 'push' and 'pop' constructs?
| 'push' would save current declaration settings and the 'pop' would
| just restore to the saved settings.

Maybe I'm not getting it, but and provide the push/pop mechanism automagically. The only things that have problems are those "from here on" things. It's unfortunate that SGML backwards compatibility dictates that this is the default behavior... and thus can't provide the push/pop mechanism?

*dazed* I think I need to go back and do more real-world hacking; I'm starting to get a better feeling for XML.

Sorry for butting in the conversation again. :)

Clark Evans

From Tim.Shaw at wdr.com Mon Feb 22 09:39:42 1999
From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com)
Date: Mon Jun 7 17:09:18 2004
Subject: Streaming XML? [was Re: XML Information Set Requirements..]
In-Reply-To: <36CDE070.3EB1A411@locke.ccil.org>
Message-ID:

This was one of my initial concerns on the DOM - I too will be working on ticker-type streams.
I nearly jumped in early in this conversation to agree with the 'others' - ie the SAX approach is just as important as DOM and the 2 should both be of similar (standards) status. I held off because I'm new to XML, and I wasn't sure if my alternative wasn't too obvious to mention. I thought that each tick should be treated as a document. The client would have to arrange that each 'document' so received would have to be built, and then a document fragment extracted and inserted into a client-held document (updating display etc. via DOM events as appropriate). This approach requires the client of the XML data-source to be aware of this of course, which may be a killer for genericity. Am I way off base here - and if so, could someone point me to current thinking on this issue (I've browsed _a lot_ and seen little/nothing). Thanks tim ______________________________ Reply Separator _________________________________ Subject: Re: XML Information Set Requirements, W3C Note 18-February Author: cowan (cowan@locke.ccil.org) at unix,mime Date: 19/02/99 22:06 Nathan Kurz wrote: > And if the stream is continuous (for example, an XML > stock ticker) even the concept of a well-formed stream seems tenuous. It's not clear that XML supports infinitely long streams (where the end-tag of the document element is *never* reached). -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981 -02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From shecter at darmstadt.gmd.de Mon Feb 22 11:46:39 1999 From: shecter at darmstadt.gmd.de (Robb Shecter) Date: Mon Jun 7 17:09:18 2004 Subject: Stateful http xml middleware w/ servlets? Message-ID: <36D1438B.E6E79C0D@darmstadt.gmd.de> Hi, Has anyone done the above? Making a bus-architecture xml http-based middleware using servlets is "easy". But then you haven't got any state like you would with Corba, RMI, etc. But what about using the Session capabilities of the servlet libraries? Maybe with a proxy object on the client side that would represent the server & session to the client, and always include the session id when talking to the server? Information producers on the server side would be given session data with each invocation. Just thinking out loud here... - Robb xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From elharo at metalab.unc.edu Mon Feb 22 15:11:05 1999 From: elharo at metalab.unc.edu (Elliotte Rusty Harold) Date: Mon Jun 7 17:09:18 2004 Subject: Latin (was Re: Announcement: XML Infoset Requirements Published) In-Reply-To: <002b01be5dfb$558f9440$0300000a@othniel.cygnus.uwa.edu.au> Message-ID: At 8:36 AM +0800 2/22/99, James Tauber wrote: >Not wanting to get into too much of a discussion on natural language, but >the reason no one speaks latin is because it is now called French. > French is Latin spooken by a drunk Roman soldier. +-----------------------+------------------------+-------------------+ | Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer | +-----------------------+------------------------+-------------------+ | XML: Extensible Markup Language (IDG Books 1998) | | http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ | +----------------------------------+---------------------------------+ | Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ | | Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ | +----------------------------------+---------------------------------+ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Feb 22 15:33:42 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:19 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) References: Message-ID: <36D178B1.89F7EB3@locke.ccil.org> Mark Birbeck wrote: [example snipped] > No-one so far in the discussion has argued that this is good XML - I so argue. It is well-formed, though not valid, XML. Validity inherently can't be checked until you've processed everything. It might be interesting to define the subset of validity that can be checked on the fly, though. My first cut at it says that all VCs except the following can be checked given the full left context (in stream terms, all that has come before): [32] Standalone Document Declaration [56] IDREF [56] Entity Name (detectable at end of DTD) [58] Notation Attributes (detectable at end of DTD) Have I overlooked anything? > My contribution to the discussion - which I *did* give much thought - > was to try and argue that it is not very good programming practice > anyway, to open a stream for 8 hours. I agree with this. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Feb 22 17:05:09 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:19 2004 Subject: More XML Identity Angst In-Reply-To: <0003446aa9c421a3_mailit@mail.iname.com> References: <14032.26223.790080.443871@localhost.localdomain> <0003446aa9c421a3_mailit@mail.iname.com> Message-ID: <14033.14911.571452.116212@localhost.localdomain> Rob Schoening writes: > 1) Putting the XML spec at the bottom presupposes that the markup > is the essential characteristic of XML. Markup *is* the essential characteristic of XML -- XML is a markup standard that describes how to represent a hierarchical structure in a linear sequence of characters. XML is *not* a complete system design, a Golden Hammer, or an investor-appeal buzz-word. Markup is not (necessarily) the essential characteristic of an information system, a text repository, e-commerce, metadata-exchange, browsing, or anything else. XML can describe the part of the solution (if any) that *does* have to do with markup; there are other terms to describe other parts of the solution. > I would argue that the essential characteristic of XML--what gives > it so much potential--is the consistent data model that is shared > between the XML spec, DOM, Sax, RDF, etc. Yes, that model will be very exciting, it has enormous applicability, and I believe that most application designers will work from the abstractions in the model (often, as reflected through an interface) rather than directly with the markup, but still, even if an abstract model is better, that model is *not* XML; it is simply derivable from XML. 
[snip] > More than anything, we need a *name* for the system that is > comprised of all of the specs discussed on this list. Java is a > good example. The java language spec, the virtual machine spec, > and the APIs, and the various implementations all fall under the > banner "Java". Fortunately, we won't make this mistake (I hope). XML can serialise any type of information (some much more efficiently than others), and most other information standards can use XML as a serialisation format. > This subtle organization has been instrumental in its success. > Microsoft tried to claim that Java was just the java language spec, > but Sun has arranged things so that that it is clear to the > objective observer that this is not the case. More importantly, > the Java banner has ensured that Java ISVs have gone in a > consistent direction. You might not agree with what Sun is doing, > but they're doing a hell of good job doing it. > > XML is not heading down the same path. That path was probably right for Java, being a corporate initiative and all, though it has made life hard for users. For example, why do I have to wait for the whole Java 1.2 port to Linux before I can use (most of) the Java 1.2 class libraries? Does the virtual machine change that much with every library upgrade? You'd think that they'd at least be able to upgrade parts of the system separately. That path doesn't make sense for XML. If XML is ultimately successful (and at this point, I'd bet a lot on its success), it will be everywhere but will be nearly invisible, like IP. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Feb 22 17:43:48 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:19 2004 Subject: Submitting comments on the XML Infoset Requirements Message-ID: <14033.38405.980983.88542@localhost.localdomain> I'd like to thank everyone who has shown interest in the XML Information Set WG's requirements document [1]. If any of you would like your comments to be considered by the working group, you will need to do two things: 1. make certain that the comments are directly applicable to the Infoset RD; and 2. mail the comments to www-xml-infoset-comments@w3.org (please don't cross-post). Thanks, and all the best, David [1] http://www.w3.org/TR/NOTE-xml-infoset-req -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Feb 22 17:51:11 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:19 2004 Subject: SAX and Standards Bodies (was Streaming XML?) 
In-Reply-To: References: <36CDE070.3EB1A411@locke.ccil.org> Message-ID: <14033.38676.860585.23146@localhost.localdomain> Tim.Shaw@wdr.com writes: > I nearly jumped in early in this conversation to agree with the > 'others' - ie the SAX approach is just as important as DOM and the > 2 should both be of similar (standards) status. Thank you very much for the suggestion (and for the implicit vote of support), but I'm afraid that handing SAX over to a standards body would be its doom. Right now, SAX is an open spec designed in public view on the XML-Dev mailing list -- we all get to see each-others' unpolished, stupid ideas as soon as they emerge, and we knock them back and forth until they look like something. The alternative would be to hand SAX over to a standards body that might or might not set up a working group, that might or might not in a year or two put out a working draft, that almost certainly wouldn't look anything like the SAX we all know today. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From richard at cogsci.ed.ac.uk Mon Feb 22 18:02:17 1999 From: richard at cogsci.ed.ac.uk (Richard Tobin) Date: Mon Jun 7 17:09:19 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: John Cowan's message of Mon, 22 Feb 1999 10:33:05 -0500 Message-ID: <199902221700.RAA28820@stevenson.cogsci.ed.ac.uk> > My first cut at it says that all VCs except the following can be > checked given the full left context (in stream terms, all that > has come before): > > [32] Standalone Document Declaration > [56] IDREF > [56] Entity Name (detectable at end of DTD) > [58] Notation Attributes (detectable at end of DTD) This is also not detectable until the end of the DTD: [76] Notation Declared -- Richard xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Mon Feb 22 18:23:56 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:19 2004 Subject: layering specs Message-ID: <002201be5e90$66772b20$1df96d8c@NT.JELLIFFE.COM.AU> >Never really tried to speak Latin, though my brother did for one fun summer >in Rome... 
Some people still talk in Latin (mainly to God), so it is rather premature to call it dead; though I certainly grant that if you consider religious expression less important than politics, TV and technology, then Latin is more than a bit whiffy.

I have read that Latin was still spoken as a second language last century in Iceland or Greenland; and I think it was the court language for the Austro-Hungarian empire even last century too: I cannot find out from here if this was just for legal and court documents, and whether it was spoken as well as read.

The Vatican released a new glossary of Latin expressions for technological terms recently, because they need to translate documents into Latin which may refer to modern artifacts. I wonder what the Latin for XML is? (ling** extensibil** annotat** ??)--any offers?

I am not advocating that W3C write future specifications in Latin to gain precision.

Rick Jelliffe
Non-latin Taipei

From roddey at us.ibm.com Mon Feb 22 18:26:58 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:09:19 2004
Subject: Confusion about conditional sections
Message-ID: <87256720.00651541.00@d53mta03h.boulder.ibm.com>

>I'd say you MUST do this.
>
>Consider this sort of structure, which I found in a version of
>the XML Docbook DTD:
>
> > ]]> > > ]]>
>
>That is, if you tried to parse the contents of the second conditional
>section and something like the first one hadn't been parsed, you would
>be in undeserved trouble.

That makes sense, yes. I guess that's argument enough that it's the only way to do it.
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Mon Feb 22 18:33:01 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:09:19 2004 Subject: layering specs In-Reply-To: <199902220010.TAA17594@hesketh.net> References: <0003446518ff95cb_mailit@mail.iname.com> <14032.9707.259533.856607@localhost.localdomain> <14032.9707.259533.856607@localhost.localdomain> Message-ID: <3.0.5.32.19990222092459.00b851c0@corp> At 07:13 PM 2/21/99 -0500, Simon St.Laurent wrote: > >If anyone thinks layering specs and layering processing is a good or bad >idea, please let me know. Be sure to read RFC 817, "Modularity and Efficiency in Protocol Implementation". The short, short version of that paper is that layering in specs helps clarity and that layering in implementation hurts speed. There are many copies around, here is a pretty one: http://www.andrew2.andrew.cmu.edu/rfc/rfc817.html wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From Michael.Kay at icl.com Mon Feb 22 18:46:54 1999
From: Michael.Kay at icl.com (Michael.Kay@icl.com)
Date: Mon Jun 7 17:09:19 2004
Subject: Validation Conditions (was Streaming XML (was ...))
Message-ID: <93CB64052F94D211BC5D0010A80013310EB327@wwmessd3.bra01.icl.co.uk>

> > My first cut at it says that all VCs except the following can be
> > checked given the full left context (in stream terms, all that
> > has come before):

Which reminds me that SAX still needs some kind of statement about when errors are detected and reported. If we ask a ModSAX parser to do validation, do we get a guarantee that a duplicate ID attribute will be reported when the first duplicate value is encountered, or is the parser allowed to wait until end of document?

Mike Kay

From jes at kuantech.com Mon Feb 22 19:00:50 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:09:19 2004
Subject: Well-formed vs. valid
Message-ID: <000401be5e95$4e879900$5118a8c0@kuantech1.quokka.com>

In spite of my recent editorial about markup vs. type systems, I agree with many of the "opposing" statements that have been made. In particular, I agree with the statement about XML 1.0 as a stable base.
The base XML spec is really the only stable part of the family at this point, and we should not in fact f*ck with it. One thing disturbs me, however. Much talk seems to be made about documents or document fragments being useful because they are well-formed. I don't want something well-formed, I want something "valid". Whether validity is determined by reference to a DTD or to a schema of some other kind, I need more than just the lowest-level syntactic conformance to the XML spec. I need to be able to determine that the XML in question conforms to the syntactic and semantic constraints imposed by my application. Furthermore, I don't want to have to rely on implicit knowledge contained within a proprietary parser in order to do so. Jeff ----------------------------------------------------------------- Kuantech, Inc. http://www.kuantech.com Jeffrey E. Sussna, Principal jes@kuantech.com Distributed Content Architectures for Dynamic Online Applications ----------------------------------------------------------------- xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Mon Feb 22 19:09:00 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:19 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: <199902221700.RAA28820@stevenson.cogsci.ed.ac.uk> References: <199902221700.RAA28820@stevenson.cogsci.ed.ac.uk> Message-ID: <14033.43611.829978.187099@localhost.localdomain> Richard Tobin writes: > > My first cut at it says that all VCs except the following can be > > checked given the full left context (in stream terms, all that > > has come before): > > > > [32] Standalone Document Declaration > > [56] IDREF > > [56] Entity Name (detectable at end of DTD) > > [58] Notation Attributes (detectable at end of DTD) > > This is also not detectable until the end of the DTD: > > [76] Notation Declared Yes, but since the DTD precedes the start of the document element, in practice this constraint would not affect streaming. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Feb 22 19:16:00 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:19 2004 Subject: Latin (was Re: Announcement: XML Infoset Requirements Published) References: Message-ID: <36D19143.34DE4EAE@locke.ccil.org> Elliotte Rusty Harold scripsit: > French is Latin spooken by a drunk Roman soldier. Added. For 181 more remarks like this, see http://www.ccil.org/~cowan/essential.html . -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Mon Feb 22 19:15:58 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:19 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) References: <199902221700.RAA28820@stevenson.cogsci.ed.ac.uk> Message-ID: <36D19341.FCBD349D@locke.ccil.org> It was pointed out to me privately that the VC [76] Notation Declared also cannot be detected till the end of the DTD. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. 
You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Mon Feb 22 19:19:25 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:20 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) Message-ID: John Cowan wrote: > Mark Birbeck wrote: > > The question is whether it is legitimate to open a stream > > of XML with some sort of element like: > > > > > > > > and then spend the rest of the day sending out things like: > > > > > > MSFT > > 1000 > > > > > > and then at the end of the day, sending: > > > > > > No-one so far in the discussion has argued that this is good XML - > > I so argue. It is well-formed, though not valid, XML. It's NOT well-formed until the end of the day, when you receive the closing tag. Until that time 'stockPrices' is not a complete element, and therefore not a complete XML document. > Validity inherently can't be checked until you've processed everything. Nor can well-formedness. > It might be interesting to define the subset of validity that can > be checked on the fly, though. I think you're barking up the wrong tree here. This violates the whole basis of XML - and is completely unnecessary anyway. 
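The disagreement above about an open 'stockPrices' stream can be made concrete with a minimal sketch, assuming Python's standard xml.etree.ElementTree pull parser. The child element names (stock, symbol, price) and the helper feed_ticks are hypothetical, since the tags in the quoted example did not survive in this archive. The sketch shows only that complete child elements can be read before the root end-tag arrives, while well-formedness of the whole document is not known until the stream is closed.

```python
import xml.etree.ElementTree as ET


def feed_ticks(chunks):
    """Feed text chunks to a pull parser.

    Returns the (symbol, price) pairs seen so far and whether the
    stream as a whole turned out to be well-formed when closed.
    """
    parser = ET.XMLPullParser(events=("end",))
    ticks = []
    for chunk in chunks:
        parser.feed(chunk)
        # Completed elements are available before the root is closed.
        for _event, elem in parser.read_events():
            if elem.tag == "stock":  # hypothetical element name
                ticks.append((elem.findtext("symbol"), elem.findtext("price")))
    try:
        parser.close()  # raises ParseError if the root element never ended
        well_formed = True
    except ET.ParseError:
        well_formed = False
    return ticks, well_formed
```

Called with the opening tag and one tick but no closing tag, the function still returns the tick, and reports the stream as not well-formed; adding the closing 'stockPrices' tag flips only the second value.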
If I receive an element that completely matches the DTD requirements for that element, and then my processor acts on it - say, by updating a stock control system - and then I receive another element of the same type, but the parent element has a DTD entry that says it can have only one node of this type, then my document is invalid for that DTD - even though the 'subset' is valid. What does your processor do now? Undo the stock update and roll back to the previously nested 'state of validity'? Abort the entire document undoing everything on the way? Cry? A consistent theme in this discussion forum is that people always want every major breakthrough that has been made by XML to be removed, under the pretence of coping with some 'special circumstances'. If you think about it, it is quite unique in the history of software engineering to have an agreed standard which allows us to check whether a document that we had no part in designing the layout for, is valid. In the past if I wanted to check the validity of a Word or Excel document, or Director shows, or Zip files, I'd need to know their internal - and proprietary - format in advance. And if Microsoft, or Macromedia, or whoever, changed them, I'd need to change my software. Yet with XML all you have to do is put in the head of your document what DTD you are using and I can validate it. Now everyone wants to throw that away and have open-ended, infinite documents that are made up of smaller subsets of validity! We could already do that before XML 1.0 - it's not difficult! > My first cut at it says that all VCs except the following can be > checked given the full left context (in stream terms, all that > has come before): > > [32] Standalone Document Declaration > [56] IDREF > [56] Entity Name (detectable at end of DTD) > [58] Notation Attributes (detectable at end of DTD) > > Have I overlooked anything? Er, well, yeah ... [1]!! Regards, Mark Mark Birbeck Managing Director Intra Extra Digital Ltd. 
39 Whitfield Street London W1P 5RE w: http://www.iedigital.net/ t: 0171 681 4135 e: Mark.Birbeck@iedigital.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Mon Feb 22 19:26:35 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:20 2004 Subject: Well-formed vs. valid Message-ID: <3.0.32.19990222111848.00bda3a0@pop.intergate.bc.ca> At 10:58 AM 2/22/99 -0800, Jeffrey E. Sussna wrote: >One thing disturbs me, however. Much talk seems to be made about documents >or document fragments being useful because they are well-formed. I don't >want something well-formed, I want something "valid". Whether validity is >determined by reference to a DTD or to a schema of some other kind, I need >more than just the lowest-level syntactic conformance to the XML spec. I >need to be able to determine that the XML in question conforms to the >syntactic and semantic constraints imposed by my application. I've never seen an application so simple that its syntactic/semantic constraints could be expressed in a schema, DTD or any other flavor. That's why every commercial DBMS-based app has zillions of lines of data validation code that have to be run before you actually use incoming data. Having said that, I think that validation is a good thing and essential in lots of applications, and will become a better thing once we have a more modern schema facility. >Furthermore, I don't want to have to rely on implicit knowledge contained >within a proprietary parser in order to do so. In my experience, you *always* have to write some application-specific validation code. 
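A minimal sketch of what such application-specific validation code might look like, assuming Python's standard xml.etree.ElementTree. The order/item vocabulary, the sku and quantity attributes, and the function check_order are all hypothetical and chosen only for illustration. The rule checked here (quantity must be a positive integer) is of the kind a DTD cannot express.

```python
import xml.etree.ElementTree as ET


def check_order(xml_text):
    """Return a list of problems; an empty list means the order passes."""
    try:
        order = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Well-formedness is only the first hurdle.
        return [f"not well-formed: {err}"]
    problems = []
    for item in order.iter("item"):  # hypothetical element name
        qty = item.get("quantity", "")
        try:
            ok = int(qty) > 0
        except ValueError:
            ok = False
        if not ok:
            problems.append(f"item {item.get('sku')!r}: bad quantity {qty!r}")
    return problems
```

A well-formed document can therefore still be rejected: an item with quantity="-3" parses without error but fails the application rule.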
-Tim

From begeddov at jfinity.com Mon Feb 22 19:33:06 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun 7 17:09:20 2004
Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. ))
References: <000601be575c$82864d40$c9a8a8c0@thing2>
Message-ID: <36D1AC37.4E5964B4@jfinity.com>

I was thinking about SAX2 and Bill Megginsons' message about the to and fro that makes open development preferable to "standards" development. I think it is very valid and so I decided to throw out my unpolished ideas :-)

I have been struggling with control flow integration vis-a-vis XML parsers. I came up with some thoughts and looked back over the archives and saw that Bill La Forge had voiced similar ideas 10 days ago. Here are the relevant excerpts:

Bill la Forge wrote:
> For SAX2, it would be great to pass objects representing SAX events instead
> of method calls. The overhead might not be any greater, as the parser could
> just have one of each kind of event and reuse them.
>
> Backward compatibility could be achieved through the use of a conversion
> filter, allowing existing SAX applications to work with new parsers.
>
> There are two big advantages here in terms of extensibility:
> 1. It would be easy to extend the interfaces for various SAX event objects,
> passing additional data without creating problems for an application which
> is not expecting it.
> 2. Additional events could be passed which the application could ignore.

Bill voices several advantages for events.
If I can paraphrase, the general advantage is that once you "reify" the information about the event (that is currently the parameters of the DocumentHandler callback), you can take advantage of all the usual OO capabilities.

I agree with this but I see a different advantage to making the events first class objects rather than implicit in the callback parameter list. That advantage is decoupling control flow. If someone wants to use the default control flow policy (as currently implemented) they can immediately dispatch the event using the parser's thread of control. If they want a stream based application processor that has its own thread of control, they can push the events onto an event queue.

Right now, if I want to have two threads of control, one for the parser, and one for the subsystem that talks to the parser, my path of least resistance is to have the parser write a tree and then process it. I'd like to have two threads and not wait for the whole tree to be processed. Obviously this is not a big deal but it becomes trivial if SAX generates events directly.

As far as the question of whether to layer a callback interface on events or events on a callback interface, the OO arguments that Bill makes argue for the former.

Gabe Beged-Dov
www.jfinity.com

From jes at kuantech.com Mon Feb 22 19:45:16 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun 7 17:09:20 2004
Subject: Well-formed vs. valid
In-Reply-To: <3.0.32.19990222111848.00bda3a0@pop.intergate.bc.ca>
Message-ID: <000601be5e9b$80b15640$5118a8c0@kuantech1.quokka.com>

Tim, You are correct.
But strictly typed languages were invented for a reason, even though they can't prevent bugs. Well-formedness just seems too lcd.

Jeff

-----Original Message-----
From: Tim Bray [mailto:tbray@textuality.com]
Sent: Monday, February 22, 1999 11:21 AM
To: Jeffrey E. Sussna; 'XML-DEV'
Subject: Re: Well-formed vs. valid

At 10:58 AM 2/22/99 -0800, Jeffrey E. Sussna wrote:
>One thing disturbs me, however. Much talk seems to be made about documents
>or document fragments being useful because they are well-formed. I don't
>want something well-formed, I want something "valid". Whether validity is
>determined by reference to a DTD or to a schema of some other kind, I need
>more than just the lowest-level syntactic conformance to the XML spec. I
>need to be able to determine that the XML in question conforms to the
>syntactic and semantic constraints imposed by my application.

I've never seen an application so simple that its syntactic/semantic constraints could be expressed in a schema, DTD or any other flavor. That's why every commercial DBMS-based app has zillions of lines of data validation code that have to be run before you actually use incoming data.

Having said that, I think that validation is a good thing and essential in lots of applications, and will become a better thing once we have a more modern schema facility.

>Furthermore, I don't want to have to rely on implicit knowledge contained
>within a proprietary parser in order to do so.

In my experience, you *always* have to write some application-specific validation code.

-Tim

From paul at qub.com Mon Feb 22 19:48:42 1999
From: paul at qub.com (Paul at Sunnyvale)
Date: Mon Jun 7 17:09:20 2004
Subject: Announce: DTDGenerator frontend.
Message-ID: <002701be5e9c$8d46f760$1bd3d6cf@g0f2n0>

DTDGenerator Frontend allows you to upload some XML file from your computer to the server and get a DTD to which the document would conform. DTDGenerator is written by Michael Kay of ICL.

DTDGenerator Frontend runs at: http://www.pault.com/Xmltube/dtdgen.html

Rgds. Paul.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990222/5658b9c3/attachment.htm

From david at megginson.com Mon Feb 22 20:18:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:20 2004
Subject: Validation Conditions (was Streaming XML (was ...))
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB327@wwmessd3.bra01.icl.co.uk>
References: <93CB64052F94D211BC5D0010A80013310EB327@wwmessd3.bra01.icl.co.uk>
Message-ID: <14033.47885.119167.959662@localhost.localdomain>

Michael.Kay@icl.com writes:
> > > My first cut at it says that all VCs except the following can be
> > > checked given the full left context (in stream terms, all that
> > > has come before):
>
> Which reminds me that SAX still needs some kind of statement about when
> errors are detected and reported. If we ask a ModSAX parser to do
> validation, do we get a guarantee that a duplicate ID attribute will be
> reported when the first duplicate value is encountered, or is the parser
> allowed to wait until end of document?
I don't think that I'd want to constrain implementations that tightly -- what does everyone else think?

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From simonstl at simonstl.com Mon Feb 22 20:36:17 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun 7 17:09:20 2004
Subject: SAX and Standards Bodies (was Streaming XML?)
In-Reply-To: <14033.38676.860585.23146@localhost.localdomain>
References: <36CDE070.3EB1A411@locke.ccil.org>
Message-ID: <199902221808.NAA31774@hesketh.net>

At 12:48 PM 2/22/99 -0500, David Megginson wrote:
>Thank you very much for the suggestion (and for the implicit vote of
>support), but I'm afraid that handing SAX over to a standards body
>would be its doom.
>
>Right now, SAX is an open spec designed in public view on the XML-Dev
>mailing list -- we all get to see each-others' unpolished, stupid
>ideas as soon as they emerge, and we knock them back and forth until
>they look like something.
>
>The alternative would be to hand SAX over to a standards body that
>might or might not set up a working group, that might or might not in
>a year or two put out a working draft, that almost certainly wouldn't
>look anything like the SAX we all know today.

I agree completely with David - SAX's strength comes from its openness rather than the imprimatur of a group of vendors. Given the (still) early phase of XML development, keeping that process open to as many potential implementors and users (not just commercial vendors) is critical to the success of SAX as a common interface.
In this case, the W3C's neglect of a spec is, I think, benign. It also sets an encouraging standard of its own as an open-process spec (not just open-source software) that's doing quite well.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

From b.laforge at jxml.com Mon Feb 22 20:38:05 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:20 2004
Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. ))
Message-ID: <001301be5ea2$74d3c0e0$c9a8a8c0@thing2>

From: Gabe Beged-Dov
>I was thinking about SAX2 and Bill Megginsons' message about the to and fro that makes open
>development preferable to "standards" development. I think it is very valid and so I decided
>to throw out my unpolished ideas :-)

I'll take that as a compliment. But it's David Megginson and Bill la Forge.

>I agree with this but I see a different advantage to making the events first class objects
>rather than implicit in the callback parameter list. That advantage is decoupling control
>flow. If someone wants to use the default control flow policy (as currently implemented)
>they can immediately dispatch the event using the parser's thread of control. If they want a
>stream based application processor that has its own thread of control, they can push the
>events onto an event queue.

Overhead is an issue. Event objects really do simplify a lot of things, especially filters. Interfaces are faster.
Worse, if the parser pulls the same tricks with Event objects as are currently done with AttributeList (i.e. reusing the same object over and over), you must then clone the event before adding it to the queue.

There are lots of things we could do if we had event objects, especially with control flow. (And there's a lot of mess in MDSAX because we do not use event objects!) But parser speed is the key feature. For now.

Though if we go with Simon's layered architecture, we might actually get a speed gain. But there's no question that the code would be a whole lot smaller and easier to understand. And that may be justification enough.

I would not consider this for MDSAX. But I would still argue in favor of it for SAX2. But not without more support from the XML community.

Bill

From rschoening at unforgettable.com Mon Feb 22 20:55:42 1999
From: rschoening at unforgettable.com (Rob Schoening)
Date: Mon Jun 7 17:09:20 2004
Subject: SAX and Standards Bodies (was Streaming XML?)
In-Reply-To: <199902221808.NAA31774@hesketh.net>
References: <199902221808.NAA31774@hesketh.net>
Message-ID: <0003447a4a7fcbb1_mailit@mail.iname.com>

>It also sets an encouraging standard of its own as an open-process spec
>(not just open-source software) that's doing quite well.

Well said.

Rob

From Mark.Birbeck at iedigital.net Mon Feb 22 21:52:56 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:09:20 2004
Subject: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID:

James Tauber wrote:
> Mark Birbeck wrote:
> > Of course, if all of your documents (logical) are stored as text files
> > (physical), or to put it another way, if there is a one-to-one mapping
> > between your physical and logical XML documents, then none of this is
> > of any use to you;
>
> XML documents = text files. What you are calling "logical XML documents"
> aren't "XML documents" in the sense of the XML 1.0 REC. I'm not arguing
> about the value of what you are talking about doing. I think it's the way
> to go. I am just trying to be careful with the terminology.

I apologise for this James. I was using the terms in a loose sense, and I had no idea that they were actually used in XML 1.0. I've gone back to the spec and I see exactly what you mean. Sorry ...

So, I'll try to re-state my case, this time without mixing up the terms:

XML 1.0 refers to a 'data object' that can be an 'XML document', providing it meets certain criteria (namely having prolog, element and misc.) There is no discussion of the means by which this 'data object' is conveyed to a parser, other than it must meet the criteria for a 'physical' XML document.

This is relevant to the question of serialisation in two ways. First, there is no reason why a stream cannot contain many of these 'data-objects-that-conform-to-the-criteria-of-an-XML-document'.
Secondly, those who want 'infinite documents' are challenging XML 1.0 pretty much at its opening paragraph! To be an XML document, your 'data object' - in whatever form, whether text file or stream - must be 'well-formed'. So, by definition you cannot have an XML document with no closing tag. [I'm not accusing you of these things James, just thought I'd slip it in to subliminally back up my other thread on streaming.] It's also relevant to document fragments. In previous posts, I was trying to say that as far as a parser is concerned, whether it receives a complete XML document by retrieving a file from a disk, a page from a web server, or four nodes from an object database is neither here nor there. As far as it is concerned, it has an 'XML document'. I called this a 'logical' document because I wanted to indicate that it may not actually exist in any physical form, but it is a 'data-object-that-conforms' item, and that if we can process an 'XML document' we can process one node, many nodes or the whole tree. You don't then need to devise another system to process well-formed 'uberdocuments', and yet another to process well-formed 'document fragments' or 'microdocuments' or whatever. However, unfortunately for me ;-), XML 1.0 uses these terms quite specifically, and what I have called the 'logical' representation of an XML document, 1.0 has as a 'physical' document, that is a sequence of elements, attributes, comments, etc. So, in the XML 1.0 terminology my database full of objects actually contains 'logical' documents, in the sense that they are XML documents in some abstract way - they have a hierarchy, attributes, and so on. When the database is queried I create 'physical' manifestations of those 'logical' documents (but not in the physical sense of a text file, rather in the sense of a sequence of characters that can be parsed) which can themselves be made up of many 'logical' documents. 
My point was to argue that we do not need notions of uberdocuments, or document fragments to solve problems of streaming or storing documents in databases - the terms already exist in XML to understand this. For example, I saw a site the other day that used the term Microdocuments to explain a database product. Initially I thought that was quite a useful way of looking at it, but on re-examining 1.0 you realise that what they - and we - have is simply a 'logical-to-physical' XML document server, which you would be hard pushed to say was not anticipated in the spec. (Of course I recognise the need to put this in terms others will understand, and hence I'm quite jealous they thought of 'Microdocument' first! But that's marketing.)

This is therefore why I said:

> >I pointed out that all this fits with the XML 1.0 notion of a logical
> >document, in order to stress that we don't need some other terms
> >inventing to cope with these concepts.

Of course, now I realise that to be precise I should have said 'physical' instead of logical, in that it's 'a sequence of characters in some form or other, that a parser could interpret as a complete XML document'.

Anyway, thanks for being pedantic, because I now see how I have confused the issue - apologies to anyone who read the postings! But despite my confusing use of the words, I still think my main point is right! - we don't need extra terminology.

Best regards,

Mark

PS Got your other mail. Will get in touch next week - got to get the site live this week.

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From arabbit at earthlink.net Tue Feb 23 00:38:36 1999
From: arabbit at earthlink.net (Paul Butkiewicz)
Date: Mon Jun 7 17:09:20 2004
Subject: Streaming XML? [was Re: XML Information Set Requirements..]
In-Reply-To:
Message-ID: <000601be5ec4$f782d4a0$9636bfa8@arabbit>

Hi Tim. I did some work with this a few months ago and took the "each tick is a document" approach. Unfortunately, I haven't really revisited it much since. If you're interested, check out http://home.earthlink.net/~arabbit/xmlnet

I'm not trying to sell anything (there was a commercial idea with some friends originally, but, eh. . .), and you can have the source code if you want. Just curious, if you wanted someone to bounce ideas off of, etc.

Paul Butkiewicz
arabbit@earthlink.net

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Tim.Shaw@wdr.com
Sent: Monday, February 22, 1999 4:39 AM
To: cowan@locke.ccil.org
Cc: xml-dev@ic.ac.uk
Subject: Streaming XML? [was Re: XML Information Set Requirements..]

This was one of my initial concerns on the DOM - I too will be working on ticker-type streams. I nearly jumped in early in this conversation to agree with the 'others' - ie the SAX approach is just as important as DOM and the 2 should both be of similar (standards) status. I held off because I'm new to XML, and I wasn't sure if my alternative wasn't too obvious to mention. I thought that each tick should be treated as a document.
The client would have to arrange that each 'document' so received would have to be built, and then a document fragment extracted and inserted into a client-held document (updating display etc. via DOM events as appropriate). This approach requires the client of the XML data-source to be aware of this of course, which may be a killer for genericity.

Am I way off base here - and if so, could someone point me to current thinking on this issue (I've browsed _a lot_ and seen little/nothing).

Thanks

tim

______________________________ Reply Separator _________________________________
Subject: Re: XML Information Set Requirements, W3C Note 18-February
Author: cowan (cowan@locke.ccil.org) at unix,mime
Date: 19/02/99 22:06

Nathan Kurz wrote:
> And if the stream is continuous (for example, an XML
> stock ticker) even the concept of a well-formed stream seems tenuous.

It's not clear that XML supports infinitely long streams (where the end-tag of the document element is *never* reached).

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cbullard at hiwaay.net Tue Feb 23 02:05:26 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:20 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
References:
Message-ID: <36D20C34.3269@hiwaay.net>

Mark Birbeck wrote:
> A consistent theme in this discussion forum is that people always want
> every major breakthrough that has been made by XML to be removed, under
> the pretence of coping with some 'special circumstances'. If you think
> about it, it is quite unique in the history of software engineering to
> have an agreed standard which allows us to check whether a document that
> we had no part in designing the layout for, is valid.

Not to be tendentious, but some of us have been doing that for at least a generation now. SGML works. The concept of well-formed documents was introduced with XML. AFAIK, no one has proposed removing the use of DTDs or alternative schemas.

If there is a significant breakthrough, it has been the introduction of two API/interface standards, one formal, DOM, and one grassroots, SAX, both of which are there to solve different but related problems. The interface standards for markup are unique.
Some of us would have killed for those about five years ago.

len

From jtauber at jtauber.com Tue Feb 23 02:22:50 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:20 2004
Subject: Streams, protocols, documents and fragments was: RE: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID: <00c101be5ed3$723f1600$0300000a@othniel.cygnus.uwa.edu.au>

Jonathan Borden:
> I think this whole discussion is getting muddled because terminology of
>different domains is being interchanged.

Perhaps. I am quite happy for people to use the term "document" or "logical document" or "physical document" in a general sense. It is the term "XML document" I am trying to be careful about. I'm not talking about streaming, or any particular application, just trying to be clear on what is and what isn't an "XML document". My contention is merely that:

>> I don't believe you have an "XML document" until you serialise as well-formed
>> XML text.

Consider:

    class Element {
        String gi;       // generic identifier, i.e. the element type name
        String content;

        Element(String gi, String content) {
            this.gi = gi;
            this.content = content;
        }

        // Serialise as a start-tag, the content, and the matching end-tag.
        String toXML() {
            return "<" + this.gi + ">" + this.content + "</" + this.gi + ">";
        }
    }

Now if I say

    Element document = new Element("greetings","Hello!");

Then 'document' is *not* an XML document. It is a Java object. If I say

    Element el = new Element("greetings","Hello!");
    String document = el.toXML();

Then 'document' *is* (potentially) an XML document (and also a Java object).

To put it crudely, it's gotta have the angled brackets to be XML. That's my point.
It's one I am quite willing to be corrected on, if someone can show that the XML 1.0 REC allows for an XML document that doesn't begin with the text "<" optionally preceded by whitespace.

    [greetings {Hello!}]

is a valid representation of the logical structure of an XML document. It isn't an XML document.

>document is defined as in the XML spec. documents are well
>formed. when a document fragment is isolated from its parent document, it
>becomes a standalone document.

In this thread I have tended to qualify "document", i.e. "XML document" to (try to) avoid confusion with any discussion of documents in general. You don't define document fragment anywhere. :-)

[...]

>So, the problem here is not one with XML, rather the protocol used to
>transmit documents, HTTP and SMTP send one MIME message per PDU,
>streaming protocols can be defined which transmit multiple documents.

I agree. This is not what I am arguing about.

James

From jtauber at jtauber.com Tue Feb 23 02:23:16 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:20 2004
Subject: localised declarations (Re: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID: <00c201be5ed3$7374d6e0$0300000a@othniel.cygnus.uwa.edu.au>

>Why not have "from here on use these declarations" as the default behavior
>and then introduce 'push' and 'pop' constructs? 'push' would save current
>declaration settings and the 'pop' would just restore to the saved settings.

Could do. If this were the case, I wouldn't use attributes as they would confuse the scope issue.
Better to use PIs or empty elements in some namespace.

James

From jtauber at jtauber.com Tue Feb 23 02:23:19 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun 7 17:09:21 2004
Subject: Streams, protocols, documents and fragments was: RE: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID: <00c301be5ed3$7541a7a0$0300000a@othniel.cygnus.uwa.edu.au>

Mark Birbeck:
>The term 'document fragment' is not in XML 1.0, and my point was that we
>don't need new terminology - uberdocuments, document fragments, and so
>on - to understand these concepts. All you have said is that an XML
>document can have a prolog ... or not. If you give me a well-formed 'XML
>document' I have no way of knowing where that came from. It could be a
>standalone text file, or it could be a node from a larger XML document,
>but where it came from isn't going to help it; it will stand or fall on
>its own merit - i.e., is it well-formed? So why confuse things with all
>these different notions?

Jonathan argued by assertion that document fragments are well-formed documents. What if they are not?

If the original big document (no I'm not introducing a new term :-)) is in a character encoding other than UTF-8 or UTF-16, the document fragment achieved by plucking out a particular element is not a legal XML document. It would need to have a prolog to specify the encoding or some other method to declare the same.
Furthermore, if the element being plucked out contained entity references, you would either need, again, to have a prolog to declare the entities, or include the replacement text of the entities, or use some other method to declare the entities. Mark, what are your views on the W3C's activity on XML Fragments? Do I infer correctly that you disagree with the need? James xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Tue Feb 23 02:29:42 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:09:21 2004 Subject: More XML Identity Angst Message-ID: <00e201be5ed4$5842c0c0$0300000a@othniel.cygnus.uwa.edu.au> It's no surprise to find that David Megginson has expressed what I have been trying to say far more clearly and politely that I was able. To quote: "XML is a markup standard that describes how to represent a hierarchical structure in a linear sequence of characters." "I believe that most application designers will work from the abstractions in the model (often, as reflected through an interface) rather than directly with the markup, but still, even if an abstract model is better, that model is *not* XML; it is simply derivable from XML." Thank you David. :-) James xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From begeddov at jfinity.com Tue Feb 23 03:19:17 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:09:21 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. )) References: <001301be5ea2$74d3c0e0$c9a8a8c0@thing2> Message-ID: <36D21E69.B78311D5@jfinity.com> Bill la Forge wrote: > Overhead is an issue. Event objects really do simplify a lot of things, especially filters. > Interfaces are faster. SAX is described as an event-based API. IMO, it is a callback based API. Iam guessing that many will find this either a debateable distinction or one not worth dwelling on. I feel it is worth distinguishing. This is separate from the issue of whether it is worthwhile to either replace the callback API with an event API, or layer an event API on top of the callback API. > > Worse, if the parser pulls the same tricks with Event objects as are currently done with > AttributeList (i.e. reusing the same object over and over), you must then clone the > event before adding it to the queue. The issue of memory management and ownership rears its ugly head again :-(. This seems to argue for a eventgen filter for SAX. Given the memory management and efficiency issues, event queuing would need to be layered on top of, rather than instead of, the callback API. If you assumed that an event API replaced the callback API due to your extensibility argument, then I wonder if you couldn't provide a configuration parameter to the SAX driver on whether to clone or reuse the event objects. > There are lots of things we could do if we had event objects, especially with control flow. 
> (And there's a lot of mess in MDSAX because we do not use event objects!)
> But parser speed is the key feature. For now.

I can see that the speed of the parser subsystem argues for the current approach, especially since the existing base of parsers uses this model. It's not clear to me that the speed of a system which integrates a SAX-based parser is necessarily enhanced by the current model.

I have two issues with the current approach. One is the stated one with event vs. callback based API. The other is more related to parser architecture and single-threaded runtime environments. AFAIK, the current crop of parsers and SAX all assume that they are passed a thread of control and in turn pass this thread to the callbacks registered by the application. In single-threaded environments this means that the parser is the center of the universe until the document is completely processed. It would be nice if there was also a "fragment" sequence interface like that used by the HTML parser in Perl, i.e. each call to "parse" provides the next chunk of input forming the document. This is also useful in a multi-threaded runtime, since the application can control the chunking directly rather than indirectly thru thread synchronization mechanisms.

> Though if we go with Simon's layered architecture, we might actually get a speed gain.
> But there's no question that the code would be a whole lot smaller and easier to
> understand. And that may be justification enough.

I looked at MDSAX over the weekend, and it is certainly a powerful platform for SAX-based processing. On the other hand, it seemed that trying to fit everything into a single filter network with no lookahead capability (which would require queueing) and only cumbersome lookbehind capability (such as in the flatten example) is problematic. Given the constraints of making use of the existing infrastructure (SAX, XML) you have created a very flexible framework.
It's not clear if the constraints are the right ones for trying to support composition of processing like you envision.

Gabe Beged-Dov
www.jfinity.com

From jborden at mediaone.net Tue Feb 23 04:57:43 1999
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:09:21 2004
Subject: Streams, protocols, documents and fragments was: RE: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
In-Reply-To: <00c301be5ed3$7541a7a0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <000801be5ee8$48b3d040$d3228018@jabr.ne.mediaone.net>

James Tauber wrote:

> Jonathan argued by assertion that document fragments are well-formed documents.
> What if they are not?
>
> If the original big document (no I'm not introducing a new term :-)) is in a
> character encoding other than UTF-8 or UTF-16, the document fragment
> achieved by plucking out a particular element is not a legal XML document.
> It would need to have a prolog to specify the encoding or some other method
> to declare the same.

You are correct. Not all document fragments are well-formed documents. The point I am trying :-) to make is that the need to consider information streams as infinitely long 'documents' composed of 'document fragments' which contain units of information (e.g. the stock ticker) is just an artifact of protocols which demand that a stream contain a single document.
Assuming, then, that one could apply prologs to a stream which is composed of multiple document fragments, it might be possible to transmit the same information via a stream of document fragments as via a stream of documents. I don't currently see a need to propose an infinitely long XML document as the solution to a problem. (That is, I can solve this problem at the protocol level.)

Jonathan Borden
http://jabr.ne.mediaone.net

From jborden at mediaone.net Tue Feb 23 05:02:25 1999
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:09:21 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
In-Reply-To: <36D20C34.3269@hiwaay.net>
Message-ID: <000901be5ee8$f9c407b0$d3228018@jabr.ne.mediaone.net>

len bullard wrote:

> The concept of well-formed documents was introduced with XML. AFAIK, no
> one has proposed removing the use of DTDs or alternative schemas. If there
> is a significant breakthrough, it has been the introduction of two
> API/interface standards, one formal, DOM, and one grassroots, SAX, both
> of which are there to solve different but related problems. The interface
> standards for markup are unique. Some of us would have killed for
> those about five years ago.

Isn't that a grove? I'm saying this because when I look at the interfaces in Jade's groveoa, they look a lot like the DOM (not that James defines what a grove is, but by implication I assume that this is at least what he thinks :-)

Jonathan Borden
http://jabr.ne.mediaone.net

From jjc at jclark.com Tue Feb 23 05:36:12 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 17:09:21 2004
Subject: ModSAX (SAX 1.1) Proposal
References: <002501be5d74$d3ea26c0$2ee044c6@arcot-main>
Message-ID: <36D0DEAD.F5E8B019@jclark.com>

Don Park wrote:

> >parser.setHandler("org.xml.sax.namespace", nsHandler);
> >
> >Now the handler org.xml.sax.namespace will need to be of some specific
> >type, org.xml.sax.NamespaceHandler, say. What needs to be checked is
> >that nsHandler is of type org.xml.sax.NamespaceHandler. Using
> >ModHandler doesn't do that.
>
> It does if org.xml.sax.NamespaceHandler implements ModHandler.
>
> public class NamespaceHandler implements ModHandler {}
>
> ModParser parser;
> try
> {
>     parser.setHandler("org.xml.sax.namespace", new NamespaceHandler());
> }
> catch (Exception ex)
> {}

My point is it doesn't stop you doing:

    public class PingHandler implements ModHandler { }

    ModParser parser;

    parser.setHandler("org.xml.sax.namespace", new PingHandler());

Declaring the second argument to be of type ModHandler doesn't ensure that it is of the correct type. It's not type-safe.

James
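[Editorial illustration, not part of the original thread.] The type-safety point above can be sketched in a self-contained form. This is only a minimal sketch under stated assumptions: `ModHandler`, `ModParser`, `NamespaceHandler` and `PingHandler` are stand-ins written for this example (ModSAX was a proposal, not a shipped API), and the `startNamespace` method and the `accepts` helper are invented purely for illustration. It shows that the compiler accepts any `ModHandler` for any handler ID, so a mismatch only surfaces when the driver casts at run time.

```java
public class ModHandlerTypeSafetyDemo {

    /** Empty marker interface, modelled on the ModHandler idea in the thread. */
    interface ModHandler { }

    /** Hypothetical handler type expected for the ID "org.xml.sax.namespace". */
    static class NamespaceHandler implements ModHandler {
        void startNamespace(String prefix, String uri) { /* illustration only */ }
    }

    /** Unrelated handler that nevertheless satisfies the marker interface. */
    static class PingHandler implements ModHandler { }

    /** Hypothetical stand-in for a parser/driver with ID-based registration. */
    static class ModParser {
        private NamespaceHandler nsHandler;

        void setHandler(String handlerID, ModHandler handler) {
            if (handlerID.equals("org.xml.sax.namespace")) {
                // The only real check: a run-time cast, which throws
                // ClassCastException if the wrong handler type was supplied.
                nsHandler = (NamespaceHandler) handler;
            }
        }
    }

    /** Returns true if the parser accepted the handler, false if the cast failed. */
    static boolean accepts(ModHandler handler) {
        try {
            new ModParser().setHandler("org.xml.sax.namespace", handler);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both calls compile; only the first survives the run-time cast.
        System.out.println("NamespaceHandler accepted: " + accepts(new NamespaceHandler()));
        System.out.println("PingHandler accepted:      " + accepts(new PingHandler()));
    }
}
```

The marker interface rules out passing, say, a `String`, but nothing narrower than that; whether the residual run-time check is an acceptable price for modularity is exactly what the thread goes on to debate.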

From paul at prescod.net Tue Feb 23 05:52:34 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:09:21 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
References: <000901be5ee8$f9c407b0$d3228018@jabr.ne.mediaone.net>
Message-ID: <36D23B4F.CB3B05C9@prescod.net>

"Borden, Jonathan" wrote:

> > If there
> > is a significant breakthrough, it has been the introduction of two
> > API/interface standards, one formal, DOM, and one grassroots, SAX, both
> > of which are there to solve different but related problems. The interface
> > standards for markup are unique. Some of us would have killed for
> > those about five years ago.
>
> Isn't that a grove?

You can easily generate an API from a property set, but groves didn't exist five years ago.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"In general, as syntactic description becomes deeper, what appear to be semantic questions fall increasingly within its scope; and it is not entirely obvious whether or where one can draw a natural bound between grammar and 'logical grammar'." - Noam Chomsky, 1963

From donpark at quake.net Tue Feb 23 05:57:29 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun 7 17:09:21 2004
Subject: ModSAX (SAX 1.1) Proposal
Message-ID: <000c01be5ef1$2b8b47b0$2ee044c6@arcot-main>

>Declaring the second argument to be of type ModHandler doesn't ensure
>that it is of the correct type. It's not type-safe.

ModHandler obviously does not offer absolute type-safety. I still find it worth the trouble; I guess you do not.

Don Park
Docuverse

From chtino at hnc.co.kr Tue Feb 23 07:31:04 1999
From: chtino at hnc.co.kr (Chung, Byung Hee)
Date: Mon Jun 7 17:09:21 2004
Subject: XML implementation in MS-Word
Message-ID: <36D2589C.23278129@hnc.co.kr>

How about XML implementation in MS-Word?
Does MS-Word support Structured Editing?
If it does, I want to know the details.
If it does not, why did they give up the structured editing market?

From hfhuang at iii.org.tw Tue Feb 23 07:34:46 1999
From: hfhuang at iii.org.tw (Eric Huang)
Date: Mon Jun 7 17:09:21 2004
Subject: XML AND Semiconductor industry?
Message-ID: <36D25841.13AEEEDA@iii.org.tw>

Hello world,

I am very interested in XML. It seems to be a revolution for the Web and related domains. As far as I know, there is a Pinnacle Information Components eXchange (PCIS) in the IC industry. PCIS is the standard for IC component data exchange. RosettaNet also defines some DTDs for the PC industry.

For the early stages of IC manufacturing (IC design houses, fabs, packaging companies), is there any data exchange standard constructed in XML? Any information will be much appreciated. Thank you!

--
Eric Huang
Institute for Information Industry
11FL., 116, Nanking E. RD., SEC. 2, TAIPEI, TAIWAN R.O.C.
Phone: +886-2-2542-2540 Ext. 204
Fax: +886-2-2563-4209
E-mail: hfhuang@iii.org.tw

From Matthew.Sergeant at eml.ericsson.se Tue Feb 23 09:47:28 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun 7 17:09:21 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1635@eukbant101.ericsson.se>

> -----Original Message-----
> From: Mark Birbeck [SMTP:Mark.Birbeck@iedigital.net]
>
> John Cowan wrote:
> > Mark Birbeck wrote:
> > >
> > > MSFT
> > > 1000
> > >
> > > and then at the end of the day, sending:
> > >
> > > No-one so far in the discussion has argued that this is good XML -
> >
> > I so argue. It is well-formed, though not valid, XML.
>
> It's NOT well-formed until the end of the day, when you receive the
> closing tag. Until that time 'stockPrices' is not a complete element,
> and therefore not a complete XML document.
>
> > Validity inherently can't be checked until you've processed everything.
>
> Nor can well-formedness.

I don't want to argue this - just add a point.

It really depends whether you're interested in proving the positive or the negative. If you're trying to prove that it is well formed you're SOOL until the end. But at least you can prove that it's not-not well formed so far in the stream... It's a compromise I guess.

Still - I agree with your general point. What would be better would be a stream of XML documents - like network packets - fully self contained.

Matt.

From paul at prescod.net Tue Feb 23 14:32:44 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:09:21 2004
Subject: XTech '99 starts in two weeks
References: <199902212157.NAA03255@boethius.eng.sun.com>
Message-ID: <36D2AB9B.50B4BE01@prescod.net>

Jon Bosak wrote:

> Final reminder: XTech '99, the third annual West Coast XML conference,
> starts in two weeks at the San Jose Convention Center. This is the
> place to learn about the latest developments in XML and related
> technologies from the people and companies at the center of the XML
> revolution.

One of the features of that conference will be a tutorial on Python in XML. Although it is only scheduled for a half a day I intend to be available to registered attendees for the entire conference period. During the scheduled half-day we will map out the various neat things you can do with Python and XML. Over the other days I will meet with the attendees to help them explore specific options such as Python's event-driven APIs, tree-based APIs, non-XML parsing (for upconversions), formatting and so forth.

From my biased point of view, this is an excellent opportunity for anyone thinking about writing XML software.

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"In general, as syntactic description becomes deeper, what appear to be semantic questions fall increasingly within its scope; and it is not entirely obvious whether or where one can draw a natural bound between grammar and 'logical grammar'." - Noam Chomsky, 1963

From nwoh at software-ag.de Tue Feb 23 14:33:15 1999
From: nwoh at software-ag.de (Hutchison, Nigel)
Date: Mon Jun 7 17:09:21 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID: <005355AD0596D211B4F30000F81B0D324C9632@daemsg01.software-ag.de>

How about defining a streaming or cursor DTD?

Example instance: [markup not preserved in the archive]

> -----Original Message-----
> From: Matthew Sergeant (EML) [SMTP:Matthew.Sergeant@eml.ericsson.se]
> Sent: Tuesday, February 23, 1999 10:47 AM
> To: 'Mark Birbeck'; XML Dev
> Subject: RE: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
>
> > -----Original Message-----
> > From: Mark Birbeck [SMTP:Mark.Birbeck@iedigital.net]
> >
> > John Cowan wrote:
> > > Mark Birbeck wrote:
> > > >
> > > > MSFT
> > > > 1000
> > > >
> > > > and then at the end of the day, sending:
> > > >
> > > > No-one so far in the discussion has argued that this is good XML -
> > >
> > > I so argue. It is well-formed, though not valid, XML.
> >
> > It's NOT well-formed until the end of the day, when you receive the
> > closing tag. Until that time 'stockPrices' is not a complete element,
> > and therefore not a complete XML document.
> >
> > > Validity inherently can't be checked until you've processed everything.
> >
> > Nor can well-formedness.
>
> I don't want to argue this - just add a point.
>
> It really depends whether you're interested in proving the positive
> or the negative. If you're trying to prove that it is well formed you're
> SOOL until the end. But at least you can prove that it's not-not well formed
> so far in the stream... It's a compromise I guess.
>
> Still - I agree with your general point. What would be better would
> be a stream of XML documents - like network packets - fully self contained.
>
> Matt.

From tbray at textuality.com Tue Feb 23 16:36:58 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:21 2004
Subject: XML implementation in MS-Word
Message-ID: <3.0.32.19990223061809.00bcfa70@pop.intergate.bc.ca>

At 04:28 PM 2/23/99 +0900, Chung, Byung Hee wrote:
>How about XML implementation in MS-Word ?
>Does MS-Word support Structured Editing ?

No, no.

>If it is true, I want to know details
>If it is not, why did they give up structured editing market ?

They're not convinced it exists. Microsoft doesn't build products that they can't ship one million of.

-Tim

From gkholman at CraneSoftwrights.com Tue Feb 23 17:45:40 1999
From: gkholman at CraneSoftwrights.com (G. Ken Holman)
Date: Mon Jun 7 17:09:21 2004
Subject: ANNOUNCE: XML Conformance Public Information Page
Message-ID:

On behalf of the OASIS XML Conformance Technical Subcommittee, we are pleased to announce that our public information page is now available for review.

http://www.oasis-open.org/committees/xmlconf-pub.html

Please watch this space for announcements regarding the public availability of our results.

Mary Brady, NIST
G. Ken Holman, Crane Softwrights Ltd.

OASIS - The Organization for the Advancement of Structured Information Standards
http://www.oasis-open.org

From david at megginson.com Tue Feb 23 17:58:50 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A800022A1635@eukbant101.ericsson.se>
References: <5F052F2A01FBD11184F00008C7A4A800022A1635@eukbant101.ericsson.se>
Message-ID: <14034.52352.23099.736267@localhost.localdomain>

Matthew Sergeant (EML) writes:

> Still - I agree with your general point. What would be better would
> be a stream of XML documents - like network packets - fully self
> contained.
Such a stream could be brain-dead, bonehead simple to construct -- for example, formfeed (^L) is not allowed in XML documents, so you could have a UDP stream consisting of a bunch of self-identifying XML snippets separated by ^L -- to resync, the client just has to wait for the ^L and then start reading (my example uses the two characters "^L" rather than an actual formfeed to avoid screwing up mail readers):

^L
19990223T104900-0500 NASDAQ MSFT Microsoft +3.5
^L
19990223T104901-0500 NASDAQ AMZN Amazon +7
^L
19990223T104903-0500 NASDAQ YHOO Yahoo +7.25
^L

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From scot at ntronix.com Tue Feb 23 18:07:34 1999
From: scot at ntronix.com (Scot Wingo (nTronix))
Date: Mon Jun 7 17:09:22 2004
Subject: FYI: MS to ship IE5 on March 18th...
Message-ID: <001d01be5f46$4ff7ec70$c6a756d1@tauntaun>

Thought you guys would be interested in knowing this ->

http://www.techweb.com/wire/story/TWB19990222S0024
http://www.infoworld.com/cgi-bin/displayStory.pl?990222.eiie52.htm

Scot Wingo
nTronix

From Mark.Birbeck at iedigital.net Tue Feb 23 18:10:21 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID:

len bullard wrote:

> Mark Birbeck wrote:
> > A consistent theme in this discussion forum is that people always want
> > every major breakthrough that has been made by XML to be removed, under
> > the pretence of coping with some 'special circumstances'. If you think
> > about it, it is quite unique in the history of software engineering to
> > have an agreed standard which allows us to check whether a document that
> > we had no part in designing the layout for, is valid.
>
> Not to be tendentious, but some of us have been doing that for at
> least a generation now. SGML works.

Hello Len,

I was using my words carefully! SGML has of course been around for a while, and been very important for document management and generation. But in the world of software engineering we are constantly reinventing the wheel; coming up with new file formats, parsers, transmission mechanisms, and so on. So, in the world I'm in, XML *is* a unique and unprecedented contribution.

Regards,

Mark

From fred.eisele at eds.com Tue Feb 23 18:13:24 1999
From: fred.eisele at eds.com (Eisele, Fred)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID:

This is off the top of my head but...

Presume that the XML engine has the job of determining whether a particular stream is a theorem of a particular language. Ultimately, the stream is either a well-formed formula (wff) or it is not, and if it is a wff it may also be a theorem, i.e. valid. There is another category before the wff (the name escapes me). In the context of NFAs (regular expressions, which I realize XML is not), the automaton goes through many states, some of which indicate that the formula is not well-formed, but until that time the formula is tentatively accepted. It seems to me that this is the status of a stock ticker until the final end tag is sent.

In addition to this is the general situation that wffs are built up from other wffs. This is what seems to be going on with the Streaming XML.

That's all. Thanks
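[Editorial illustration, not part of the original thread.] The three-way distinction discussed above (already broken, tentatively accepted, complete) can be sketched with a stack of open element names. This is a deliberately simplified sketch, not an XML parser: it assumes tags are never split across chunks and recognises only `<name>`, `</name>` and `<name/>` with no attributes, comments, processing instructions or CDATA. The class name `PrefixChecker`, its `Status` values, and the element names other than `stockPrices` are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixChecker {

    /** REJECTED: cannot become well-formed; TENTATIVE: fine so far; COMPLETE: root closed. */
    enum Status { REJECTED, TENTATIVE, COMPLETE }

    // Simplification: start tags, end tags and empty tags without attributes.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z_][A-Za-z0-9_.-]*)(/?)>");

    private final Deque<String> open = new ArrayDeque<>();
    private boolean rootClosed = false;
    private boolean rejected = false;

    /** Feeds the next chunk of the stream and reports the status of everything seen so far. */
    Status feed(String chunk) {
        Matcher m = TAG.matcher(chunk);
        while (!rejected && m.find()) {
            boolean isEndTag = !m.group(1).isEmpty();
            boolean isEmptyTag = !m.group(3).isEmpty();
            String name = m.group(2);
            if (isEndTag && isEmptyTag) {
                rejected = true;                       // "</a/>" is never legal
            } else if (isEndTag) {
                if (open.isEmpty() || !open.pop().equals(name)) {
                    rejected = true;                   // unmatched or mismatched end tag
                } else if (open.isEmpty()) {
                    rootClosed = true;                 // the root element just closed
                }
            } else if (rootClosed) {
                rejected = true;                       // a second root element
            } else if (isEmptyTag) {
                if (open.isEmpty()) {
                    rootClosed = true;                 // "<root/>" is a whole document
                }
            } else {
                open.push(name);
            }
        }
        return status();
    }

    Status status() {
        if (rejected) return Status.REJECTED;
        return rootClosed ? Status.COMPLETE : Status.TENTATIVE;
    }

    public static void main(String[] args) {
        PrefixChecker ticker = new PrefixChecker();
        // During the day: the root element is still open.
        System.out.println(ticker.feed(
                "<stockPrices><stock><symbol>MSFT</symbol><price>1000</price></stock>"));
        // End of day: the closing tag finally arrives.
        System.out.println(ticker.feed("</stockPrices>"));
        // A stream that can be rejected long before it ends.
        System.out.println(new PrefixChecker().feed("<a><b></a>"));
    }
}
```

In these terms the all-day ticker sits in the tentative state until the final end tag arrives, which is the situation both sides of the thread are describing from different angles.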

From tbray at textuality.com Tue Feb 23 18:22:59 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML
Message-ID: <3.0.32.19990223102125.00b5e7d0@pop.intergate.bc.ca>

At 10:50 AM 2/23/99 -0500, David Megginson wrote:
>Matthew Sergeant (EML) writes:
>
> > Still - I agree with your general point. What would be better would
> > be a stream of XML documents - like network packets - fully self
> > contained.
>
>Such a stream could be brain-dead, bonehead simple to construct -- for
>example, formfeed (^L) is not allowed in XML documents, so you could
>have a UDP stream consisting of a bunch of self-identifying XML
>snippits separated by ^L -- to resync, the client just has to wait for
>the ^L and then start reading (my example uses the two characters "^L"
>rather than an actual formfeed to avoid screwing up mail readers):

True - and to be pedantic, ^C is probably a better choice, since its name is ETX, for End-of-text, and it was actually defined for this kind of use. It's in Unicode too, but, just like ^L, not in XML.

-Tim
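[Editorial illustration, not part of the original thread.] The separator-delimited stream proposed above can be sketched as a small framing reader. This is a sketch under stated assumptions rather than anyone's actual implementation: it reads from a `java.io.Reader` instead of a UDP socket, the class and method names (`FormFeedFraming`, `readRecords`, `split`) and the `quote` element are invented for the example, and it hands back each record as a string without parsing it. Swapping `'\f'` for `'\u0003'` would give the ETX variant.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class FormFeedFraming {

    /** ^L (form feed, U+000C): not a legal character in an XML 1.0 document. */
    static final char SEPARATOR = '\f';

    /**
     * Collects the records that sit between two separators. Input before the
     * first separator is discarded (the "wait for ^L" resync step), and a
     * trailing record with no closing separator is treated as incomplete.
     */
    static List<String> readRecords(Reader in) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean synced = false;
        int c;
        while ((c = in.read()) != -1) {
            if (c == SEPARATOR) {
                if (synced) {
                    String record = current.toString().trim();
                    if (!record.isEmpty()) {
                        records.add(record);
                    }
                }
                synced = true;          // from here on we are aligned on a boundary
                current.setLength(0);
            } else if (synced) {
                current.append((char) c);
            }
        }
        return records;
    }

    /** Convenience wrapper for in-memory input. */
    static List<String> split(String stream) {
        try {
            return readRecords(new StringReader(stream));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A client that joined mid-record: the leading fragment is skipped.
        String stream = "YHOO</quote>\f<quote>MSFT</quote>\f<quote>AMZN</quote>\f<quote>YH";
        for (String record : split(stream)) {
            System.out.println(record);
        }
    }
}
```

Each returned string could then be handed to an ordinary XML parser as a complete document, which keeps the framing concern at the protocol level and out of the XML itself.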

From oren at capella.co.il Tue Feb 23 18:37:10 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun 7 17:09:22 2004
Subject: Fw: Streaming XML
Message-ID: <00bc01be5f5a$d5f9b600$5402a8c0@oren.capella.co.il>

Tim Bray wrote:

>True - and to be pedantic, ^C is probably a better choice,
>since its name is ETX, for End-of-text, and it was actually defined
>for this kind of use. It's in Unicode too, but, just like ^L, not
>in XML. -Tim

^C is inconvenient because typing it by hand causes an interrupt on popular operating systems. We have a test harness program which takes a mixture of commands and XML as input, and which is often run manually. We use ^L to separate XML "documents" and find it very convenient. As a side benefit, printing input scripts starts each XML document on a separate page :-)

Share & Enjoy,

Oren Ben-Kiki

From ricko at allette.com.au Tue Feb 23 18:45:40 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML
Message-ID: <004e01be5f5c$fd7b9fc0$5cf96d8c@NT.JELLIFFE.COM.AU>

From: Tim Bray

>True - and to be pedantic, ^C is probably a better choice,
>since its name is ETX, for End-of-text, and it was actually defined
>for this kind of use. It's in Unicode too, but, just like ^L, not
>in XML. -Tim

Tim isn't being pedantic: the control characters are provided specifically to allow in-band comms signalling. However, implementation is often OS-dependent and ratty. Whatever you use, make sure it's not ^D, ^Q, ^S, or ^Z. I am not really sure that ^C would not also cause problems, because of its use as an abort character.

In particular, control signals are often associated with DCE/DTE point-to-point signalling, and not end-to-end. You might send a ^C, but how that propagates to the other end is anyone's business. So a vacant tty control character like ^L might be useful. I'd love to hear the results of anyone testing this with XML over PPP.

Rick Jelliffe

From Mark.Birbeck at iedigital.net Tue Feb 23 18:50:31 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Message-ID:

Hutchison, Nigel wrote:

> How about defining a streaming or cursor DTD

No problem Nigel. That's one way to do it, but now we're out of the realm of XML and into the realm of the application ... which I think is right.

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

From david at megginson.com Tue Feb 23 18:53:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:22 2004
Subject: ModSAX (SAX 1.1) Proposal
In-Reply-To: <36D0DEAD.F5E8B019@jclark.com>
References: <002501be5d74$d3ea26c0$2ee044c6@arcot-main> <36D0DEAD.F5E8B019@jclark.com>
Message-ID: <14034.51540.25406.93518@localhost.localdomain>

James Clark writes:

> My point is it doesn't stop you doing:
>
> public class PingHandler implements ModHandler { }
>
> ModParser parser;
>
> parser.setHandler("org.xml.sax.namespace", new PingHandler());
>
> Declaring the second argument to be of type ModHandler doesn't ensure
> that it is of the correct type. It's not type-safe.

Precisely, and this is really the critical design decision for the basic architecture of ModSAX. To put the problem succinctly, we are forced to trade off type safety and flexibility/modularity to some degree:

1. Using inheritance allows strict typing but breaks chain-of-responsibility, since every driver has to know about every interface, and breaks maintainability, since everything has to fall into a single inheritance tree.

2. Using runtime discovery (handlerID, etc.) breaks strict typing but allows chain-of-responsibility and greatly simplifies maintainability (we probably won't need another major SAX release when the schema group finishes datatyping, for example).
The arguments for both are good; my proposal is that we use an empty interface (ModHandler) to allow at least *some* type checking, but that we require the drivers to do some of the type-checking themselves (or simply try to cast and let the exceptions bubble up). This isn't too big a burden, since it falls on the drivers (few) rather than the applications (many).

For example, here's how a typical filter might deal with a handler (assuming a helper class ModFilterImpl):

  public class PingFilter extends ModFilterImpl
  {
    public void setHandler (String handlerID, ModHandler handler)
      throws SAXNotSupportedException
    {
      if (handlerID.equals("http://www.megginson.com/sax/handlers/ping")) {
        // Throws ClassCastException if handler is not a PingHandler.
        this.handler = (PingHandler)handler;
      } else {
        // Not ours: pass the request up the chain of responsibility.
        getParentParser().setHandler(handlerID, handler);
      }
    }
  }

The casting will throw a runtime exception if handler is not actually of type PingHandler; a clever driver could trap that and do something informative with it, but it is still not as good as catching the problem at compile time. On the other hand, people will be able to invent many new, interesting filters and drivers for SAX without waiting for me to change the core distribution.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/

From martind at netfolder.com Tue Feb 23 19:01:20 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:09:22 2004
Subject: ANNOUNCE: XML Conformance Public Information Page
In-Reply-To:
Message-ID:

Hi Ken,

[... announcement....]

This is good news.
In the page, it is said that the conformance report will be an XML document (I didn't expect less) and that it will be processed with DSSSL.

Does this document have a particular DTD, and is it a public one? Will it be possible to access the document from the OASIS site in its original format with its corresponding DSSSL stylesheet, so that an XML/DSSSL-aware browser could display it?

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From clark.evans at manhattanproject.com Tue Feb 23 19:20:23 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun 7 17:09:22 2004
Subject: FW: XML Mail
References: <00f301be5c35$20627ca0$0fb919ce@Bertha> <36CDBA46.B13FB101@locke.ccil.org>
Message-ID: <36D2FE88.12FEE8E4@manhattanproject.com>

Sorry to revive a dead thread, but I'm making progress on the subject, so I'd like your feedback.

A while back I wrote:

> The rewrite program would:
> 2) Leave valid XML/HTML alone if possible.

To which David Megginson wrote:

> Wrong -- or, to put it differently, it should leave content with
> text/html and text/xml alone, but it should not try to recognise
> markup in text/plain.

Then, Parand Tony Darugar and Jonathan Borden posted their current implementations of such a monster.

The debate then swung as to how to treat the "body" of the e-mail. Most people agreed that it was some sort of CDATA thing, where "]]>" in the text body would become "]]&gt;"

My question:

Isn't there a way to do smart stuff like #2 ?

For example, look at Didier's post below. It'd be slick to have the XMLMail program recognize his markup.
For my example, I'd like to "embed" bookkeeping information in my e-mail. If the e-mail leaves the 'organization', the bookkeeping is stripped. A wizened browser using style sheets may put the bookkeeping info off to the side, where an old e-mailer may show it in-line. This way I can do stuff like:

Then I can run all of my e-mail through my journal, which then hits my ledger, and accounts for my time appropriately. I'm actually very serious about this...

Best,

Clark

XML, it's not just for computers anymore.

-------- Original Message --------
Subject: RE: Streaming XSL
Date: Tue, 23 Feb 1999 12:08:08 -0500
From: "Didier PH Martin"
Reply-To: xsl-list@mulberrytech.com
To:

Hi Oren,

The use of a "stream" media to specify this seems like a kludge (though I appreciate how you came to use it in the current framework). It really should be in the element.

I agree 100%. Actually, if we look closely at the XSL specs or at any other XML processing stuff, we have no common way to associate an interpreter with an XML document. An XML document without an associated interpreter is nothing more than sleeping data in a serialized format.

What we should have: ...
From mrc at allette.com.au Tue Feb 23 21:46:10 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
References:
Message-ID: <36D3217C.CC518A0C@allette.com.au>

Mark Birbeck wrote:

> len bullard wrote:
> > Mark Birbeck wrote:
> >> If you think about it, it is quite unique in the
> >> history of software engineering to have an
> >> agreed standard which allows us to check
> >> whether a document that we had no part in
> >> designing the layout for, is valid.
> >
> > Not to be tendentious, but some of us having been doing that for at
> > least a generation now. SGML works.
>
> Hello Len, I was using my words carefully! SGML has of course been
> around for a while, and been very important for document management and
> generation. But in the world of software engineering we are constantly
> reinventing the wheel; coming up with new file formats, parsers,
> transmission mechanisms, and so on. So, in the world I'm in, XML *is* a
> unique and unprecedented contribution.

Why do I feel as though I've just been patted on the head?

--
Regards,

Marcus Carr email: mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia) www: http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
- Einstein
From srn at techno.com Tue Feb 23 22:15:21 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
In-Reply-To: <000901be5ee8$f9c407b0$d3228018@jabr.ne.mediaone.net> (jborden@mediaone.net)
References: <000901be5ee8$f9c407b0$d3228018@jabr.ne.mediaone.net>
Message-ID: <199902232043.OAA01619@bruno.techno.com>

[Len Bullard: ]

> > If there is a significant breakthrough, it has
> > been the introduction of two API/interface
> > standards, one formal, DOM, and one grassroots,
> > SAX, both of which are there to solve different
> > but related problems. The interface standards
> > for markup are unique. Some of us would have
> > killed for those about five years ago.

[Jonathan Borden:]

> Isn't that a grove? I'm saying this
> because when I look at the interfaces in Jade's
> groveoa, they look alot like the DOM (not that
> James defines what a grove is, but by
> implication I assume that this is at least what
> he thinks :-)

I certainly can't speak for James, but I would like to clear something up: ISO/IEC 10744:1997 ("HyTime") and, earlier, ISO/IEC 10179:1996 ("DSSSL") defined and coined the term "grove". A grove is the set of objects resulting from parsing an information resource in some specific notation. EVERY GROVE ALWAYS MUST CONFORM TO A FORMAL MODEL called a "property set".

The "SGML property set" is one such property set, and it's the property set that governs the structure and nature of the objects to which Jade's groveoa interface provides access.
Think of a property set as a schema for the objects that result from parsing (and/or from semantic processing, but that's another story for another day).

The DOM is not a grove; it is an API. Until the XML information set is stable, the DOM is an API to something that's not rigorously defined. The DOM can be implemented as an interface to XML groves, but not before there are XML groves. And there can't be XML groves until there's a property set for XML.

(Well, no, that's not quite right, because we routinely make SGML groves from XML documents. But that's just a temporary kludge that only works because of XML's SGML parentage. Moreover, the SGML Property Set provides for more complexity than XML groves will ever need to have, and simplicity is one of XML's most important virtues.)

There is every reason to believe that the XML Information Set, once Recommended, will be expressible as a property set. Once this is done, XML objects will be processable, addressable, and re-usable via the same software that supports the processing, addressing, and re-use of components of resources expressed in other notations, with each such notation described by its own property set. In that scenario, all information components conform to the same object model, the ISO "grove" object model, so we are able to address (link, re-use) any kind of thing.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com http://www.techno.com ftp.techno.com
voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152)
3615 Tanner Lane
Richardson, Texas 75082-2618 USA
From ralph at fsc.fujitsu.com Tue Feb 23 22:25:33 1999
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:09:22 2004
Subject: ANNOUNCE: XML Conformance Public Information Page
In-Reply-To:
References:
Message-ID: <3.0.5.32.19990223142413.00ac65a0@pophost.fsc.fujitsu.com>

Ken,

At 01:32 PM 2/23/99 -0500, Didier PH Martin wrote:
>Hi Ken,
>
>[... announcement....]
>
>
>This is good news.
>
>In the page, it is said that the conformance report will be a xml document
>(I didn't expected less) and that it will be processed with DSSSL.
>
>
>Does this document has a particular DTD is it a public one? will it be
>possible to access the document from OASIS site in its original format with
>its corresponding dsssl stylesheet so that xml/dsssl aware browser could
>display it?
>

I have the same question. "Processed using DSSSL" could mean running it through Jade to produce HTML (or RTF for that matter). If the output is in fact HTML, I'd still like to see it available as an XML document. "HyBrick" supports catalogs, so a public identifier for the DTD, together with the required DTDs to support the DSSSL architecture, would work.

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation
From lauren at sqwest.bc.ca Tue Feb 23 22:27:30 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 17:09:22 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
References: <000901be5ee8$f9c407b0$d3228018@jabr.ne.mediaone.net> <199902232043.OAA01619@bruno.techno.com>
Message-ID: <36D32B38.B83E72DA@sqwest.bc.ca>

"Steven R. Newcomb" wrote:

> The "SGML property set" is one such property set,
> and it's the property set that governs the
> structure and nature of the objects to which
> Jade's groveoa interface provides access. Think
> of a property set as a schema for the objects that
> result from parsing (and/or from semantic
> processing, but that's another story for another
> day).
>
> The DOM is not a grove; it is an API.

And for those interested in the history of the DOM, we started with the SGML property set as a basis (along with the APIs implemented in various products such as HTML browsers and SGML editors). So it's not surprising that various APIs that deal with these sorts of documents have a similar feel to them, though they differ in the details.

Lauren
From nate at valleytel.net Tue Feb 23 23:13:37 1999
From: nate at valleytel.net (Nathan Kurz)
Date: Mon Jun 7 17:09:23 2004
Subject: Documents and Document Fragments
Message-ID: <199902232312.RAA03802@trinkpad.valleytel.net>

Mark Birbeck wrote:

> It's also relevant to document fragments. In previous posts, I was
> trying to say that as far as a parser is concerned, whether it receives
> a complete XML document by retrieving a file from a disk, a page from a
> web server, or four nodes from an object database is neither here nor
> there. As far as it is concerned, it has an 'XML document'. I called
> this a 'logical' document because I wanted to indicate that it may not
> actually exist in any physical form, but it is a
> 'data-object-that-conforms' item, and that if we can process an 'XML
> document' we can process one node, many nodes or the whole tree. You
> don't then need to devise another system to process well-formed
> 'uberdocuments', and yet another to process well-formed 'document
> fragments' or 'microdocuments' or whatever.

Although it may reflect the state of existing parsers, I disagree with this assessment of how XML parsers must relate to 'XML documents' and 'document fragments'. It seems like it has things backwards. You imply that if a parser is able to process a collection of nodes in one particular form, that it is able to process a collection of nodes in any arrangement whatsoever. Perhaps, but not necessarily.

An XML document has to have a root node. A subset of that document, produced by an XSL engine or by some other means, doesn't necessarily have a root node.
An XML parser may or may not require that a document has a root node. Any parser capable of handling documents without a root will do fine if one exists, but the reverse is not necessarily true.

Perhaps the question is whether there is a difference between an 'XML parser' and an 'XML document parser'. Which brings up the question of whether there is such a thing as XML (by definition well-formed) that is not an XML document. I think there is, and that this is where the term 'document fragment' is useful.

Here's a simplified version of how I'd like the world to be defined: :)

document fragment: A piece of well-formed XML, that may or may not have a root element. In the parlance of the spec, it would probably be called a 'well-formed textual object'. Doesn't even have to contain any elements. Colloquially synonymous with 'XML'.

XML document: A document fragment that, in the words of the spec, 'when taken as a whole matches production labeled document'. In practice, some XML with a single root element. Parsed entities must also be well formed.

XML parser: Something that accepts XML as input and/or treats its input as XML. May or may not care if the input is well-formed and/or valid. It's quite possible that cat(1) would qualify as an XML parser by this definition.

XML document parser: An XML parser that expects an XML document as input, and complains if it does not receive one. A 'conforming XML processor' in the spec's terms. (Although the spec often uses just 'XML processor' and implies 'conforming'.)

Which is to say that I think the notion of 'document fragment' is still useful, and that it is worthwhile to think about textual XML that is not in the form of an XML document.

nathan kurz
nate@valleytel.net
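One way to see the distinction between a 'document fragment' and an 'XML document' in practice is to hand a root-less fragment to a conforming document parser. The sketch below is only an illustration, not anything proposed in the thread: it assumes the fragment carries no XML declaration or DOCTYPE, and it wraps the fragment in a throwaway root element (the class name FragmentParser and the wrapper name fragment-root are hypothetical) so that the standard JAXP DocumentBuilder will accept it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any XML specification or library.
final class FragmentParser {
    // Assumed wrapper name; any name works because the wrapper is discarded.
    private static final String WRAPPER = "fragment-root";

    /** Parses a root-less fragment and returns its top-level nodes. */
    static NodeList parseFragment(String fragment) throws Exception {
        // Assumption: no XML declaration or DOCTYPE in the fragment,
        // since neither may appear inside an element.
        String wrapped = "<" + WRAPPER + ">" + fragment + "</" + WRAPPER + ">";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrapped)));
        // The children of the synthetic root are the fragment's own nodes.
        return doc.getDocumentElement().getChildNodes();
    }
}
```

The wrapper is a workaround for a document parser's single-root requirement; a parser that natively accepted fragments, as the definitions above allow, would not need it.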
From nate at valleytel.net Wed Feb 24 00:05:04 1999
From: nate at valleytel.net (Nathan Kurz)
Date: Mon Jun 7 17:09:23 2004
Subject: Streams, protocols, documents and fragments
Message-ID: <199902240003.SAA03997@trinkpad.valleytel.net>

Jonathan Borden writes:

> document is defined as in the XML spec. documents are well
> formed. when a document fragment is isolated from its parent document, it
> becomes a standalone document.

Sounds fine so far...

> a document may contain a prolog. a document fragment may not. a document may
> contain a !DOCTYPE definition (DTD), a document fragment may not. Hence all
> document fragments are legal documents but not all documents are legal
> document fragments.

I think I follow what you are saying, but I'm confused why you would choose to define a document fragment in this way. Why can't it contain a prolog? Are you assuming that document fragment must be produced as a reduction of a parent document? It strikes me as very odd to define 'document fragment' as a superset of 'document'.

> stream is ambiguous but generally refers to a series of bits or
> bytes or characters. In general, a stream behaves similarly to a socket.

Yes, or going further, a stream behaves similarly to a continuous unidirectional broadcast. The canonical stock ticker might well be continuously transmitting the XML data by FM radio.

> protocol is layered above a network transport, or socket and
> defines a mutually agreed upon mechanism to exchange messages and other
> data.
>
> So what does this have to do with XML? The canconical example of streamed
> XML is the stock ticker.
> Assuming each stock quote is transmitted in a
> document, the HTTP protocol can employ a particular URL e.g.,
> http://wherever/quotes/next to return the next quote as a single document.
> Suppose we wish to transmit 100 quotes as distinct documents, this does not
> work with HTTP which returns a single MIME message response for each
> request. The solutions would be to employ 1) multipart messages 2) wrap the
> quotes in a single document 3) use another protocol.
>
> Suppose we use raw sockets? Nothing to prevent sending one document after
> another down the socket. The end of one document and the start of another
> are unambigous assuming the documents are well-formed.

I completely agree. Which is to say that using a non-XML character (cntl-l or cntl-c) as a separator might be a useful protocol, but is not necessary. Nothing prevents one from sending multiple documents serially as unadulterated XML.

> So, the problem here is not one with XML, rather the protocol used to
> transmit documents, HTTP and SMTP send one MIME message per PDU, streaming
> protocols can be defined which transmit multiple documents.

But the definition of XML processor does become a problem here. If the stream consists of multiple XML documents, one must use an XML-aware processor to parse it. But this had better be a non-conforming XML processor, since according to the spec a 'conforming XML processor' must cry foul if its input doesn't have one and only one root element.

nathan kurz
nate@valleytel.net
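The separator protocol mentioned above (which the message calls useful but not necessary) can be sketched as follows. This is a minimal illustration under stated assumptions, not a protocol anyone in the thread specified: it assumes the sender writes a form feed (cntl-l, U+000C), which is not a legal XML 1.0 character, between documents, and that the whole stream has already been read into a string. The class name FormFeedSplitter is hypothetical. Each chunk is then a complete document, so an ordinary conforming parser can handle it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper illustrating a separator-based framing of documents.
final class FormFeedSplitter {
    /** Splits the stream on form feeds and parses each chunk on its own. */
    static List<Document> parseAll(String stream) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<Document> docs = new ArrayList<>();
        // Assumption: U+000C never occurs inside a document, which holds
        // for well-formed XML 1.0 because the character is not allowed.
        for (String chunk : stream.split("\f")) {
            if (chunk.trim().isEmpty()) {
                continue; // tolerate a trailing or doubled separator
            }
            docs.add(builder.parse(new InputSource(new StringReader(chunk))));
        }
        return docs;
    }
}
```

Splitting without a separator, by tracking where each root element closes, is also possible but needs an XML-aware scanner, which is the conformance question raised in the message.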
From b.laforge at jxml.com Wed Feb 24 00:42:39 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:23 2004
Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. ))
Message-ID: <003601be5f8d$dcfabd40$c9a8a8c0@thing2>

From: Gabe Beged-Dov

>Given the memory management and efficiency issues, event queuing would need to be layered
>on top of, rather than instead of, the callback API. If you assumed that an event API replaced
>the callback API due to your extensibility argument, then I wonder if you couldn't provide a
>configuration parameter to the SAX driver on whether to clone or reuse the event objects.

Sounds good to me! I also recall mention of at least one true event-based parser which had a SAX layer on top of it. Could be an interesting synergy, but I do like the independence SAX brings us all.

>I looked at MDSAX over the weekend, and it is certainly a powerful platform for SAX based
>processing. On the other hand, it seemed that trying to fit everything into a single filter
>network
>without any lookahead capability (would require queueing) and cumbersome lookbehind capability
>(such as in the flatten example) is problematic. Given the constraints of making use of the
>existing
>infrastructure (SAX, XML) you have created a very flexible framework. Its not clear if the
>constraints
>are the right ones for trying to support composition of processing like you envision.

Thank you for your kind words. Lookahead, especially to the next sibling/peer element, could be quite handy. Especially when processing repeating elements into a table.
But an event queue might be an expensive way to do that.

As for program composition, it really just happened. I would not have expected to be able to do it without a DOM. Especially since id/idref is essential! But it works great. I've since cleaned it up, but that will need to wait for the production release.

The key was the interface to the reference table. Rather than getting back a pointer when given an idref value, you pass in:

o the idref value,
o the object to be updated with the value associated with the id, and
o the associated property putter method.

The only restriction here is that the object associated with the id must implement the class which defines the property type. (This then works for both forward and backward references.)

With IDREFs handled so neatly, and an application object associated with each element (which becomes the value associated with the id), everything just fell in place.

Oh yes, the next release will support complete document web processing. At least for program composition.

Bill

From cbullard at hiwaay.net Wed Feb 24 01:31:34 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:23 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
References: <000901be5ee8$f9c407b0$d3228018@jabr.ne.mediaone.net>
Message-ID: <36D355D1.68FB@hiwaay.net>

Borden, Jonathan wrote:
>
> Isn't that a grove?
> I'm saying this because when I look at the interfaces
> in Jade's groveoa, they look alot like the DOM (not that James defines what
> a grove is, but by implication I assume that this is at least what he thinks
> :-)

I'd be the last person on earth to say what James thinks. James does that best. Still, yes I think that is where the information set requirements take one. As said elsewhere and repeated often, it comes down to the interfaces and properties.

After the work done on interoperability and portability in earlier projects, we came down to two solutions (pretty much what the Chameleon project said we would):

1. Downtranslate to a common markup, eg, Rainbow DTD, HTML. This was fine for portability. It didn't really help interoperability. HTML is great for getting the party started.

2. Define a common meta-information set. More or less an abstract up-translation but not really. Nothing is translated. Property values are expressed in a MoreMetaThanMarkup superset.

Of the two, the latter was more flexible but it still required a common API of some kind to address the problems of interoperability. That was then.

Now we have XML. Essentially, it provides a standard for what most SGML systems implementors already knew to do. Simplify the parser by eliminating features, forbid some practices that were onerous (minimization, inclusions, exclusions, etc) and so forth. XML 1.0: AKA, SGMLAsPracticed. Good.

Once done and the politics of imprisoning the Titans over, working groups formed for the next tasks of getting the APIs (eg, DOM, SAX) and now the Information Set (pick up the work done by HyTime and DSSSL). Finally, the markup community gets out of the quicksand of OneDTDShallBeSupreme and MyStyleBeatsYourContent. Excellent.

Nothing suggested on XML-Dev is new news. When I see correspondents claim the cane-pounding elders are standing in the way of progress, I groan and know the work will just take a little longer.
I see examples like SAX, and I know that excellent work can be done on open lists. I see suggestions to turn XML into an OOPL and quit wondering why some WGs are closed.

Meanwhile, go to work, build code over components, and get results for $99 that used to cost $.5 million. It is a good day.

The funniest question I've heard lately: "How many vendors of Unix boxes are left? So, then what is the difference?"

len

From cbullard at hiwaay.net Wed Feb 24 01:46:02 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:23 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
References:
Message-ID: <36D3593C.766B@hiwaay.net>

Mark Birbeck wrote:

> > Not to be tendentious, but some of us having been doing that for at
> > least a generation now. SGML works.
>
> Hello Len, I was using my words carefully! SGML has of course been
> around for a while, and been very important for document management and
> generation. But in the world of software engineering we are constantly
> reinventing the wheel; coming up with new file formats, parsers,
> transmission mechanisms, and so on. So, in the world I'm in, XML *is* a
> unique and unprecedented contribution.

Hello Mark: I am being careful as well. The work being done is not as much being reinvented as building new walls and towers on a pre-existing city that was strong to begin with if not as large. The XML effort has been one of trying to make sure the original foundations are made stronger, not unique. What changed was the name and the venue.
Sometimes reinvention is new masons learning old lessons. If they draw different conclusions, they should test them on existing architecture. Otherwise, we only get to sell in the bazaar. Personally, I like vaulted ceilings even with flying buttresses. Better sound, better music, more seating, wealthier patrons.

len()

From cbullard at hiwaay.net Wed Feb 24 01:54:09 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 17:09:23 2004
Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
References: <000901be5ee8$f9c407b0$d3228018@jabr.ne.mediaone.net> <199902232043.OAA01619@bruno.techno.com> <36D32B38.B83E72DA@sqwest.bc.ca>
Message-ID: <36D35B1E.21CF@hiwaay.net>

Lauren Wood wrote:
>
> "Steven R. Newcomb" wrote:
>
> > The "SGML property set" is one such property set,
> > and it's the property set that governs the
> > structure and nature of the objects to which
> > Jade's groveoa interface provides access. Think
> > of a property set as a schema for the objects that
> > result from parsing (and/or from semantic
> > processing, but that's another story for another
> > day).
> >
> > The DOM is not a grove; it is an API.
>
> And for those interested in the history of the DOM, we started with
> the SGML property set as a basis (along with the APIs implemented in
> various products such as HTML browsers and SGML editors). So it's
> not surprising that various APIs that deal with these sorts of
> documents have a similar feel to them, though they differ in the
> details.

Good.
The authoritative replies are so much better than the reports from the casual bystanders. ;-)

So much for cane thumping. Back to the TreeView methods....

len

From jborden at mediaone.net Wed Feb 24 02:09:02 1999
From: jborden at mediaone.net (Borden, Jonathan)
Date: Mon Jun 7 17:09:23 2004
Subject: FW: XML Mail
In-Reply-To: <36D2FE88.12FEE8E4@manhattanproject.com>
Message-ID: <002001be5f99$ea3db460$d3228018@jabr.ne.mediaone.net>

I have modified XMTP to wrap non-text/xml bodies within a CDATA section. Because "]]&gt;" is interpreted literally inside a CDATA section, John Cowan's suggestion of converting "]]>" into "]]]]><![CDATA[>" is the one I have adopted.

So XMTP *does* recognize markup, either as XML by virtue of the content-type: text/xml, or as text containing lots of < and > to be escaped.

Jonathan Borden
http://jabr.ne.mediaone.net

>
> Sorry to revive a dead thread, but I'm making progress
> on the subject, so I'd like your feedback.
>
> To which David Megginson wrote:
>
> > Wrong -- or, to put it differently, it should leave content with
> > text/html and text/xml alone, but it should not try to recognise
> > markup in text/plain.
>
> Then, Parand Tony Darugar and
> Jonathan Borden posted their
> current implementations of such a monster.
>
> The debate then swung as to how to treat the "body"
> of the e-mail. Most people agreed that it was some
> sort of CDATA thing, where "]]>" in the text body
> would become "]]&gt;"
>
> My question:
>
> Isn't there a way to do smart stuff like #2 ?
>
> For example, look at Didier's post below.
> It'd be slick to have the XMLMail program > recognize his markup. > > For my example, I'd like to "embeed" bookkeeping > information in my e-mail. If the e-mail leaves > the 'organization', the bookkeeping is stripped. > ... > > Then I can run all of my e-mail through my > journal, which then hits my ledger, and accounts > for my time appropriately. I'm actually very > serious about this... > > Best, > > Clark > > XML, it's not just for computers anymore. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Marc.McDonald at Design-Intelligence.com Wed Feb 24 02:29:10 1999 From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com) Date: Mon Jun 7 17:09:23 2004 Subject: Well-formed vs. valid Message-ID: Writing application code for validation is something I agree is being done and is something to avoid. The validation code is just another incarnation of the information in a DTD, i.e. both the DTD and the code in the application detect a valid document. When the structure of the content changes, 2 dissimilar descriptions must change - the DTD and the application. Neither SAX nor DOM provide any means to deal with this problem - one provides a stream of element creation calls and the other provides walking the tree to access elements. I would propose a type of XML parser that takes a well-formed or valid document, validates it against a DTD (or any other accepted form of structure description) of the application's choice, and then issues streaming events to the application. Consider it a DOM that does a tree match on an application chosen DTD and then emits SAX calls. 
The application would be guaranteed to be receiving valid elements and thus not need its own data validation code. The line between the application and 'XML' is currently viewed as the application is hooked onto DOM, SAX, or some other XML parser of a file at the level of elements. The XML structural description in a DTD is not used, except if the document (not the application) calls for validation. This separation is also represented by modeling on the basis of a file rather than a stream. This 'traditional' architecture (file-based, DTD for optionally ensuring file is valid) both limits the capabilities and requires writing of lots of additional application code for verification and other purposes. By allowing a stream rather than file model to be used, good things can be accomplished: 1. A site can advertise its available content with a DTD. A DTD not only describes valid form, but also the entire world of what a server may provide. 2. An application can decide what elements out of the available elements of a site are needed (via query or pattern to site) which would then respond with the desired content. Extraneous elements could be avoided by the application's choice. Rather than consider a site a mere file that can be downloaded in its entirety and providing yet another means to query a site for its available documents, the site can become an element server which advertises its elements and cooperates with the application to download only the needed elements. The concept of 'valid' under this model is more of a 'not invalid' - if the stream so far is valid, assume it will continue to be. Only closing the stream would deliver the various closing elements which (hopefully) would result in a complete valid document. It's easy enough to fall back onto a 1960s model of communication (the file) and punt the validation problems onto the application writers, but for widespread acceptance things need to be easy not difficult. 
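A minimal Python sketch of the "tree match, then emit events" idea proposed above, not a real validating parser. Python's standard library does not validate against DTDs, so the `CONTENT_MODEL` dictionary is a hypothetical stand-in for a DTD that only records which child elements each element may contain (no ordering or cardinality). The element names `order` and `item`, and the function name `validated_events`, are likewise assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a DTD: the child element names each element may contain.
CONTENT_MODEL = {
    "order": {"item"},
    "item": set(),
}


def validated_events(xml_text, content_model, root_name):
    """Check the whole tree against content_model, then return SAX-like events."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    if root.tag != root_name:
        raise ValueError(f"unexpected root element: {root.tag}")

    events = []

    def walk(elem):
        allowed = content_model.get(elem.tag)
        if allowed is None:
            raise ValueError(f"undeclared element: {elem.tag}")
        events.append(("start", elem.tag))
        for child in elem:
            if child.tag not in allowed:
                raise ValueError(f"{child.tag} not allowed inside {elem.tag}")
            walk(child)
        events.append(("end", elem.tag))

    walk(root)
    return events
```

Because the events are only returned once the whole tree has been checked, the caller never sees an element from a document that fails the check. That is the guarantee argued for above, at the cost of buffering the document first.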
Another 10 cents worth of thought into the pot, Marc B McDonald Principal Software Scientist Design Intelligence, Inc www.design-intelligence.com ---------- From: Tim Bray [SMTP:tbray@textuality.com] Sent: Monday, February 22, 1999 11:21 AM To: Jeffrey E. Sussna; 'XML-DEV' Subject: Re: Well-formed vs. valid At 10:58 AM 2/22/99 -0800, Jeffrey E. Sussna wrote: >One thing disturbs me, however. Much talk seems to be made about documents >or document fragments being useful because they are well-formed. I don't >want something well-formed, I want something "valid". Whether validity is >determined by reference to a DTD or to a schema of some other kind, I need >more than just the lowest-level syntactic conformance to the XML spec. I >need to be able to determine that the XML in question conforms to the >syntactic and semantic constraints imposed by my application. I've never seen an application so simple that its syntactic/semantic constraints could be expressed in a schema, DTD or any other flavor. That's why every commercial DBMS-based app has zillions of lines of data validation code that have to be run before you actually use incoming data. Having said that, I think that validation is a good thing and essential in lots of applications, and will become a better thing once we have a more modern schema facility. >Furthermore, I don't want to have to rely on implicit knowledge contained >within a proprietary parser in order to do so. In my experience, you *always* have to write some application-specific validation code. -Tim xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Wed Feb 24 02:50:00 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:09:23 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: <199902240003.SAA03997@trinkpad.valleytel.net> Message-ID: <002101be5f9f$a1606340$d3228018@jabr.ne.mediaone.net> > > > Jonathan Borden writes: > > document is defined as in the XML spec. documents are well > > formed. when a document fragment is isolated from its parent > document, it > > becomes a standalone document. > > Sounds fine so far... > > > a document may contain a prolog. a document fragment may not. a > document may > > contain a !DOCTYPE definition (DTD), a document fragment may > not. Hence all > > document fragments are legal documents but not all documents are legal > > document fragments. > > I think I follow what you are saying, but I'm confused why you would > choose to define a document fragment in this way. Why can't it > contain a prolog? because then it is a document. My sole purpose in discussing 'document fragments' was because the thread had gotten stuck on the notion that a continuous XML stream would contain a single long document (perhaps w/o a closing tag) and the actual PDUs consist of document fragments ... the point is that if we create a protocol on a stream which transmits multiple documents, there is no loss of functionality over a solution employing 'document fragments' > Are you assuming that document fragment must be > produced as a reduction of a parent document? It strikes me as very > odd to define 'document fragment' as a superset of 'document'.
to the contrary, if all legal doc frags were in fact docs then doc is a superset of doc frag ... but it has been pointed out that doc frags aren't always legal docs (when non default charsets are used). > > > So, the problem here is not one with XML, rather the protocol used to > > transmit documents, HTTP and SMTP send one MIME message per > PDU, streaming > > protocols can be defined which transmit multiple documents. > > But the definition of XML processor does become a problem here. If > the stream consists of multiple XML documents, one must use an > XML-aware processor to parse it. But this had better be a > non-conforming XML processor, since according to the spec a > 'conforming XML processor' must cry foul if its input doesn't have one > and only one root element. > It is the responsibility of the *protocol* to pass a document to the XML parser. There is no requirement that the stream be passed unadulterated to the XML parser. The suggestion of delimiting characters allows the protocol layer to easily chop the stream into individual documents. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Wed Feb 24 02:57:55 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:23 2004 Subject: Streaming XML and SAX Message-ID: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> I've been following the discussions on streaming (on both XML-Dev and XSL-list, which has been interesting to compare) with lots of interest. 
Unfortunately, it's been through the haze of just having finished (yesterday) another book, so my observations may be tilted. I started my thinking about the subject with the XP (Extensible Protocol) proposal at the IETF: (it's about three weeks old now.) http://www.ietf.org/internet-drafts/draft-harding-extensible-protocol-00.txt (I've cc'd the author of the draft since I don't know if he's one of our lurkers, and figure he might be interested in knowing that we've got a live discussion going on here. I'm not saying XP is the cure to all our troubles, but it's a good place to start.) XP proposes a pretty simple mechanism for sending requests and responses as streams of XML documents. As the draft puts it, >To extend XML from a class of data objects into a protocol is to >extend the rules for constructing a single document into rules for >constructing two interrelated streams of documents. Accordingly, we >introduce mechanisms for handling both the sequential and >interrelated aspects of the document streams. Requests are prefaced with a processing instruction (PI) that uses the form: > RequestPI ::= '' Responses are prefaced with a PI using the form: > ResponseToPI ::= '' A 'terminator PI' is used to mark the end of a document, using the form: > TerminatorPI ::= '' It's a pretty simple mechanism, using Nmtokens to keep two streams of processing and information in sync with each other. XP doesn't directly address the issues that seem to be bedeviling this list. The issue of associating DTDs with documents, for instance, is left untouched, and the examples use simple well-formed XML. It does, however, suggest a fairly simple approach to stream processing that might be appropriate in a number of situations. 
Basically, rather than arguing about documents and streams and how they should relate to each other within the context of XML, maybe it's time to step outside the tight XML framework and start thinking of streams as a set of XML documents presented in some kind of sequence with meaningful delimiters. The stream itself may not be a valid or even well-formed XML document - since the end element may appear a long way in the future, or even possibly never appear - but the stream can be decomposed into a set of valid XML documents. Some folks on this list have suggested mechanisms like control characters - ^L or ^C - to manage these streams. While that might work, it doesn't provide very much flexibility of expression. For example, it provides no information about the relation of the documents in the stream except their sequence. In many cases, relating documents in the stream to each other - or, like XP, to an entirely separate stream - may be important. The use of processing instructions (or, if you want to be grouchy, markup that uses a PI-like syntax) seems appropriate. This might also reduce the need for preprocessing, or for parsers that look specifically for control characters, and would allow the reuse of mechanisms we've already got. A SAX parser might be able to carry out stream parsing, sending standard SAX events to multiple threads representing different document components of the stream, for example. The PIs could be sent as part of the prolog - it might mean rearranging the prolog so comes before the PI, but that I think is doable - so the application could get the information. It could give startDocument and endDocument some real work to do that isn't just the province of the first startElement and the last endElement. (Yes, I know startDocument is important for catching stuff that appears before the root element.) Defining this in a general way doesn't seem like it would be too painful.
It might be a general description of a mechanism that XP applies in a particular request/response situation, or it might be something else. In any event, defining XML streams and rules for dealing with them is an important issue, one with very important implications for interchange. If we could hammer this down, we might be able to ensure that all kinds of developers will be able to share XML streams as easily as they share XML documents. If we define streams cleanly, we might even be able to nest streams within streams (hopefully) avoiding the next round up of multiple-container processing battles. It'd be worth fleshing out, and I could see adding two new events to SAX - beginStream and endStream or something like that. On the other hand, maybe I've just been working too hard too long and it's time for a nice long vacation. If folks think this is worthwhile, though, I'd be happy to put some work into it. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Wed Feb 24 03:06:34 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:09:23 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: <199902232043.OAA01619@bruno.techno.com> Message-ID: <002201be5fa1$f23f4a40$d3228018@jabr.ne.mediaone.net> Steven R.
Newcomb wrote: > > I certainly can't speak for James, but I would > like to clear something up: > > ISO/IEC 10744:1997 ("HyTime") and, earlier, > ISO/IEC 10179:1996 ("DSSSL") defined and coined > the term "grove". A grove is the set of objects > resulting from parsing an information resource in > some specific notation. EVERY GROVE ALWAYS MUST > CONFORM TO A FORMAL MODEL called a "property set". > The "SGML property set" is one such property set, > and it's the property set that governs the > structure and nature of the objects to which > Jade's groveoa interface provides access. Think > of a property set as a schema for the objects that > result from parsing (and/or from semantic > processing, but that's another story for another > day). > Thanks for clearing this up succinctly. I suppose I have the inappropriate tendency to read code more often than specs, so the source of my suppositions has been from projects like Jade rather than the true source. > The DOM is not a grove; it is an API. Until the > XML information set is stable, the DOM is an API > to something that's not rigorously defined. The > DOM can be implemented as an interface to XML > groves, but not before there are XML groves. And > there can't be XML groves until there's a property > set for XML. (Well, no, that's not quite right, > because we routinely make SGML groves from XML > documents. But that's just a temporary kludge > that only works because of XML's SGML parentage. > Moreover, the SGML Property Set provides for more > complexity than XML groves will ever need to have, > and simplicity is one of XML's most important > virtues.) Interesting. Is the DOM just an API, or a set of interfaces which define a hierarchy of objects? That is, an IDL interface definition is a type of formal definition. For example, using Microsoft's COM, IDL compiles into a 'typelibrary' which is a binary representation of something analogous on some level to a property set.
A scripting language, to continue the example, can employ any object with a typelibrary and associated interface set. Would it be possible to generate IDL programmatically from a property set definition? If so, then aren't the two alternate representations of the same information? > > There is every reason to believe that the XML > Information Set, once Recommended, will be > expressible as a property set. Once this is done, > XML objects will be processable, addressable, and > re-usable via the same software that supports the > processing, addressing, and re-use of components > of resources expressed in other notations, with > each such notation described by its own property > set. In that scenario, all information components > conform to the same object model, the ISO "grove" > object model, so we are able to address (link, > re-use) any kind of thing. > This is the same language used to describe the virtues of COM. Not to denigrate the virtues of property sets, but why are property sets specifications, and IDL definitions an API? Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 24 14:06:14 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:23 2004 Subject: Streaming XML and SAX In-Reply-To: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> Message-ID: <14036.1186.399749.89131@localhost.localdomain> Simon St.Laurent writes: > Some folks on this list have suggested mechanisms like control > characters - ^L or ^C - to manage these streams.
While that might > work, it doesn't provide very much flexibility of expression. For > example, it providrd no information about the relation of the > documents in the stream except their sequence. In many cases, > relating documents in the stream to each other - or, like XP, to an > entirely separate stream - may be important. The use of processing > instructions (or, if you want to be grouchy, markup that uses a > PI-like syntax) seems appropriate. Layering doesn't work unless each layer is as simple as possible: 1. Use a non-XML mechanism for separating XML packets -- that way, there's not a tight dependency between the stream-handler and the parser (the stream handler knows the bounds of each packet without doing any XML parsing). 2. Separate information about the packages from the packets themselves. The information could be linear, or it could itself be XML packets of a different sort. You should not have to parse an entire packet to know its sequencing, etc. 3. Don't require the main packet to be XML -- it might often make sense to send binary information such as video and audio clips as well (of course, it will probably have to be base64 encoded, but that's a separate issue). Putting a PI in the XML packet itself seems a little awkward to me. I'd rather have something like the following: ^C Packet 1.0 Seq=345 Stream=24232 ^L yyy ^C or, if you prefer ^C 345 24232 ^L yyy ^C That way, the main socket layer can just scan for ^C and ^L and then pass the appropriate chunks (the packet info and the packet proper) off to more specialised layers for processing. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 24 15:09:14 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:23 2004 Subject: Streams, protocols, documents and fragments Message-ID: > From: Borden, Jonathan [SMTP:jborden@mediaone.net] > My sole purpose in discussing 'document > fragments' was because the thread had gotten stuck on the notion that > a > continuous XML stream would contain a single long document (perhaps > w/o a > closing tag) and the actual PDU's consist of document fragments ... > the > point is that if we create a protocol on a stream which transmitts > multiple > documents, there is no loss of functionality over a solution employing > 'document fragments' > I agree with this. And the point I was trying to get to was that therefore we don't need to introduce loads of terms on top of XML 1.0 to understand the concepts. I still think all of this is being over-complicated - but then maybe I'm the one who's missing something, so let's see. I don't follow why so many suggestions to resolving this problem involve stepping 'outside of' XML 1.0. We have suggestions for sync characters like ^C and ^L, we have the proposal that XML 1.0 should be fundamentally altered to allow the concept of a 'not well-formed' document (or one that may *become* well-formed at some point in the future), we have proposals for documents that contain subsets of validity. All of these suggestions seem to go against the grain of what XML is about. XML 1.0 already copes with streams and files. A physical XML document is a linear sequence of characters conforming to certain rules. 
You can't tell whether those rules have been met until you have received the entire sequence of characters. You know when you've reached the end by the closing tag. That's it! There's not much else you can do about it, because that's what XML is all about - well-formed, possibly validated documents conforming to certain rules. Now, the fact that the beginning and end of this sequence of characters may be presented to the parser eight hours apart is to me an application problem. If someone has a document that takes eight hours to arrive then maybe they should re-think how they're setting the system up. If it's a massive document that can only be processed in its entirety, and if any part fails to arrive the whole document fails, then sure, you have to go ahead and send it over eight hours. But the stock ticker example is not like this. If I miss the stock price for Microsoft at 11am, then I can still make use of the stock price for Microsoft at 11.20am. It will affect my historical archives, but at least I have something to display. It is not an 'all or nothing' situation. So, accepting for a moment that we should transmit many documents throughout the day, rather than one big one, it leaves the question of demarcation. And here I'm surprised that people want to step outside of XML to find a solution. Say we send the following: ^L MSFT 1000 ^L ICI 1010 ^L If the data link is 100% reliable then we have encoded redundant information because the document name - the element for stockPrice - already tells us where one starts and ends. So, we don't need the ^L. But if the data link *isn't* reliable then adding a few ^L characters doesn't help a lot, because if we lose the following sequence we have no way of knowing: 1000 ^L ICI If this sequence is taken out of the above two documents then you now have the wrong price for Microsoft and nothing for ICI, and your application is none the wiser. 
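The failure mode just described can be reproduced with a minimal Python sketch. The markup of the original examples did not survive in this archive, so the child element names `symbol` and `price` below are assumptions. Only the `stockPrice` document element is named in the text. The function names are hypothetical as well, and ^L is taken to be the form-feed character.

```python
import xml.etree.ElementTree as ET

FF = "\x0c"  # ^L (form feed), used here as the document delimiter


def split_documents(stream):
    """Cut the stream at every ^L and drop the empty pieces."""
    return [chunk.strip() for chunk in stream.split(FF) if chunk.strip()]


def parse_prices(stream):
    """Parse each chunk as its own XML document; skip chunks that are not well-formed."""
    prices = {}
    for chunk in split_documents(stream):
        try:
            doc = ET.fromstring(chunk)
        except ET.ParseError:
            continue
        prices[doc.findtext("symbol")] = doc.findtext("price")
    return prices


# Two complete documents, delimited by ^L (element names are illustrative).
intact = (
    FF + "<stockPrice><symbol>MSFT</symbol><price>1000</price></stockPrice>"
    + FF + "<stockPrice><symbol>ICI</symbol><price>1010</price></stockPrice>"
    + FF
)

# The span lost in transit: end of the first document, the delimiter,
# and the start of the second document.
lost = "1000</price></stockPrice>" + FF + "<stockPrice><symbol>ICI</symbol><price>"
damaged = intact.replace(lost, "")

print(parse_prices(intact))   # both prices, as sent
print(parse_prices(damaged))  # one well-formed document pairing MSFT with ICI's price
```

What remains of the damaged stream is a single well-formed document, so neither the delimiter scan nor the XML parser reports a problem. The receiver simply records ICI's price against MSFT.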
I think if 100% data reliability is required then we need a few streaming-related attributes that we can add to our documents, such as: MSFT 1000 ICI 1010 These would be added by a 'sending' application as a separate layer to the original document generation, and would allow the receiving application to process all the 'streamns' packets before actually processing the nodes - say, storing or displaying the stock prices. You could remove 'invalid' nodes from the tree (well-formed at the XML level, but with the wrong packet ID), and then while your main application is getting on and acting on the stock data, the receiving process could be re-requesting the lost data. In the illustration above, after losing the packet, we would now have: MSFT 1010 <--- error here and the 'streamns' processing would spot and re-request the missing data easily (both packet 55 and packet 56). To be honest, I'm not suggesting what I've said here as some new standard. There are lots of ways what I've described could be achieved, for example: MSFT 1000 ICI 1010 takes up less space, and would still spot the same errors. I'm just trying to illustrate how solutions can be found that don't involve smashing XML 1.0 to bits. At the end of the day this is an application problem, not an XML one. Regards, Mark xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 24 15:29:16 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:23 2004 Subject: Documents and Document Fragments Message-ID: > -----Original Message----- > From: Nathan Kurz [SMTP:nate@valleytel.net] > Sent: Tuesday, February 23, 1999 11:13 PM > To: xml-dev@ic.ac.uk > Subject: RE: Documents and Document Fragments > > Mark Birbeck wrote: > > It's also relevant to document fragments. In previous posts, I was > > trying to say that as far as a parser is concerned, whether it > receives > > a complete XML document by retrieving a file from a disk, a page > from a > > web server, or four nodes from an object database is neither here > nor > > there. As far as it is concerned, it has an 'XML document'. I called > > this a 'logical' document because I wanted to indicate that it may > not > > actually exist in any physical form, but it is a > > 'data-object-that-conforms' item, and that if we can process an 'XML > > document' we can process one node, many nodes or the whole tree. You > > don't then need to devise another system to process well-formed > > 'uberdocuments', and yet another to process well-formed 'document > > fragments' or 'microdocuments' or whatever. > > Although it may reflect the state of existing parsers, I disagree with > this assessment of how XML parsers must relate to 'XML documents' and > 'document fragments'. It seems like it has things backwards. 
You > imply that if a parser is able to process a collection of nodes in one > particular form, that it is able to process a collection of nodes in > any arrangement whatsoever. Perhaps, but not necessarily. > I think that 'document fragment' *is* a useful term once you are 'inside' the parser. In other words, when you get to the point where you are processing the physical XML document and want to discuss aspects of this it is helpful to draw a distinction between the entire document and pieces of it. I don't think it helps though in the prior process of delivery of information to the parser. As far as I can see in XML 1.0, you can only deliver a well-formed document to a parser. And even if you have a database of lots of nodes that you combine to make into XML documents, you've still presented 'complete' documents to the parser, not 'fragments' or 'microdocuments'. In all my comments, my objection is always the seeming willingness of everyone to introduce extra terminology to supposedly 'clarify', when it is unnecessary, and often confuses. Regards, Mark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Wed Feb 24 15:48:29 1999 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:09:24 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: <002201be5fa1$f23f4a40$d3228018@jabr.ne.mediaone.net> (jborden@mediaone.net) References: <002201be5fa1$f23f4a40$d3228018@jabr.ne.mediaone.net> Message-ID: <199902241547.JAA02865@bruno.techno.com> [Jonathan Borden:] > Interesting. 
Is the DOM just an API, or a set of > interfaces which define a hierarchy of objects? > That is, an IDL interface definition is a type > of formal definition. For example, using > Microsoft's COM, IDL compiles into a > 'typelibrary' which is a binary representation > of something analogous on some level to a > property set. A scripting language, to continue > the example, can employ any object with a > typelibrary and associated interface set. > Would it be possible to generate IDL > programmatically from a property set definition? Even though I'm not well versed in IDL, I feel certain the answer is "Yes". Rigorous machine-readable expressions of information models, such as property sets, are extremely useful things. > If so, then aren't the two alternate > representations of the same information? No, not really. An interface specification is about a particular way of accessing information. Starting with an IDL specification, one could make inferences about the nature of the information being accessed, but that's about all. The difference between an API and an information model is subtle, but they are really very different things. One way to think of the difference is this: * an information model is about what *is*, while * an API is about what one *does*. This same difference goes to the very heart of what makes XML great. XML is emblematic of industry's ongoing historic shift of focus away from processing, and toward the information that is being processed. > > There is every reason to believe that the XML > > Information Set, once Recommended, will be > > expressible as a property set. Once this is done, > > XML objects will be processable, addressable, and > > re-usable via the same software that supports the > > processing, addressing, and re-use of components > > of resources expressed in other notations, with > > each such notation described by its own property > > set. 
In that scenario, all information components > > conform to the same object model, the ISO "grove" > > object model, so we are able to address (link, > > re-use) any kind of thing. > This is the same language used to describe > the virtues of COM. Not to denegrate the virtues > of property sets, but why are property sets > specifications, and IDL definitions an API? (It doesn't matter to me whether we call property sets or APIs "definitions" or "specifications".) Property sets are about what underlies APIs. An IDL definition is a particular way to get at some information. Another kind of API definition would do equally well for the same information set. Property sets are about what both such APIs would have in common. Property sets are more abstract than APIs. (COM is about a particular software vendor's attempts to retain its position of having its proprietary software products be indispensable to users. COM's intent is 180 degrees removed from the intent of property sets. The intent of property sets is to protect the interests of information owners and users, and to re-focus the attention of software vendors on the interests of such information owners and users, instead of on finding creative ways to lock the customer into particular software product lines.) -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Wed Feb 24 16:04:33 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:09:24 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: Message-ID: <000301be600d$fc00e130$d3228018@jabr.ne.mediaone.net> > I still think all of this is being over-complicated - but then > maybe I'm the one who's missing something, so let's see. > > I don't follow why so many suggestions to resolving this problem > involve stepping 'outside of' XML 1.0. We have suggestions for sync > characters like ^C and ^L, we have the proposal that XML 1.0 should be > fundamentally altered to allow the concept of a 'not well-formed' > document (or one that may *become* well-formed at some point in the > future), we have proposals for documents that contain subsets of > validity. All of these suggestions seem to go against the grain of what > XML is about. > Mark, I think it is just a simple implementation issue. We have XML parsers whose model is one XML document per stream (e.g. java inputStream). A multi-doc protocol can chop the stream into sub-streams to be passed to the XML parser. By employing an external delimiter, the protocol doesn't need to understand or parse the XML itself in order to detect "end-of-doc" even though the logical information is there. Otherwise the protocol implementation needs to understand XML syntax in order to detect "end-of-doc" Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Tim.Shaw at wdr.com Wed Feb 24 16:14:56 1999 From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com) Date: Mon Jun 7 17:09:24 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: Message-ID: I agree with the arguments so far - just send lots of little documents, and the protocol is just a layer on top, to be removed by the input stream processor. But, isn't the example below not wf XML - it doesn't seem to have a prolog? I have no problem with that either - again, you need a client side stream processor to pick apart the XML ... what do I call them? FSA 'chunks' ... chunks and, using some client side determination, add the prolog - and then pass it to the XML parser as a WF (and hopefully valid) XML document. This is 'trivial', and interleaving the protocol stuff is no great problem (plenty of examples, and I've done it at least 5 times for different socket-based systems). My concern tho' is that we require a piece of Client-side stream processing logic to pick up the XML 'chunks' and convert them to Valid WF XML - and this is not standard (read 'generally agreed' to avoid mention of inertia). 
Fun tho' tim ______________________________ Reply Separator _________________________________ Subject: RE: Streams, protocols, documents and fragments Author: Mark.Birbeck (Mark.Birbeck@iedigital.net) at unix,mime Date: 24/02/99 15:18 > From: Borden, Jonathan [SMTP:jborden@mediaone.net] > My sole purpose in discussing 'document > fragments' was because the thread had gotten stuck on the notion that > a > continuous XML stream would contain a single long document (perhaps > w/o a > closing tag) and the actual PDU's consist of document fragments ... > the > point is that if we create a protocol on a stream which transmitts > multiple > documents, there is no loss of functionality over a solution employing > 'document fragments' > I agree with this. And the point I was trying to get to was that therefore we don't need to introduce loads of terms on top of XML 1.0 to understand the concepts. I still think all of this is being over-complicated - but then maybe I'm the one who's missing something, so let's see. I don't follow why so many suggestions to resolving this problem involve stepping 'outside of' XML 1.0. We have suggestions for sync characters like ^C and ^L, we have the proposal that XML 1.0 should be fundamentally altered to allow the concept of a 'not well-formed' document (or one that may *become* well-formed at some point in the future), we have proposals for documents that contain subsets of validity. All of these suggestions seem to go against the grain of what XML is about. XML 1.0 already copes with streams and files. A physical XML document is a linear sequence of characters conforming to certain rules. You can't tell whether those rules have been met until you have received the entire sequence of characters. You know when you've reached the end by the closing tag. That's it! There's not much else you can do about it, because that's what XML is all about - well-formed, possibly validated documents conforming to certain rules. 
Now, the fact that the beginning and end of this sequence of characters may be presented to the parser eight hours apart is to me an application problem. If someone has a document that takes eight hours to arrive then maybe they should re-think how they're setting the system up. If it's a massive document that can only be processed in its entirety, and if any part fails to arrive the whole document fails, then sure, you have to go ahead and send it over eight hours. But the stock ticker example is not like this. If I miss the stock price for Microsoft at 11am, then I can still make use of the stock price for Microsoft at 11.20am. It will affect my historical archives, but at least I have something to display. It is not an 'all or nothing' situation. So, accepting for a moment that we should transmit many documents throughout the day, rather than one big one, it leaves the question of demarcation. And here I'm surprised that people want to step outside of XML to find a solution. Say we send the following: ^L MSFT 1000 ^L ICI 1010 ^L < Protocol stuff snipped > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From gkholman at CraneSoftwrights.com Wed Feb 24 16:29:42 1999 From: gkholman at CraneSoftwrights.com (G. 
Ken Holman) Date: Mon Jun 7 17:09:24 2004 Subject: ANNOUNCE: XML Conformance Public Information Page In-Reply-To: <3.0.5.32.19990223142413.00ac65a0@pophost.fsc.fujitsu.com> References: Message-ID: At 99/02/23 14:24 -0500, Ralph Ferris wrote: >At 01:32 PM 2/23/99 -0500, Didier PH Martin wrote: >>In the page, it is said that the conformance report will be a xml document >>(I didn't expected less) and that it will be processed with DSSSL. >> >> >>Does this document has a particular DTD is it a public one? will it be >... > >I have the same question. "Processed using DSSSL" could mean running it >through Jade to produce HTML (or RTF for that matter). Yes, and eventually yes. That is the process I've established for producing the report from the raw data. Once we have honed the HTML into the desired presentation I'll throw together the printing script to get hardcopy as well. >If the output is in >fact HTML, I'd still like to see it available as an XML document. I will pass this request on to the committee. >"HyBrick" >supports catalogs, so a public identifier for the DTD, together with the >required DTDs to support the DSSSL architecture, would work. It is all there, but under review due to changing requirements of the committee as our work evolves. The DTD is *very* content-oriented (to the point of being too hard to follow for some who have seen it) and the DSSSL scripts do a *lot* of munging to produce a number of indexes and cross references (but I'm pleased with the end result). But it is all changing as our needs are changing. Let me see what I can do. Thanks, Ralph! See you in San Jose! ........ Ken -- G. Ken Holman mailto:gkholman@CraneSoftwrights.com Crane Softwrights Ltd. http://www.CraneSoftwrights.com/m/ Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (Fax:-0995) Website: XSL/XML/DSSSL/SGML services outline, XSL/DSSSL shareware, stylesheet resource library, conference training schedule, commercial stylesheet training materials, on-line XSL CBT. 
Next instructor-led XSL Training: X-Tech:1999-03-07 WWW8:1999-05-11 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Tim.Shaw at wdr.com Wed Feb 24 16:36:38 1999 From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com) Date: Mon Jun 7 17:09:24 2004 Subject: Documents and Document Fragments In-Reply-To: Message-ID: I maybe don't quite understand this - my apologies if I'm missing something. A document fragment is a lightweight document (as defined by W3C REC-DOM-Level-1-19981001). These fragments may be used for numerous purposes - including creating (by 'insertion') other documents. A document fragment need not be WF - but (presumably) they must represent at least one type construct as they are also Nodes Surely it's down to the parser as to whether you can access these things before the XML document has been fully parsed (and they are _not_ Valid XML 1.0 documents(?)). The parser can still be XML 1.0 conformant - but it would need to provide non-conformant interfaces to allow interim access. NB This is somewhat DOM-centric, but the definitions do exist :-) tim ______________________________ Reply Separator _________________________________ Subject: RE: Documents and Document Fragments Author: Mark.Birbeck (Mark.Birbeck@iedigital.net) at unix,mime Date: 24/02/99 15:38 > -----Original Message----- > From: Nathan Kurz [SMTP:nate@valleytel.net] > Sent: Tuesday, February 23, 1999 11:13 PM > To: xml-dev@ic.ac.uk > Subject: RE: Documents and Document Fragments > > Mark Birbeck wrote: > > It's also relevant to document fragments. 
In previous posts, I was > > trying to say that as far as a parser is concerned, whether it > receives > > a complete XML document by retrieving a file from a disk, a page > from a > > web server, or four nodes from an object database is neither here > nor > > there. As far as it is concerned, it has an 'XML document'. I called > > this a 'logical' document because I wanted to indicate that it may > not > > actually exist in any physical form, but it is a > > 'data-object-that-conforms' item, and that if we can process an 'XML > > document' we can process one node, many nodes or the whole tree. You > > don't then need to devise another system to process well-formed > > 'uberdocuments', and yet another to process well-formed 'document > > fragments' or 'microdocuments' or whatever. > > Although it may reflect the state of existing parsers, I disagree with > this assessment of how XML parsers must relate to 'XML documents' and > 'document fragments'. It seems like it has things backwards. You > imply that if a parser is able to process a collection of nodes in one > particular form, that it is able to process a collection of nodes in > any arrangement whatsoever. Perhaps, but not necessarily. > I think that 'document fragment' *is* a useful term once you are 'inside' the parser. In other words, when you get to the point where you are processing the physical XML document and want to discuss aspects of this it is helpful to draw a distinction between the entire document and pieces of it. I don't think it helps though in the prior process of delivery of information to the parser. As far as I can see in XML 1.0, you can only deliver a well-formed document to a parser. And even if you have a database of lots of nodes that you combine to make into XML documents, you've still presented 'complete' documents to the parser, not 'fragments' or 'microdocuments'. 
In all my comments, my objection is always the seeming willingness of everyone to introduce extra terminology to supposedly 'clarify', when it is unnecessary, and often confuses. Regards, Mark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Wed Feb 24 16:54:21 1999 From: jborden at mediaone.net (Borden, Jonathan) Date: Mon Jun 7 17:09:24 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: <199902241547.JAA02865@bruno.techno.com> Message-ID: <000401be6015$6a2a2840$d3228018@jabr.ne.mediaone.net> Steven R. Newcomb wrote: > > > If so, then aren't the two alternate > > representations of the same information? > > No, not really. An interface specification is > about a particular way of accessing information. > Starting with an IDL specification, one could make > inferences about the nature of the information > being accessed, but that's about all. The > difference between an API and an information model > is subtle, but they are really very different > things. One way to think of the difference is > this: > > * an information model is about what *is*, while > > * an API is about what one *does*. 
Fair enough, but an IDL interface has the concept of "methods" which specify what an object does, as well as "properties" which specify what an object is. IDL has historically been used to specify RPC interfaces, yet increasingly finds use to describe local, in-process, object interfaces. There is a straightforward transformation of a property, e.g. int x; into a pair of 'methods' (slots in the interface): int getX(); void putX(int val); ... this is basic stuff, but the point is to emphasize that the distinction between what an object 'does' and what an object 'is' is not so clearcut. > > This same difference goes to the very heart of > what makes XML great. XML is emblematic of > industry's ongoing historic shift of focus away > from processing, and toward the information that > is being processed. There are those of us whose bias toward 'program is data' is not new. XSL is the latest greatest example of the replacement of algorithmic bits with declarative bits. > > > This is the same language used to describe > > the virtues of COM. Not to denegrate the virtues > > of property sets, but why are property sets > > specifications, and IDL definitions an API? > > (COM is about a particular software vendor's > attempts to retain its position of having its > proprietary software products be indispensable to > users. COM's intent is 180 degrees removed from > the intent of property sets. The intent of > property sets is to protect the interests of > information owners and users, and to re-focus the > attention of software vendors on the interests of > such information owners and users, instead of on > finding creative ways to lock the customer into > particular software product lines.) Err, except that the Mozilla team has seen fit to develop XPCOM (cross-platform component object model) as the foundation for object interoperability, so clearly the concepts of IDL and typelibraries go beyond Microsoft's proprietary goals. 
Moreover, the object model of COM is strikingly similar to that of Java, and when we get down to it CORBA isn't all that different. We need to abstract the concepts away from the nitty gritty implementation details. The concept I am pushing is that IDL is a text based 'formal' description of objects which can be 'compiled' into a binary representation: a typelibrary. This information can be used to develop cross platform, cross machine, cross language object interoperability. This has been done, and works quite well; it is not a future goal. So, the DOM team has seen fit to use IDL as the formal definition (not MS IDL BTW, so this isn't a proprietary conspiracy). I laud this decision. This decision has enabled cross language, cross machine and cross platform parsing, manipulation and exchange of XML derived information. There are Java, C++, Python and Javascript based DOM implementations that I am aware of, and these implementations use CORBA, DCOM and RMI as remote transports. On my machine, these interoperate with HTML DOM interfaces and SGML grove plan interfaces (i.e. Jade). This is a pretty strong argument for interoperability. In fact, when you get out of the SGML/XML world, the terms 'property set' and 'grove' are replaced by the terms 'UML', 'persistence' and 'object model'. What you promise property sets and grove plans will do, namely automate the processing of data and interoperability, CASE tool vendors promise using UML. What is the essence of the difference between an information set and/or property set and/or grove plan versus UML? Don't get me wrong, I think the work on information sets, property sets and groves is terrific and needs to be continued. One way to do this is to turn our heads sideways every so often to see what colleagues in the distributed object world are doing. These problems are universal. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Matthew.Sergeant at eml.ericsson.se Wed Feb 24 17:28:52 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:09:24 2004 Subject: Streams, protocols, documents and fragments Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A164D@eukbant101.ericsson.se> > -----Original Message----- > From: Borden, Jonathan [SMTP:jborden@mediaone.net] > > > Mark, I think it is just a simple implementation issue. We have XML > parsers > whose model is one XML document per stream (e.g. java inputStream). A > multi-doc protocol can chop the stream into sub-streams to be passed to > the > XML parser. By employing an external delimiter, the protocol doesn't need > to > understand or parse the XML itself in order to detect "end-of-doc" even > though the logical information is there. Otherwise the protocol > implementation needs to understand XML syntax in order to detect > "end-of-doc" > > Perl's XML::Parser (and by logic I assume this comes from expat) already does this - you can attach an IO handle (which can be an IO::Socket if you want it to come from a socket) to the XML parser, tell it what the stream delimiter is (e.g. ^L) and leave it merrily parsing XML. You have to tell it to parse again when it stops, because that's how it says "I've reached the end of one stream" - but it works fine. This is probably true of other languages that use expat to parse XML for them. Matt. 
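The delimiter approach described above can be sketched in a few lines. This is Python rather than Perl, and it is not how XML::Parser or expat do it internally; it only illustrates the idea that the protocol layer can cut the stream at ^L without understanding any XML. Form feed is a convenient delimiter because it is not a legal character in XML 1.0 content. The function name iter_documents and the quote/symbol/price element names are made up for the illustration.

```python
import io
import xml.etree.ElementTree as ET

DELIMITER = "\f"  # ^L (form feed); which delimiter to use is an assumption


def iter_documents(stream, delimiter=DELIMITER, chunk_size=4096):
    """Yield one parsed root element per delimiter-separated XML document."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before a delimiter is one complete document.
        while delimiter in buffer:
            text, buffer = buffer.split(delimiter, 1)
            if text.strip():
                yield ET.fromstring(text)
    # Whatever is left after the last delimiter is the final document.
    if buffer.strip():
        yield ET.fromstring(buffer)


# Hypothetical stock-ticker packets; the element names are illustrative only.
feed = io.StringIO(
    "\f<quote><symbol>MSFT</symbol><price>1000</price></quote>"
    "\f<quote><symbol>ICI</symbol><price>1010</price></quote>\f"
)
quotes = [(q.findtext("symbol"), q.findtext("price")) for q in iter_documents(feed)]
print(quotes)
```

Each chunk handed to the parser is a complete document in its own right, so a packet that fails to parse can be dropped without affecting the ones that follow.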
-- http://come.to/fastnet Perl on Win32, PerlScript, ASP, Database, XML GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 24 17:29:26 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:24 2004 Subject: Streams, protocols, documents and fragments Message-ID: > -----Original Message----- > From: Tim.Shaw@wdr.com [SMTP:Tim.Shaw@wdr.com] > I agree with the arguments so far - just send lots of little > documents, and the protocol is just a layer on top, to be removed > by > the input stream processor. > > But, isn't the example below not wf XML - it doesn't seem to have > a > prolog? > You don't need a prolog to be well-formed ... > > I have no problem with that either - again, you need a client > side > stream processor to pick apart the XML ... what do I call them? > FSA > 'chunks' ... chunks and, using some client side determination, > add the > prolog - and then pass it to the XML parser as a WF (and > hopefully > valid) XML document. > > This is 'trivial', and interleaving the protocol stuff is no > great > problem (plenty of examples, and I've done it at least 5 times > for > different socket-based systems). > > My concern tho' is that we require a piece of Client-side stream > processing logic to pick up the XML 'chunks' and convert them to > Valid > WF XML - and this is not standard (read 'generally agreed' to > avoid > mention of inertia). > ... so you don't need to 'create' a document from the packets. 
However, I don't see any reason why we can't include prolog information in this model, if, for example, you need a DTD for the packets. > Regards, Mark xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Feb 24 17:46:17 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:24 2004 Subject: "Empty" Text Nodes References: <199902241557.QAA02906@mail.informatik.hu-berlin.de> <36D42B4A.384CF9C1@trendline.co.il> Message-ID: <36D43A8B.57E3D8D5@locke.ccil.org> Arkin wrote: > 1. An HTML processor is a very specific case of an XML processor As of now, HTML is not XML at all. > 2. PRE, STYLE and SCRIPT are specific cases in HTML, unlike other > elements. STYLE and SCRIPT are so-called "CDATA content elements" (for which there is no XML equivalent; the term "CDATA" here is not synonymous with "CDATA" as an attribute type, or with CDATA sections). They are terminated by the sequence " 4. With a validating XML processor, XML elements should preserve > whitespaces only if the 'xml:space' attribute has a value of 'preserve', > otherwise they may lose whitespaces by ignoring the trailing and leading > whitespaces and consolidating multiple whitespaces to a single space > (). Again, whitespace is assumed to be for human readbility. This behavior is performed by the application: a conforming processor may not do it. In attribute values, OTOH, a conforming processor must do it for attributes that are not CDATA. > 5. 
With a validating XML processor, XML elements that have non-mixed > content type (only elements, no text) should ignore all whitespaces and > flag an error for any other text that appears in between elements. XML processors cannot just ignore that whitespace: they must report it to the application, which is then free to ignore it and typically does. > 6. Without a validating XML processor, XML elements should attempt to > ignore as much whitespace as possible, regarding it as human readable > whitespace. That totally depends on the application. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cowan at locke.ccil.org Wed Feb 24 17:49:45 1999 From: cowan at locke.ccil.org (John Cowan) Date: Mon Jun 7 17:09:24 2004 Subject: Streams, protocols, documents and fragments References: Message-ID: <36D43B19.C03EF55C@locke.ccil.org> Tim.Shaw@wdr.com wrote: > But, isn't the example below not wf XML - it doesn't seem to have a > prolog? Unless the encoding stream is something other than UTF-8 (which includes ASCII as a subset) or UTF-16, it is not necessary to include an XML declaration with each document. -- John Cowan http://www.ccil.org/~cowan cowan@ccil.org You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5) xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Wed Feb 24 18:14:34 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:09:24 2004 Subject: ANNOUNCE: New XHTML WD Message-ID: <00ca01be6021$4f704e60$2eaddccf@ix.netcom.com> The new working draft for XHTML is available at http://www.w3.org/TR/1999/WD-html-in-xml-19990224/ Frank Frank Boumphrey XML and style sheet info at Http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (March) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Wed Feb 24 18:33:38 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:25 2004 Subject: ANNOUNCE: New XHTML WD In-Reply-To: <00ca01be6021$4f704e60$2eaddccf@ix.netcom.com> Message-ID: <000801be6022$f1fee140$d3228018@jabr.ne.mediaone.net> excellent. one point: there is no reason to define text/xhtml as opposed to using text/xml and inserting a DOCTYPE or perhaps a default xmlns definition. If a user-agent needs to know the DOCTYPE ... look at it! 
Jonathan Borden http://jabr.ne.mediaone.net > > > The new working draft for XHTML is available at > > http://www.w3.org/TR/1999/WD-html-in-xml-19990224/ > > Frank > Frank Boumphrey > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Wed Feb 24 18:48:17 1999 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 17:09:25 2004 Subject: More XML Identity Angst Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF058@RED-MSG-08> In a thread about the relative importance or unimportance of markup, David Megginson wrote "Markup *is* the essential characteristic of XML -- XML is a markup standard that describes how to represent a hierarchical structure in a linear sequence of characters." I agree with the main thrust of this, that markup is essential to XML. And not to disagree with Dave, but to amplify his comments, XML with its markup and other facilities provides valuable structure in addition to tree-structuring. As others have pointed out, through ids and idrefs, structures can be represented that are not merely trees, but are directed, labeled graphs. Entities and notations also extend what is expressed. Etc. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From daniela at cnet.com Wed Feb 24 19:16:02 1999 From: daniela at cnet.com (Daniel Austin) Date: Mon Jun 7 17:09:25 2004 Subject: FW: ANNOUNCE: New XHTML WD Message-ID: <77A952A6B467D211855D00805F9521F11492CE@cnet10.cnet.com> > -----Original Message----- > From: Daniel Austin > Sent: Wednesday, February 24, 1999 11:12 AM > To: 'Jonathan Borden' > Subject: RE: ANNOUNCE: New XHTML WD > > > Hi Jonathan, > > Thanks for your comment. It would be of great help to > everyone if all comments on this WD were sent to > the editors list as well. The draft says: > > Please send detailed comments on this document to > www-html-editor@w3.org. We cannot guarantee a personal > response, but we will try when it is appropriate. Public > discussion on HTML features takes place on the mailing list > www-html@w3.org. > > Comments posted just to this group are less likely to be > addressed by the Working Group. > > Regards, > > D- > > > -----Original Message----- > > From: Jonathan Borden [mailto:jborden@mediaone.net] > > Sent: Wednesday, February 24, 1999 10:25 AM > > To: Frank Boumphrey; xml mailing list > > Subject: RE: ANNOUNCE: New XHTML WD > > > > > > excellent. one point: there is no reason to define text/xhtml > > as opposed to > > using text/xml and inserting a DOCTYPE or perhaps a default xmlns > > definition. If a user-agent needs to know the DOCTYPE ... > look at it! 
> > > > Jonathan Borden > > http://jabr.ne.mediaone.net > > > > > > > > > > > The new working draft for XHTML is available at > > > > > > http://www.w3.org/TR/1999/WD-html-in-xml-19990224/ > > > > > > Frank > > > Frank Boumphrey > > > > > > > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and > on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the > following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From lauren at sqwest.bc.ca Wed Feb 24 19:18:57 1999 From: lauren at sqwest.bc.ca (Lauren Wood) Date: Mon Jun 7 17:09:25 2004 Subject: Documents and Document Fragments References: Message-ID: <36D4505E.4B0573CF@sqwest.bc.ca> Tim.Shaw@wdr.com wrote: > A document fragment is a lightweight document (as defined by W3C > REC-DOM-Level-1-19981001). These fragments may be used for numerous purposes - > including creating (by 'insertion') other documents. > > A document fragment need not be WF - but (presumably) they must represent at > least one type construct as they are also Nodes No; the DOM spec defines its document fragment as having the same potential content as a parsed entity, i.e. some mixture of text, elements, comments, PIs, etc. It does not need to have a root element, or indeed any elements at all. It could just consist of one comment, or some text. 
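A small sketch of the point, using Python's xml.dom.minidom as one DOM implementation (the choice of library and the element name "root" are assumptions for the illustration, not anything from the DOM spec itself). The fragment below has no root element and no elements at all, only a comment and a text node:

```python
from xml.dom import minidom

doc = minidom.Document()

# A fragment with no root element: just one comment and some text.
fragment = doc.createDocumentFragment()
fragment.appendChild(doc.createComment(" just a comment "))
fragment.appendChild(doc.createTextNode("some text"))

# Record the node types held by the fragment (comment, then text).
kinds = [node.nodeType for node in fragment.childNodes]
print(kinds)

# Inserting the fragment moves its children into the target node;
# the fragment node itself is not inserted.
root = doc.createElement("root")
doc.appendChild(root)
root.appendChild(fragment)
print(root.toxml())
```

The fragment on its own could never be parsed as an XML 1.0 document, which is why it only makes sense as an in-memory construct handed around through the DOM interfaces.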
regards, Lauren xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Wed Feb 24 19:37:18 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:09:25 2004 Subject: ANNOUNCE: New XHTML WD Message-ID: <002d01be602c$a7714b40$2eaddccf@ix.netcom.com> >one point: there is no reason to define text/xhtml as opposed to >using text/xml and inserting a DOCTYPE or perhaps a default xmlns >definition. If a user-agent needs to know the DOCTYPE ... look at it! The problem arises with slim clients such as a cell phone. To look at the DOCTYPE the document must be downloaded and cracked open, which is rather a waste of both bandwidth and resources. For example, a cell phone may elect not to accept an SGML-based HTML document, and may not know how to present an XML document. Style sheets may not be appropriate for some clients. Ditto if one wants to use a namespace. What we need is some kind of content negotiation between the client and the server, and the HTTP header is obviously the way to do this. The problem with using a specific text/xhtml MIME type is that it sets a dangerous precedent, and could lead to a potential explosion of MIME types. However, some kind of negotiation must be performed. Without giving anything away I think I can say that there has been a spirited debate on the subject, and as is pointed out in the WD, comments and suggestions on this subject are particularly welcome.
Regards, Frank Frank Boumphrey XML and style sheet info at Http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (March) ----- Original Message ----- From: Jonathan Borden To: Frank Boumphrey ; xml mailing list Sent: Wednesday, February 24, 1999 1:24 PM Subject: RE: ANNOUNCE: New XHTML WD >excellent. one point: there is no reason to define text/xhtml as opposed to >using text/xml and inserting a DOCTYPE or perhaps a default xmlns >definition. If a user-agent needs to know the DOCTYPE ... look at it! > >Jonathan Borden >http://jabr.ne.mediaone.net > >> >> >> The new working draft for XHTML is available at >> >> http://www.w3.org/TR/1999/WD-html-in-xml-19990224/ >> >> Frank >> Frank Boumphrey >> > > >xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; >(un)subscribe xml-dev >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; >subscribe xml-dev-digest >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 24 19:51:43 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:25 2004 Subject: MIME types vs. 
DOCTYPE (was RE: ANNOUNCE: New XHTML WD) In-Reply-To: <000801be6022$f1fee140$d3228018@jabr.ne.mediaone.net> References: <00ca01be6021$4f704e60$2eaddccf@ix.netcom.com> <000801be6022$f1fee140$d3228018@jabr.ne.mediaone.net> Message-ID: <14036.21688.228887.515457@localhost.localdomain> Jonathan Borden writes: > excellent. one point: there is no reason to define text/xhtml as > opposed to using text/xml and inserting a DOCTYPE or perhaps a > default xmlns definition. If a user-agent needs to know the DOCTYPE > ... look at it! Unfortunately, that doesn't work at all -- all DOCTYPE gives me is the name of the root element, optionally accompanied by an internal DTD subset and identifiers for an external DTD subset. The name of the root element is locally-scoped to the document itself, so it's useless for type discovery (what if my document type and yours both use "article" as the name of the root element?); the public identifier (or the system identifier if it is an absolute URI) can uniquely identify the entity containing the external DTD subset but not the document type itself. Both namespaces and architectural forms provide the means for uniquely identifying the types of at least parts of a document (specific element and attribute types for namespaces, specific architectural views for AFs), but why should a client have to go to all that trouble? Isn't it easier to identify the resource type externally so that it can be handed directly to the correct processor? All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
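To make the root-name collision concrete, here is a small sketch in Python with the standard xml.dom.minidom module. The two documents and their public and system identifiers are made up for illustration. It pulls out everything a DOCTYPE declaration offers a client, namely the root element name and the external-subset identifiers:

```python
from xml.dom import minidom

# Two unrelated vocabularies that happen to share a root element name.
# The public and system identifiers below are invented for this example.
mine = ('<!DOCTYPE article PUBLIC "-//Example A//DTD Article//EN" '
        '"http://a.example/article.dtd"><article/>')
yours = ('<!DOCTYPE article PUBLIC "-//Example B//DTD Article//EN" '
         '"http://b.example/article.dtd"><article/>')

def doctype_info(text):
    """Return (root name, public id, system id) from the DOCTYPE declaration."""
    dt = minidom.parseString(text).doctype
    return dt.name, dt.publicId, dt.systemId

info_mine = doctype_info(mine)
info_yours = doctype_info(yours)
print(info_mine)
print(info_yours)
```

Both documents report "article" as the root name, so the name alone cannot tell them apart, and the identifiers only name the entity holding the external DTD subset. Note also that the client had to fetch and parse the document before it could learn even this much.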
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jmg at trivida.com Wed Feb 24 19:54:46 1999 From: jmg at trivida.com (Jeff Greif) Date: Mon Jun 7 17:09:25 2004 Subject: Question about New XHTML WD Message-ID: <041501be602f$34f84d90$a24630d1@greif.trivida.com> The minimal XHTML document used as an example in the WD is missing the prolog PI Is this somehow accounted for by specifying the MIME type as text/xhtml? Or is this an oversight? Jeff -----Original Message----- >> > > >> > > >> > > The new working draft for XHTML is available at >> > > >> > > http://www.w3.org/TR/1999/WD-html-in-xml-19990224/ >> > > >> > > Frank >> > > Frank Boumphrey >> > > >> > >> > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 24 19:58:07 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:25 2004 Subject: Documents and Document Fragments Message-ID: > -----Original Message----- > From: Tim.Shaw@wdr.com [SMTP:Tim.Shaw@wdr.com] I maybe don't quite understand this - my apologies if I'm missing something. > A document fragment is a lightweight document (as defined by W3C > REC-DOM-Level-1-19981001). These fragments may be used for numerous > purposes - > including creating (by 'insertion') other documents.
> > A document fragment need not be WF - but (presumably) they must > represent at > least one type construct as they are also Nodes > > In terms of the DOM-view of fragments, it doesn't even have to have because the fragment could just be text. It must be a well-formed entity, although it needn't be a well-formed document (like if it's just text.) In terms of an external view of fragments - say TR/NOTE-XML-FRAG-REQ - then conformance is to XML 1.0 [43] - which also could be just CharData, and again, the fragment must be a well-formed entity, but not necessarily a document. I separate the two views though, because I feel the terms represent two different concepts, in their separate contexts, as I'll try to explain. The discussion of 'document fragments' in say TR/NOTE-XML-FRAG-REQ talks about how to provide the context for a fragment. Most of the discussion on this list so far has talked about fragments as things like: Here is the news As I said in a previous message, I can see value in this definition once you are inside the parser - or from the point of view of the DOM - because it means you can refer to subsets of the document, move things around, and so on. But inside the DOM you have context; the fragment has parent information, you have the DTD to which it conforms, and so on. The fragment never exists 'independently' of its owning document. But what if that fragment wants to venture out into the big wide world of the Internet. Your news item wants to appear in my list of stories of the day. Then in implementation, I think you have to end up back at the notion of a document. I can't see how you could pass around a fragment between applications without something like: http://www.iedigital.net http://www.iedigital.net/newsfeed.dtd Here is the news OK, I made all this up, but what I'm showing is how the context in which 'Here is the news' appears on the news server has been passed to the receiving application. 
Now I can have a DOM for my news page, request your news item, create a DocumentFragment as defined in DOM from what I receive, and then insert it into my news document. > Surely it's down to the parser as to whether you can access these > things before > the XML document has been fully parsed (and they are _not_ Valid XML > 1.0 > documents(?)). The parser can still be XML 1.0 conformant - but it > would need to > provide non-conformant interfaces to allow interim access. > > I think this is talking about different things. Of course if you write a parser you can do anything you want with the data, but what I'm banging on about is whether you are keeping within the principles of XML when you start to provide 'non-conformant' interfaces, 'interim access' and such like. It seems to me that the notion of a document that has been confirmed to be well-formed - and possibly validated - is a fundamental unit of what we call XML. Beyond that, applications, features, enhancements, etc., should be considering implementation on top of that unit, not smashing it. I just can't get with the notion of something that's 'XML 1.0 conformant' that can process non-well-formed documents. So, to summarise; I think 'fragments' is a useful term when inside the parser, because there we can deal with real fragments. However, we cannot pass a parser anything less than an XML document (without going against XML 1.0) and so although outside the parser a 'fragment' is a useful abstraction, in reality one cannot exist outside a well-formed document. Yours, probably a bit pedantically, Mark xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Feb 24 20:06:37 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:25 2004 Subject: Multiple namespaces for HTML... why? Message-ID: <3.0.32.19990224120504.00bc77d0@pop.intergate.bc.ca> At 01:08 PM 2/24/99 -0500, Frank Boumphrey wrote: >The new working draft for XHTML is available at > >http://www.w3.org/TR/1999/WD-html-in-xml-19990224/ What I find weird is having three namespaces for HTML. The costs of this are obvious - every programmer who wants to, for example, process HTML or
elements is going to have to check each one against three namespaces? I assume there must be some benefits to having 3 namespaces for HTML, but they're not self-apparent. -Tim xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tbray at textuality.com Wed Feb 24 20:09:51 1999 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 17:09:25 2004 Subject: MIME types vs. DOCTYPE (was RE: ANNOUNCE: New XHTML WD) Message-ID: <3.0.32.19990224120838.00b8e8b0@pop.intergate.bc.ca> At 02:49 PM 2/24/99 -0500, David Megginson wrote: >Jonathan Borden writes: > > > excellent. one point: there is no reason to define text/xhtml as > > opposed to using text/xml and inserting a DOCTYPE or perhaps a > > default xmlns definition. If a user-agent needs to know the DOCTYPE > > ... look at it! > >Unfortunately, that doesn't work at all -- all DOCTYPE gives me is the >name of the root element, optionally accompanied by an internal DTD >subset and identifiers for an external DTD subset. Right; as many have pointed out, in both SGML and XML the The name of the root element is locally-scoped to the document itself, Yes, but what if it wasn't? It just dawned on me that if you had *two* header parameters for text/html, one being the namespace URI of the root, the other being its type, that would really give you a lot of help in identifying what kind of thing this is. E.g., if the namespace URI is http://www.w3.org/html40 (or whatever they decide to use) and the root type is , well, you know pretty well what you're dealing with. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From pgrosso at arbortext.com Wed Feb 24 20:35:02 1999 From: pgrosso at arbortext.com (Paul Grosso) Date: Mon Jun 7 17:09:25 2004 Subject: Documents and Document Fragments Message-ID: <3.0.32.19990224143412.00cd5868@pophost.arbortext.com> At 11:17 1999 02 24 -0800, Lauren Wood wrote: >Tim.Shaw@wdr.com wrote: > >> A document fragment is a lightweight document (as defined by W3C >> REC-DOM-Level-1-19981001). These fragments may be used for numerous purposes - >> including creating (by 'insertion') other documents. >> >> A document fragment need not be WF - but (presumably) they must represent at >> least one type construct as they are also Nodes > >No; the DOM spec defines its document fragment as having the same >potential content as a parsed entity, i.e. some mixture of text, >elements, comments, PIs, etc. It does not need to have a root >element, or indeed any elements at all. It could just consist of one >comment, or some text. Just a point of clarification: the DOM defines a DOM document object called DocumentFragment [1], it does not really give a meaning to the general phrase "document fragment" outside the world of DOM objects, structures, and APIs. The XML Fragment WG is doing something in this area. See the XML Activity Statement [2] for a brief mention, the WG's Requirements Document [3], the WG Group page [4] if you are a W3C Member, and watch xml-dev for an announcement of a first public draft of a spec very soon now. 
paul [1] http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-B63ED1A3 [2] http://www.w3.org/XML/Activity.html#fragment-wg [3] http://www.w3.org/TR/NOTE-XML-FRAG-REQ [4] http://www.w3.org/XML/Group/Fragments.html xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From eisen at pobox.com Wed Feb 24 20:36:03 1999 From: eisen at pobox.com (Jonathan Eisenzopf) Date: Mon Jun 7 17:09:25 2004 Subject: having to deal with mal-formed XML References: Message-ID: <36D462FE.E905140F@pobox.com> Chris Weikart wrote: > Well of course you're both right. And being right is important for the > evolution of XML. But it's useless for me and what I need to do, here and > now, in the short-term, in which I must operate. > > My time frame completely precludes berating the publishers of bad XML. So I > am pursuing an imperfect, engineering solution to the problem. The current > bag of fixes has a negligible impact on my performance, since I'm > bottlenecked on other, far more expensive processes. And it covers 100% of > the errors I've found so far, in 3K URLs, so I'd guess it'll cover over 90% > of what I eventually find - in the ASX subset of XML. Therefore, as an > engineering solution, it's a very good one. > > I should mention, btw, that I could find no Microsoft advertising to the > effect that ASX V3 is XML. I looked at it and decided that they based it on > XML. Most of it parses as XML. Ultimately my use of fixups and XML::Parser > is far better than writing a specialised ASX parser because (a) it produces > quite acceptable results for (b) far less effort. 
> > I could rant and rave about Microsoft (believe me, I do ;-), and tell you > more about the engineering tradeoffs I'm balancing. But ultimately, I seem > to have posted to the wrong list. Thanks for your responses, and sorry to > have wasted your time! > This is an interesting and important thread. Most of us on this list know the basic XML rules, that they are strict, and must be enforced. On the other hand, the Desperate Perl (or XML) Hacker will have to deal with mal-formed or XML-like formats that they have no control over. So what to do? 1. write a script to make it well-formed XML 2. write a non-XML parser 3. send the content author a nasty-gram telling him where he can stick his cruddy-malformed-nonXML-to-impress-the-boss-format-shoulda-RTFM(past tense) While most of us prefer option #3, doing so would probably not win us a customer relations prize. My recommendation would be #1 if it's possible. #2 is ok if the format is simple and it would take less work than #1. In summary, Chrisyoudidtherightthingdon'thateusbecausewe'reanalretentive BTW, I recently ran into this problem and took option #2. It's not optimal, but I had to get the job done, like Chris. I put it into an article at: http://www.webreference.com/perl. Looking back, I probably would have done it differently, but hey, it's working. Fortunately, since I wrote the app that generates the crappy XML last spring, and the client and article that parse the crappy XML last summer, I will soon have the opportunity to fix the crappy XML and realign the rift in the space-time continuum. Jonathan. xml-dev: A list for W3C XML Developers.
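For what it's worth, option #1 might look roughly like the sketch below, written in Python rather than Perl. The thread does not say which errors Chris's fixes actually address, so the unescaped ampersand and the sample input are assumptions picked purely for illustration. The idea is a cheap textual fix-up pass followed by an ordinary conforming parser:

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that does not start a character or entity reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")

def fix_up(text):
    """Escape bare ampersands so the text can be handed to a real XML parser."""
    return BARE_AMP.sub("&amp;", text)

# Hypothetical mal-formed input; the element names are invented.
raw = "<asx><title>Tom & Jerry</title></asx>"

try:
    ET.fromstring(raw)
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False          # the strict parser rejects the raw text

title = ET.fromstring(fix_up(raw)).findtext("title")
print(parsed_raw, title)
```

The parser itself stays strict; only the pre-processing step knows about the publisher's mistakes, so each new kind of error means one more fix-up rule rather than a hand-written parser.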
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Wed Feb 24 20:43:05 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:25 2004 Subject: Streaming XML and SAX References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> Message-ID: <36D46419.73F63780@thinlink.com> David Megginson wrote: > 1. Use a non-XML mechanism for separating XML packets -- that way, > there's not a tight dependency between the stream-handler and the > parser (the stream handler knows the bounds of each packet without > doing any XML parsing). What bit sequence would you use as a separator and how would you ensure that no conceivable encoding would produce it spuriously? > 2. Separate information about the packages from the packets > themselves. The information could be linear, or it could itself be > XML packets of a different sort. You should not have to parse an > entire packet to know its sequencing, etc. How could you terminate a document with another doc element? The only thing allowed after all legitimate Misc markup at the end of a document is more Misc markup. > Putting a PI in the XML packet itself seems a little awkward to me. How about thinking of it as a "network-ready" document? Or if you like, explicitly define a "packet" as such a document. Tom Harding xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From James.Anderson at mecomnet.de Wed Feb 24 20:48:57 1999 From: James.Anderson at mecomnet.de (james anderson) Date: Mon Jun 7 17:09:25 2004 Subject: Multiple namespaces for HTML... why? References: <3.0.32.19990224120504.00bc77d0@pop.intergate.bc.ca> Message-ID: <36D4670A.C13FC239@mecomnet.de> there were some notes on this topic back several weeks ago. my reading was that, for certain element types - those which would otherwise share a name, but are declared with different content models, the desire was to distinguish the names by virtue of their namespace. those elements which do have the same model are, on the other hand, an argument against this. from some viewpoints, these latter are an argument for the ability to inherit names among namespaces, but that's another story. the same issue may reappear once folks start dealing with versioning. Tim Bray wrote: > > At 01:08 PM 2/24/99 -0500, Frank Boumphrey wrote: > >The new working draft for XHTML is available at > > > >http://www.w3.org/TR/1999/WD-html-in-xml-19990224/ > > What I find weird is having three namespaces for HTML. The costs of > this are obvious - every programmer who wants to, for example, process > HTML or elements is going to have to check each one against > three namespaces? > > I assume there must be some benefits to having 3 namespaces for > HTML, but they're not self-apparent. -Tim xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Wed Feb 24 20:52:37 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:25 2004 Subject: Streams, protocols, documents and fragments References: Message-ID: <36D46640.94081620@thinlink.com> Mark Birbeck wrote: > You know when you've reached the end > by the closing tag. That's it! There's the pesky issue of the document "epilog" which is Misc markup that can follow the closing tag. If the XML Rec were changed to disallow this then I would be in complete agreement. Tom Harding xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From clark.evans at manhattanproject.com Wed Feb 24 20:53:50 1999 From: clark.evans at manhattanproject.com (Clark Evans) Date: Mon Jun 7 17:09:25 2004 Subject: ANNOUNCE: New XHTML WD References: <002d01be602c$a7714b40$2eaddccf@ix.netcom.com> Message-ID: <36D46590.F23911AD@manhattanproject.com> The problem with using a specific text/xhtml mime is that it sets a dangerous precedent, and could lead to a potential explosion of mime types. However some kind of negotiation must be performed. If the mime type was something like: 'text/xml.html' so that the '.html' represented the xml dtd, I feel the precedent would not be _that_ bad. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Wed Feb 24 21:18:28 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:26 2004 Subject: ANNOUNCE: New XHTML WD In-Reply-To: <36D46590.F23911AD@manhattanproject.com> References: <002d01be602c$a7714b40$2eaddccf@ix.netcom.com> Message-ID: <199902242118.QAA06218@hesketh.net> At 08:48 PM 2/24/99 +0000, Clark Evans wrote: > > > > > The problem with using a specific text/xhtml mime is that it sets a > dangerous precedent, and could lead to a potential explosion of mime types. > However some kind of negotiation must be performed. > > > > > > If the mime type was something like: 'text/xml.html' so that the > '.html' represented the xml dtd, I feel the precedent would > not be _that_ bad. > > Er... too bad folks weren't more enthusiastic about xml/whatever. xml/html would've gotten us out of this one nicely. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 24 21:35:57 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:26 2004 Subject: Streaming XML and SAX In-Reply-To: <36D46419.73F63780@thinlink.com> References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> Message-ID: <14036.28216.379328.364771@localhost.localdomain> Tom Harding writes: > David Megginson wrote: > > > 1. Use a non-XML mechanism for separating XML packets -- that > > way, there's not a tight dependency between the stream-handler > > and the parser (the stream handler knows the bounds of each > > packet without doing any XML parsing). > > What bit sequence would you use as a separator and how would you > ensure that no conceivable encoding would produce it spuriously? I'm talking about characters, not bit sequences. For a simple solution, you should provide the entire stream in the same character encoding (remember that a transport protocol is allowed to override the encoding in the XML declaration or encoding declaration). Otherwise, the packets will need to be escaped somehow. > > 2. Separate information about the packages from the packets > > themselves. The information could be linear, or it could > > itself be XML packets of a different sort. You should not > > have to parse an entire packet to know its sequencing, etc. > > How could you terminate a document with another doc element? The > only thing allowed after all legitimate Misc markup at the end of a > document is more Misc markup. 
But you're not performing XML parsing at all until you take the stream apart first -- in other words, all the XML parser sees is the part between the separator characters. This is the kind of layered approach that makes for simple, maintainable systems. > > Putting a PI in the XML packet itself seems a little awkward to me. > > How about thinking of it as a "network-ready" document? Or if you > like, explicitly define a "packet" as such a document. No, it still looks like a messy architecture to me, because the transport layer has to know about the packets -- it has to parse the XML to get information about what it's looking at, and that adds complexity and inefficiency. A clean architecture should separate the layers completely, and use XML only where it has an obvious advantage over other approaches. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 24 21:38:00 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:26 2004 Subject: Streams, protocols, documents and fragments was: RE: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) Message-ID: James Tauber wrote: > Mark, what are your views on the W3C's activity on XML > Fragments? Do I infer > correctly that you disagree with the need? Not at all James - I just think that different things become issues at different levels.
I believe that everything we need to get (logical) documents out of a database, or from a file, to a parser, where - as you rightly corrected my previous post - that document becomes a 'physical' document, is already in XML 1.0. i.e., file -> parser, or database -> stream -> parser, or whatever other things people come up with. BUT, once you're 'inside' the parser, other issues arise - linking, filtering, fragments, etc. I think it's important to keep the layers separate to ensure the purity of XML as effectively an 'interface'. Mark PS Sorry for the delay. This has been sitting here for a day - I didn't press SEND! xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 24 21:39:14 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:26 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: <36D46640.94081620@thinlink.com> References: <36D46640.94081620@thinlink.com> Message-ID: <14036.28838.719355.44002@localhost.localdomain> Tom Harding writes: > Mark Birbeck wrote: > > > You know when you've reached the end by the closing tag. That's > > it! > > There's the pesky issue of the document "epilog" which is Misc > markup that can follow the closing tag. If the XML Rec were > changed to disallow this then I would be in complete agreement. 
More importantly, you don't want to have to parse an entire document just to find out where it ends because that forces your system into linear processing -- on a busy server, it is absolutely necessary to be able to isolate the documents/packets quickly and pass them off to separate threads (or even separate boxes) for parsing and processing. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Wed Feb 24 21:40:23 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:26 2004 Subject: ANNOUNCE: New XHTML WD In-Reply-To: <199902242118.QAA06218@hesketh.net> References: <002d01be602c$a7714b40$2eaddccf@ix.netcom.com> <36D46590.F23911AD@manhattanproject.com> <199902242118.QAA06218@hesketh.net> Message-ID: <14036.28972.319447.333273@localhost.localdomain> Simon St.Laurent writes: > Er... too bad folks weren't more enthusiastic about xml/whatever. > xml/html would've gotten us out of this one nicely. At least until we started getting further variants of things: xml/rdf/webstuff xml/rdf/ecommerce etc. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers.
From wunder at infoseek.com Wed Feb 24 21:52:05 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun 7 17:09:26 2004
Subject: Streaming XML and SAX
In-Reply-To: <36D46419.73F63780@thinlink.com>
References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain>
Message-ID: <3.0.5.32.19990224134311.00b8d1a0@corp>

At 12:42 PM 2/24/99 -0800, Tom Harding wrote:
>David Megginson wrote:
>
>> 1. Use a non-XML mechanism for separating XML packets -- that way,
>> there's not a tight dependency between the stream-handler and the
>> parser (the stream handler knows the bounds of each packet without
>> doing any XML parsing).
>
>What bit sequence would you use as a separator and how would you ensure
>that no conceivable encoding would produce it spuriously?

This may be going out on a limb, but what about:

200 OK HTTP/1.1
Content-type: text/xml
Content-length: ...

Nah, it'll never work.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946
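The Content-length suggestion above amounts to length-prefixed framing: the stream handler reads a small header block, learns the byte count, and hands off exactly that many bytes without doing any XML parsing. A minimal sketch in Python follows, assuming a simplified header syntax (one "Name: value" pair per line, blank line before the body); the function name read_framed is hypothetical and this is not a full HTTP implementation.

```python
import io


def read_framed(stream):
    """Yield packet bodies from a binary stream of header-framed packets.

    Assumption: each packet is a block of "Name: value" header lines,
    a blank line, then exactly Content-Length bytes of body.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # clean end of stream between packets
        length = None
        # Read header lines until the blank separator line (or EOF).
        while line not in (b"\r\n", b"\n", b""):
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
            line = stream.readline()
        if length is None:
            raise ValueError("packet has no Content-Length header")
        body = stream.read(length)
        if len(body) != length:
            raise ValueError("stream ended inside a packet")
        yield body  # the body can now go to any thread or box for parsing


raw = (b"Content-Type: text/xml\r\nContent-Length: 4\r\n\r\n<a/>"
       b"Content-Length: 8\r\n\r\n<b>x</b>")
packets = list(read_framed(io.BytesIO(raw)))
```

Because the boundary is found by counting bytes, the question of a separator sequence that no encoding could produce spuriously does not arise.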
From cowan at locke.ccil.org Wed Feb 24 21:56:37 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:26 2004
Subject: Question about New XHTML WD
References: <041501be602f$34f84d90$a24630d1@greif.trivida.com>
Message-ID: <36D474ED.9A772ACF@locke.ccil.org>

Jeff Greif wrote:

> The minimal XHTML document used as an example in the WD is missing the
> prolog PI
>

Actually, no. The XML declaration (it's not really a PI, though it looks rather like one) is not required unless the encoding is other than UTF-8 or UTF-16.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Wed Feb 24 22:16:12 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:26 2004 Subject: Streams, protocols, documents and fragments References: <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> Message-ID: <36D479F1.28D796D9@thinlink.com> David Megginson wrote: > More importantly, you don't want to have to parse an entire document > just to find out where it ends because that forces your system into > linear processing -- on a busy server, it is absolutely necessary to > be to isolate the documents/packets quickly and pass them off to > separate threads (or even separate boxes) for parsing and processing. Good point. I have been implicitly assuming this as a cost of moving the parsing function into the network infrastructure. However, a general-purpose endpoint implementation would have a hard time parallelizing in the way you describe because of possible inter-document dependencies in the application protocol. It has to deliver the documents to the next layer in the order in which they were sent. If parallelism is explicitly needed then a solution is to create multiple connections. xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Mark.Birbeck at iedigital.net Wed Feb 24 22:27:46 1999 From: Mark.Birbeck at iedigital.net (Mark Birbeck) Date: Mon Jun 7 17:09:26 2004 Subject: Streams, protocols, documents and fragments Message-ID: David Megginson wrote: > Tom Harding writes: > > > Mark Birbeck wrote: > > > > > You know when you've reached the end by the closing tag. That's > > > it! > > > > There's the pesky issue of the document "epilog" which is Misc > > markup that can follow the closing tag. If the XML Rec were > > changed to disallow this then I would be in complete agreement. But you do know that the *document* has ended. You may receive PIs, but aren't they directed at applications? If so, then if you are not expecting any why not ignore them? (And comments.) Then all you need to wait for is the next '' or More importantly, you don't want to have to parse an entire document > just to find out where it ends because that forces your system into > linear processing -- on a busy server, it is absolutely necessary to > be to isolate the documents/packets quickly and pass them off to > separate threads (or even separate boxes) for parsing and processing. You may be able to *parse* parts of the document before the entire document has completely arrived, but it is surely wrong to *process* the document because you don't yet know if it's well-formed or valid. Some of the parsing you did in one process might be invalidated as the result of another process. Regards, Mark xml-dev: A list for W3C XML Developers. 
From tgeer at sunset.net Wed Feb 24 22:45:01 1999
From: tgeer at sunset.net (Thomas Geer)
Date: Mon Jun 7 17:09:26 2004
Subject: Streams, protocols, documents and fragments
References: <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> <36D479F1.28D796D9@thinlink.com>
Message-ID: <36D480B1.7EFDAD9F@sunset.net>

I have been watching this debate for a while and thought I would interject. OMG has done some significant work on the metadata frontier for XML, with XMI and its support for MOF and UML. The point is the real need here for not only schema and syntax definition but also the ability to allocate a standard subset of processing instructions via an RDF(?). The Warwick framework and associated research is too document-centric, and I believe a lot of the parsers etc. for the future will be Java-based, so the natural binding to a platform-independent directory interface (JNDI-centric) seems somewhat logical. I just thought, as I notice a number of the more prominent XML authors on the group here, we'd begin to acknowledge the real need for controllable meta information.

Tom Harding wrote:

> David Megginson wrote:
>
> > More importantly, you don't want to have to parse an entire document
> > just to find out where it ends because that forces your system into
> > linear processing -- on a busy server, it is absolutely necessary to
> > be to isolate the documents/packets quickly and pass them off to
> > separate threads (or even separate boxes) for parsing and processing.
>
> Good point.
I have been implicitly assuming this as a cost of moving the parsing function into > the network infrastructure. However, a general-purpose endpoint implementation would have a > hard time parallelizing in the way you describe because of possible inter-document dependencies > in the application protocol. It has to deliver the documents to the next layer in the order in > which they were sent. If parallelism is explicitly needed then a solution is to create > multiple connections. > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From srn at techno.com Wed Feb 24 22:51:24 1999 From: srn at techno.com (Steven R. Newcomb) Date: Mon Jun 7 17:09:26 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: <000401be6015$6a2a2840$d3228018@jabr.ne.mediaone.net> (jborden@mediaone.net) References: <000401be6015$6a2a2840$d3228018@jabr.ne.mediaone.net> Message-ID: <199902242250.QAA03538@bruno.techno.com> [Jonathan Borden:] > ... this is basic stuff, but the point is to > emphasize that the distinction between what an > object 'does' and what an object 'is' is not so > clearcut. Actually, property sets make it very clearcut. 
Remember that property sets are not implementation descriptions, whereas UML models are. In property sets there are never any methods whatsoever. This point is emphasized by the fact that, in the grove paradigm, the information components are called "nodes" rather than "objects". If you choose to instantiate a grove as a collection of objects (as many reasonable people, including those at my own company, certainly would), that's OK, but the fundamental abstraction does not have the concept of methods. [Much good stuff from Jonathan Borden omitted, with all points taken.] > In fact, when you get out of the SGML/XML world, > the use of the terms 'property set' and 'grove' > get replaced by terms 'UML', 'persistence' and > 'object model'. What you promise that use of > property sets and grove plans will automate > processing of data and interoperability, CASE > tools vendors promise using UML. What is the > essence of the difference between an information > set and/or property set and/or grove plan versus > UML? I was hoping you would ask this question! Let me begin by oversimplifying: the difference is that you can do much more with UML, and that oversufficiency is precisely UML's deficiency in this problem-space. It is very difficult for people who have made their careers in *information processing* to perceive the virtue of making a complete distinction between processing and information. Even so, it's of paramount importance to make this distinction, if any of the following statements are true: * the information may outlast existing processing systems, * the information may have unforeseen uses in an ever-changing world, and * the information must be interchanged in an open, multivendor environment. Instead of encapsulating such information in methods, as objects often do, we need to encapsulate it in semantics, as XML can be used to do. 
Having rendered the information as XML, and having chosen appropriate semantic-bearing tags and other attributes for its various components, we now have the information in a totally useless but highly interchangeable form that can become input to any application for any purpose, including unforeseen purposes. For me, this useless but interchangeable XML form of the information is the form that is most deserving of its owner's respect. It is the owner's best choice of representation as the "maintained source code" of the information asset. It's the form that nobody but the information owner owns or controls. It's the form that no software vendor has a lock on. It's the form that (presumably) has everything needed to reconstitute a useful, application-ready form of the same information asset, regardless of the nature of that application, foreseen or unforeseen. Now let's consider how well-described this XML asset really is. After all, if the asset doesn't have a very accurate description, we can't be sure that unforeseen applications will find the information intelligible. With DTDs, we have a way to model the structural relationships of the elements to each other. But that's not enough to guarantee that the information will be understood in the manner that its architects and creators intended. With various proposed XML schema languages, we can impose lexical typing requirements and certain additional syntactic/structural requirements, but, again, that doesn't guarantee that the information will be understood in the manner that was intended. Neither the DTD nor the schema extensions so far proposed can tell us the information set that is supposed to be derivable from the XML form of the information asset. The information is still not described well enough to allow unforeseen applications, developed by unforeseeable developers, to use the information or to create new but similar information. 
All of the generic structural/syntactic validation in the world will not guarantee that! This is because the interchangeable form of the information is not the same as the useful form, which we will assume, for purposes of this discussion, is objects that conform to certain classes and have certain constellations of properties and relationships. Now the question becomes, "What defines the data, interrelationships, and semantics of those objects?" The ISO/SGML answer is, "A property set, designed as part of the interchange architecture, that defines the classes of objects that will reflect the quintessential information set conveyed by the resource." The object classes defined by a property set, and the node-objects in the groves that conform to those classes, are strictly the canonical, static *result* of the processing that is explicitly (but only conceptually) *required* to be done to all resources that conform to the architecture, before they are used by an application. Conceptually speaking, these "groves" fully respect the characteristics of the interchangeable resource that they represent, including the fact that an interchangeable resource has no methods, and there is nothing dynamic (or even useful) about it when it's in its XML form. A property set is an abstract model of the useful information that can be extracted from an interchangeable resource. There is nothing in a grove that isn't already in the corresponding resource. Property sets are designed to exactly reflect the characteristics of information that can be extracted from information resources. An intelligent person like yourself may remark, "Well, then, I guess the abstract properties of C++ notation must be very complex, because they can describe arbitrarily complex processes." You're right, they are, and the abstract properties of C++ notation can be modeled using the property set paradigm. 
(And modeling C++ notation would be an interesting exercise, although I'm not yet confident of commercial interest.) A property set for C++ notation might include node classes with such names as "variable name", "passed argument", "operator", "method", "object", "class definition", etc. So why bother with property sets, when UML is more powerful? * Because property sets impose the design discipline of focusing on what is being interchanged, rather than on what might be done by particular applications. They force you to focus on the precise nature of the "maintained source code" of the information. They force you to think more abstractly, which can be uncomfortable but is often very worthwhile. They force you to recognize that interchangeable information cannot modify itself, and has no built-in methods. * Because property sets are designed to support the addressing of arbitrary components of information, and their nature imposes the discipline of designing for various forms of addressing. Everything that is modeled in a property set can become a node in a grove, and everything that can become a node in a grove is predictably and reproducibly addressable. This means that addresses created and recorded by one application will be understandable and correctly resolvable by other applications. This is the key to the solution of the general hyperlinking problem. If, for example, we're addressing some node by counting other nodes, all of the counted nodes must exist, at least conceptually. > Don't get me wrong, I think the work on > information sets, property sets and groves is > terrific and needs to be continued. One way to > do this is to turn our heads sideways ever so > often to see what collegues in the distributed > object world are doing. These problems are > universal. Very true. But information interchange is a funny thing. XML does not proceed from the study of computer programming. It comes from another direction, and it's a different problem space. 
Portable-software-ology is a specialized subdomain of, and not the same thing as, portable-information-ology. (I sure wouldn't want to try to support portable information without portable software, though!) At the risk of confusing the reader, let me add that the property set syntax is just one syntax for doing what property sets do, albeit the ISO standard one for doing it. The claim has been made by Eliot Kimber that the STEP schema language, EXPRESS, would do as well or better. I think he's probably right. EXPRESS, however, is a more powerful language that is more demanding to learn. By contrast, the property set syntax is defined as an SGML or XML DTD, and a small and simple one at that. -Steve -- Steven R. Newcomb, President, TechnoTeacher, Inc. srn@techno.com http://www.techno.com ftp.techno.com voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137) fax +1 972 994 0087 (at ISOGEN: +1 214 953 3152) 3615 Tanner Lane Richardson, Texas 75082-2618 USA xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tallen at sonic.net Wed Feb 24 23:36:58 1999 From: tallen at sonic.net (Terry Allen) Date: Mon Jun 7 17:09:26 2004 Subject: RE UML vs groves (Was Streaming XML) Message-ID: <199902242336.PAA19021@bolt.sonic.net> Steve Newcomb wrote (inter alia): | | Actually, property sets make it very clearcut. | Remember that property sets are not implementation | descriptions, whereas UML models are. 
I've been nibbling at UML for the past six months, and it came up at last week's Open Forum on Metadata Registries (the Open Forum was about ISO 11179, which I think OASIS may want to use for its Registry and Repository activity - that's why I was there).

I agree that UML models are often implementation descriptions, but it's not obvious to me that they always are. For example, ANSI X3.285, which is proposed to replace the present Part 3 of ISO 11179 (which contains most of the semantics relevant to modelling a data dictionary), seems entirely abstract. They even call it a metamodel. And it's illustrated with UML diagrams. Am I missing something?

(I am not addressing and perhaps not even interested in the main topic of the discussion, just in being sure I understand UML, XMI, and the MOF.)

regards, Terry

Terry Allen                    Commerce One, Inc.
Business Language Designer     1600 Riviera Ave., Suite 200
Advanced Technology Group      Walnut Creek, Calif., 94596
tallen[at]sonic.net
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ken at bitsko.slc.ut.us Thu Feb 25 00:06:13 1999 From: ken at bitsko.slc.ut.us (Ken MacLeod) Date: Mon Jun 7 17:09:26 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: "Matthew Sergeant's message of Wed, 24 Feb 1999 18:28:01 +0100 References: <5F052F2A01FBD11184F00008C7A4A800022A164D@eukbant101.ericsson.se> Message-ID: "Matthew Sergeant (EML)" writes: > Perl's XML::Parser (and by logic I assume this comes from expat) > already does this - you can attach an IO handle (which can be an > IO::Socket if you want it to come from a socket) to the XML parser, > tell it what the stream delimiter is (e.g. ^L) and leave it merrily > parsing XML. You have to tell it to parse again when it stops, > because that's how it says "I've reached the end of one stream" - > but it works fine. This is probably true of other languages that use > expat to parse XML for them. I had thought that was in expat too but looking deeper I found that it's being done by XML::Parser and not expat itself. -- Ken MacLeod ken@bitsko.slc.ut.us xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ken at bitsko.slc.ut.us Thu Feb 25 00:19:09 1999 From: ken at bitsko.slc.ut.us (Ken MacLeod) Date: Mon Jun 7 17:09:26 2004 Subject: MIME types vs. 
DOCTYPE (was RE: ANNOUNCE: New XHTML WD)
In-Reply-To: Tim Bray's message of Wed, 24 Feb 1999 12:08:43 -0800
References: <3.0.32.19990224120838.00b8e8b0@pop.intergate.bc.ca>
Message-ID:

Tim Bray writes:

> At 02:49 PM 2/24/99 -0500, David Megginson wrote:
> >Jonathan Borden writes:
> >
> > > excellent. one point: there is no reason to define text/xhtml as
> > > opposed to using text/xml and inserting a DOCTYPE or perhaps a
> > > default xmlns definition. If a user-agent needs to know the
> > > DOCTYPE ... look at it!
> >The name of the root element is locally-scoped to the document itself,
>
> Yes, but what if it wasn't? It just dawned on me that if you had
> *two* header parameters for text/html, one being the namespace URI of
> the root, the other being its type, that would really give you a lot
> of help in identifying what kind of thing this is.
>
> E.g., if the namespace URI is http://www.w3.org/html40 (or whatever
> they decide to use) and the root type is <html>, well, you know
> pretty well what you're dealing with.

You may have meant this, but it wasn't blindingly obvious to me :-). Do you mean something like:

Content-Type: text/html; namespace-uri="http://www.w3.org/html40"; root-type=html
Content-Type: text/xml; public-id="[public-id]"; root-type="[root-type]"

--
Ken MacLeod
ken@bitsko.slc.ut.us

From b.laforge at jxml.com Thu Feb 25 00:55:18 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:27 2004
Subject: MIME types vs.
DOCTYPE (was RE: ANNOUNCE: New XHTML WD)
Message-ID: <004b01be6058$c72609e0$c9a8a8c0@thing2>

From: Tim Bray
>At 02:49 PM 2/24/99 -0500, David Megginson wrote:
>>The name of the root element is locally-scoped to the document itself,
>
>Yes, but what if it wasn't? It just dawned on me that if you had
>*two* header parameters for text/html, one being the namespace URI of
>the root, the other being its type, that would really give you a lot
>of help in identifying what kind of thing this is.
>
>E.g., if the namespace URI is http://www.w3.org/html40 (or whatever
>they decide to use) and the root type is <html>, well, you know
>pretty well what you're dealing with.

This is exactly the direction we're moving with MDSAX. Events prior to the root element (PIs) are queued. The root element then dictates how the document will be processed.

All we really need to do here is extend the documentRouter to be namespace-aware. This is work in progress, and it seems reasonable to say that you can expect to see it in the MDSAX1.0 production release.

Bill

From jborden at mediaone.net Thu Feb 25 01:04:31 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun 7 17:09:27 2004
Subject: MIME types vs. DOCTYPE (was RE: ANNOUNCE: New XHTML WD)
In-Reply-To: <14036.21688.228887.515457@localhost.localdomain>
Message-ID: <000901be605a$10281ce0$d3228018@jabr.ne.mediaone.net>

David Megginson wrote:
>
>
> Isn't it easier to identify the resource type externally so
> that it can be handed directly to the correct processor?
>
>

Assuming that HTML is defined in XML, then isn't the correct processor the XML processor? text/xml correctly identifies the content-type. If you make an exception for the specific XHTML DTD, then why not for every DTD?

The argument for text/xhtml on content-negotiation grounds is a shaky one, because content negotiation is a well-known problem for HTTP. Proposed solutions include RFC 2295. A better solution is to employ specific request/response headers, e.g.

Content-Type: text/xml
Content-Document-Type: http://www.w3.org/html50.dtd

or,

Content-Type: text/xml; document-type=http://www.w3.org/html50.dtd; charset=us

The problem with content-type proliferation is that lots of software depends on known content-types. For example, how can you programmatically tell if a MIME message body contains XML? Parse it and if it succeeds then TRUE? It's a lot easier to add a new header recognized by new UAs than it is to modify legacy and currently working code.

Jonathan Borden
http://jabr.ne.mediaone.net

From Marc.McDonald at Design-Intelligence.com Thu Feb 25 01:10:08 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun 7 17:09:27 2004
Subject: Streams, protocols, documents and fragments
Message-ID:

Instead of separation characters, I would just label the fragments (borrowed from XML linking):

MSFT 1000 ICI 1010

The xf:fragment element identifies each fragment in terms of its location in the entire document.
In this case it assumes a document structured as ..* where the prices desired are the 1700th and 1327th price entries. It may be a bit verbose, but allows a document tree to be transmitted piecemeal and reassembled. Access to any node not yet downloaded could be requested (for instance if stockPrice 1553 were desired). The reassembled tree caches the subset of elements that an application is interested in, but has all of the holes to access additional elements. Internet protocols are supposed to ensure an error free transmission. Any ordering problems are resolved by the location description. Marc B McDonald Principal Software Scientist Design Intelligence, Inc www.design-intelligence.com ---------- From: Mark Birbeck [SMTP:Mark.Birbeck@iedigital.net] Sent: Wednesday, February 24, 1999 7:18 AM To: xml-dev list Subject: RE: Streams, protocols, documents and fragments > From: Borden, Jonathan [SMTP:jborden@mediaone.net] > My sole purpose in discussing 'document > fragments' was because the thread had gotten stuck on the notion that > a > continuous XML stream would contain a single long document (perhaps > w/o a > closing tag) and the actual PDU's consist of document fragments ... > the > point is that if we create a protocol on a stream which transmitts > multiple > documents, there is no loss of functionality over a solution employing > 'document fragments' > I agree with this. And the point I was trying to get to was that therefore we don't need to introduce loads of terms on top of XML 1.0 to understand the concepts. I still think all of this is being over-complicated - but then maybe I'm the one who's missing something, so let's see. I don't follow why so many suggestions to resolving this problem involve stepping 'outside of' XML 1.0. 
We have suggestions for sync characters like ^C and ^L, we have the proposal that XML 1.0 should be fundamentally altered to allow the concept of a 'not well-formed' document (or one that may *become* well-formed at some point in the future), we have proposals for documents that contain subsets of validity. All of these suggestions seem to go against the grain of what XML is about. XML 1.0 already copes with streams and files. A physical XML document is a linear sequence of characters conforming to certain rules. You can't tell whether those rules have been met until you have received the entire sequence of characters. You know when you've reached the end by the closing tag. That's it! There's not much else you can do about it, because that's what XML is all about - well-formed, possibly validated documents conforming to certain rules. Now, the fact that the beginning and end of this sequence of characters may be presented to the parser eight hours apart is to me an application problem. If someone has a document that takes eight hours to arrive then maybe they should re-think how they're setting the system up. If it's a massive document that can only be processed in its entirety, and if any part fails to arrive the whole document fails, then sure, you have to go ahead and send it over eight hours. But the stock ticker example is not like this. If I miss the stock price for Microsoft at 11am, then I can still make use of the stock price for Microsoft at 11.20am. It will affect my historical archives, but at least I have something to display. It is not an 'all or nothing' situation. So, accepting for a moment that we should transmit many documents throughout the day, rather than one big one, it leaves the question of demarcation. And here I'm surprised that people want to step outside of XML to find a solution. 
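The point made above, that a receiver knows a document has ended when the root element's closing tag arrives, can be sketched with an incremental parser. The following Python sketch uses the standard xml.parsers.expat module and simply counts element depth; the function name root_closed_after and the sample element names are assumptions for illustration only. Note that it still parses every byte, so it does nothing for the objection that boundaries should be found without parsing.

```python
import xml.parsers.expat


def root_closed_after(chunks):
    """Return the 1-based index of the chunk in which the root element
    closes, or None if the root is still open when the chunks run out."""
    parser = xml.parsers.expat.ParserCreate()
    state = {"depth": 0, "closed": False}

    def start(name, attrs):
        state["depth"] += 1

    def end(name):
        state["depth"] -= 1
        if state["depth"] == 0:
            state["closed"] = True  # closing tag of the root element seen

    parser.StartElementHandler = start
    parser.EndElementHandler = end

    for index, chunk in enumerate(chunks, 1):
        parser.Parse(chunk, False)  # False: more data may follow
        if state["closed"]:
            return index
    return None


arrived = root_closed_after([
    "<stockPrice><symbol>MSFT</symbol>",
    "<price>1000</price></stockPrice>",
])
still_open = root_closed_after(["<stockPrice><symbol>", "MSFT</symbol>"])
```

The sketch stops feeding the parser as soon as the root closes, so any comments or PIs trailing the root element are left for the application to ignore or handle.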
Say we send the following: ^L MSFT 1000 ^L ICI 1010 ^L If the data link is 100% reliable then we have encoded redundant information because the document name - the element for stockPrice - already tells us where one starts and ends. So, we don't need the ^L. But if the data link *isn't* reliable then adding a few ^L characters doesn't help a lot, because if we lose the following sequence we have no way of knowing: 1000 ^L ICI If this sequence is taken out of the above two documents then you now have the wrong price for Microsoft and nothing for ICI, and your application is none the wiser. I think if 100% data reliability is required then we need a few streaming-related attributes that we can add to our documents, such as: MSFT 1000 ICI 1010 These would be added by a 'sending' application as a separate layer to the original document generation, and would allow the receiving application to process all the 'streamns' packets before actually processing the nodes - say, storing or displaying the stock prices. You could remove 'invalid' nodes from the tree (well-formed at the XML level, but with the wrong packet ID), and then while your main application is getting on and acting on the stock data, the receiving process could be re-requesting the lost data. In the illustration above, after losing the packet, we would now have: MSFT 1010 <--- error here and the 'streamns' processing would spot and re-request the missing data easily (both packet 55 and packet 56). To be honest, I'm not suggesting what I've said here as some new standard. There are lots of ways what I've described could be achieved, for example: MSFT 1000 ICI 1010 takes up less space, and would still spot the same errors. I'm just trying to illustrate how solutions can be found that don't involve smashing XML 1.0 to bits. At the end of the day this is an application problem, not an XML one. Regards, Mark xml-dev: A list for W3C XML Developers. 
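The packet-numbering idea in the message above can be sketched as a receiver-side gap check. The original attribute names did not survive in this archive, so the sketch below assumes a hypothetical seq attribute on each document's root element; the function name missing_packets is likewise an assumption. A document that fails to parse is treated as lost, which is one possible policy among several.

```python
import xml.etree.ElementTree as ET


def missing_packets(documents, attr="seq"):
    """Return the sequence numbers absent between the lowest and highest
    numbers seen on well-formed documents."""
    seen = set()
    for text in documents:
        try:
            root = ET.fromstring(text)
        except ET.ParseError:
            continue  # damaged document: its packet counts as lost
        value = root.get(attr)
        if value is not None and value.isdigit():
            seen.add(int(value))
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


received = [
    '<stockPrice seq="54"><symbol>IBM</symbol><price>990</price></stockPrice>',
    '<stockPrice seq="55"><symbol>MSFT</symbol><price>1010',  # cut off in transit
    '<stockPrice seq="57"><symbol>ICI</symbol><price>1020</price></stockPrice>',
]
to_request = missing_packets(received)
```

The receiving layer could re-request the numbers in to_request while the main application carries on with the documents that did arrive intact.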
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jes at kuantech.com Thu Feb 25 01:17:18 1999 From: jes at kuantech.com (Jeffrey E. Sussna) Date: Mon Jun 7 17:09:27 2004 Subject: MIME types vs. DOCTYPE (was RE: ANNOUNCE: New XHTML WD) In-Reply-To: <000901be605a$10281ce0$d3228018@jabr.ne.mediaone.net> Message-ID: <001301be605c$3d3517e0$5118a8c0@kuantech1.quokka.com> Hear, hear! Couldn't have said it any better myself. -----Original Message----- From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Jonathan Borden Sent: Wednesday, February 24, 1999 4:59 PM To: David Megginson; xml mailing list Cc: www-html-editor@w3.org Subject: RE: MIME types vs. DOCTYPE (was RE: ANNOUNCE: New XHTML WD) David Megginson wrote: > > > Isn't it easier to identify the resource type externally so > that it can be handed directly to the correct processor? > > Assuming that HTML is defined in XML, then isn't the correct processor the XML processor? text/xml correctly identifies the content-type. If you make an exception for the specific XHTML DTD then why not for every DTD! The argument for text/xhtml on content-negotiation grounds is a shaky one, because content negotiation is a well-known problem for HTTP. Proposed solutions include RFC 2295.
A better solution is to employ specific request/response headers, e.g. Content-Type: text/xml Content-Document-Type: http://www.w3.org/html50.dtd or, Content-Type: text/xml; document-type=http://www.w3.org/html50.dtd; charset=us The problem with content-type proliferation is that lots of software depends on known content-types. For example, how can you programmatically tell if a MIME message body contains XML? Parse it and if it succeeds then TRUE? It's a lot easier to add a new header recognized by new UAs than it is to modify legacy and currently working code. Jonathan Borden http://jabr.ne.mediaone.net From simonstl at simonstl.com Thu Feb 25 02:21:43 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:27 2004 Subject: ANNOUNCE: New XHTML WD In-Reply-To: <14036.28972.319447.333273@localhost.localdomain> References: <199902242118.QAA06218@hesketh.net> <002d01be602c$a7714b40$2eaddccf@ix.netcom.com> <36D46590.F23911AD@manhattanproject.com> <199902242118.QAA06218@hesketh.net> Message-ID: <199902250221.VAA11592@hesketh.net> At 04:38 PM 2/24/99 -0500, you wrote: >Simon St.Laurent writes: > > > Er...
too bad folks weren't more enthusiastic about xml/whatever. > > xml/html would've gotten us out of this one nicely. > >At least until we started getting further variants of things: > > xml/rdf/webstuff > xml/rdf/ecommerce > >etc. That might well be a good idea too. I'd be much more willing to go text/xml/html and text/xml/rdf than just plain old text/html or text/rdf. Speaking of RDF, the RDF Model and Syntax Specification became a W3C recommendation today. Better get crackin'... Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com From marcelo at mds.rmit.edu.au Thu Feb 25 04:41:50 1999 From: marcelo at mds.rmit.edu.au (Marcelo Cantos) Date: Mon Jun 7 17:09:27 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: ; from Mark Birbeck on Sun, Feb 21, 1999 at 02:37:38PM -0000 References: Message-ID: <19990222103913.C18930@io.mds.rmit.edu.au> On Sun, Feb 21, 1999 at 02:37:38PM -0000, Mark Birbeck wrote: > Couldn't find a coin - so I suppose I should respond: > > Marcelo Cantos wrote: > > On Sat, Feb 20, 1999 at 04:08:24PM -0000, Mark Birbeck wrote: > > >then we can't put anything on that > > > wire other than news headlines (and really you shouldn't process > > > anything until you receive that closing element, but I know > > > that's what > > > people are requesting they can do). > > > > I disagree with that last parenthesised remark. Stream-based parsers > > do and indeed should process data as it arrives.
XML browsers _most > > certainly_ should do so. > > > > Not that I disagree with your overall point (I haven't really given it > > that much thought), but the above is definitely wrong IMO. > > You seem to have missed the point of the discussion. So say you. > The question is > whether it is legitimate to open a stream of XML with some sort of > element like: > > > > and then spend the rest of the day sending out things like: > > > MSFT > 1000 > > > and then at the end of the day, sending: > > > > No-one so far in the discussion has argued that this is good XML - > except you Marcelo, but you can be excused because you haven't given it > much thought - because if you were validating this you should not (CAN > NOT!) say the document 'stockPrices' is valid until you receive the > closing element. And that would mean you couldn't process the > intervening prices until you had validated the entire document, and that > would mean your data feed would be useless. Which part of "Not that I disagree with your overall point" eludes you, Mark? Did I not make it abundantly clear that I was merely pointing out a small error in the body of your post, rather than attempting to dispute your thesis? For the record, I quite agree that wrapping things up in virtual documents is generally pointless. Hence your conclusion that only I consider the above to be good XML is unfounded because I don't. And so is your charge that I have missed the point of the discussion. Please read what I write, and don't jump up and down and get all defensive the instant someone tells you you're wrong! Speaking of which, _do_ you think XML browsers should refuse to display anything until the root element is closed? Cheers, Marcelo -- http://www.simdb.com/~marcelo/
From eric at w3.org Thu Feb 25 07:15:54 1999 From: eric at w3.org (Eric Prud'hommeaux) Date: Mon Jun 7 17:09:27 2004 Subject: having to deal with mal-formed XML In-Reply-To: ; from Craig I. Johnson on Wed, Feb 24, 1999 at 11:14:14PM -0500 References: Message-ID: <19990225021522.A26059@w3.org> On Wed, Feb 24, 1999 at 11:14:14PM -0500, Craig I. Johnson wrote: > Someone from w3c should complain. Pursuant to this, I am charged with collecting a list of XML generating and using apps and evaluating their conformance with http://www.w3.org/TR/1998/REC-xml-19980210. The objective is not to bash any particular vendor, but to create a forum where vendors and the public may determine which XML applications need attention and which ones will sail comfortably and interoperably into the next millennium. Conformance may be judged on these criteria (and probably more that I haven't thought of): Generates well-formed XML. Generates valid XML where a DTD is available. Accepts well-formed/valid XML. Follows the application rules described in the XML spec: Produces a valid document tree when processing valid XML. Produces an empty doc when it observes a fatal error (see http://www.w3.org/TR/1998/REC-xml-19980210#sec-terminology). I would like to set up a validation service for generated XML and a test suite for apps that process XML. Please post any additional or refined criteria to and so that we may derive a fair and impartial basis for evaluating and furthering conformance.
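The first criterion above, "Generates well-formed XML", is the easiest one to automate. What follows is only a minimal sketch of such a check, not the proposed validation service: it assumes Python's standard expat binding (a non-validating parser, so it says nothing about DTD validity), and the function name `is_well_formed` and the sample documents are made up for illustration.

```python
import xml.parsers.expat


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a complete, well-formed XML document.

    expat is non-validating: this checks well-formedness only, never validity.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input, so an
        # unclosed root element is reported as an error.
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Hypothetical sample output from three imaginary generating applications.
samples = {
    "good": "<doc><item/></doc>",
    "unclosed": "<doc><item/>",
    "overlap": "<a><b></a></b>",
}
results = {name: is_well_formed(doc) for name, doc in samples.items()}
```

A real conformance suite would also need the validity and fatal-error criteria, which this sketch does not attempt.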
If the traffic becomes a burden on these lists, I will set up a list at W3C; however, I believe that this review will benefit from the public scrutiny available only on these lists. So, there's the mandate, who wants to help? -- -eric (eric@w3.org) From Matthew.Sergeant at eml.ericsson.se Thu Feb 25 09:42:05 1999 From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML)) Date: Mon Jun 7 17:09:27 2004 Subject: having to deal with mal-formed XML Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A164F@eukbant101.ericsson.se> > -----Original Message----- > From: Eric Prud'hommeaux [SMTP:eric@w3.org] > > On Wed, Feb 24, 1999 at 11:14:14PM -0500, Craig I. Johnson wrote: > > Someone from w3c should complain. > > Pursuant to this, I am charged with collecting a list of XML > generating and using apps and evaluating their conformance with > http://www.w3.org/TR/1998/REC-xml-19980210. The objective is not to > bash any particular vendor, but to create a forum where vendors and > the public may determine which XML applications need attention and which > ones will sail comfortably and interoperably into the next millennium. > > Conformance may be judged on these criteria (and probably more that I > haven't thought of): > > Generates well-formed XML. > Generates valid XML where a DTD is available. > Accepts well-formed/valid XML. > Follows the application rules described in the XML spec: > Produces a valid document tree when processing valid XML. > Produces an empty doc when it observes a fatal error (see > http://www.w3.org/TR/1998/REC-xml-19980210#sec-terminology).
> I think most people would be happy with every XML application creating _only_ xml that conforms with the XML spec. That would make me happy anyway - perhaps I'm easily pleased ;-) Matt. From david at megginson.com Thu Feb 25 13:40:35 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:27 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: <36D479F1.28D796D9@thinlink.com> References: <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> <36D479F1.28D796D9@thinlink.com> Message-ID: <14037.20555.720649.689770@localhost.localdomain> Tom Harding writes: > David Megginson wrote: > > > More importantly, you don't want to have to parse an entire document > > just to find out where it ends because that forces your system into > > linear processing -- on a busy server, it is absolutely necessary to > > be able to isolate the documents/packets quickly and pass them off to > > separate threads (or even separate boxes) for parsing and processing. > > Good point. I have been implicitly assuming this as a cost of > moving the parsing function into the network infrastructure. > However, a general-purpose endpoint implementation would have a > hard time parallelizing in the way you describe because of possible > inter-document dependencies in the application protocol. It has to > deliver the documents to the next layer in the order in which they > were sent. If parallelism is explicitly needed then a solution is > to create multiple connections.
It's not necessarily the raw XML that will be delivered to the application, however -- it could be that the XML will be preprocessed to populate an object tree, create a 3D graphic, etc., and that kind of processing can easily be done in parallel. Imagine this payload: Packet 1 of 4: RDF metadata description of payload Packet 2 of 4: XML text for a news story on Amazon.com Packet 3 of 4: XML vector-graphic format for a chart of Amazon's earnings Packet 4 of 4: Base 64-encoded graphic of Amazon's logo As soon as I receive packet #3, for example, I can hand it off to a separate process for rendering, even though I am not yet ready to pass the rendered version to the application until the whole payload is received. I can prepare a set of SQL statements to execute to add the information in #1 to my database, I can convert #2 to HTML or PDF, and I can convert #4 to a PNG, all in parallel without any interdependencies. If I have to parse all of #1 before I can start on #2, my system will be much less efficient. In other words, if the general-purpose system had some kind of typing information available, it could do many types of generic processing before worrying about dependencies. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/
From david at megginson.com Thu Feb 25 13:43:15 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:27 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: References: Message-ID: <14037.21093.485674.49748@localhost.localdomain> Mark Birbeck writes: > > More importantly, you don't want to have to parse an entire document > > just to find out where it ends because that forces your system into > > linear processing -- on a busy server, it is absolutely necessary to > > be able to isolate the documents/packets quickly and pass them off to > > separate threads (or even separate boxes) for parsing and processing. > > You may be able to *parse* parts of the document before the entire > document has completely arrived, but it is surely wrong to *process* the > document because you don't yet know if it's well-formed or valid. Some > of the parsing you did in one process might be invalidated as the result > of another process. I think that there's a bit of confusion here -- we're talking about receiving multiple documents in the same stream. My claim is that you wouldn't want to have to do XML parsing to determine where one document ended and the next began, even if there weren't markup allowed after the end tag -- it's better to have a stream layer above the XML. That said, there are many circumstances where someone might want to process a document as it arrives -- every application and domain has different requirements. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/
From david at megginson.com Thu Feb 25 13:53:26 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:27 2004 Subject: MIME types vs. DOCTYPE (was RE: ANNOUNCE: New XHTML WD) In-Reply-To: <000901be605a$10281ce0$d3228018@jabr.ne.mediaone.net> References: <14036.21688.228887.515457@localhost.localdomain> <000901be605a$10281ce0$d3228018@jabr.ne.mediaone.net> Message-ID: <14037.21377.883734.819558@localhost.localdomain> Jonathan Borden writes: > David Megginson wrote: > > Isn't it easier to identify the resource type externally so > > that it can be handed directly to the correct processor? > Assuming that HTML is defined in XML, then isn't the correct > processor the XML processor? text/xml correctly identifies the > content-type. If you make an exception for the specific XHTML DTD > then why not for every DTD! But I do think that you should make an exception for every document type -- text/xml should just be a fallback, when all else has failed. Why should there be a single processor to handle everything that happens to be encoded in XML? I don't have a single compiler for every programming language that happens to use ASCII, or a single application that processes any data that arrives in a zip file. If I have a vector graphic format that happens to use XML, I want to pass it off to a vector-graphic processor; if I have a browsable document, I want to pass it off to a browser; if I have a 3D world, I want to pass it off to a 3D renderer; if I have an e-commerce transaction, I want to pass it off to my order-processing application; etc., etc.
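The routing described above can be sketched as a lookup on the externally supplied media type, with a generic XML processor as the fallback "when all else has failed". This is only an illustration: every media type name and handler below is hypothetical (none of them is registered or proposed in this thread), and the handlers just report which processor would receive the body.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], str]


# Hypothetical processors; each one only names itself.
def vector_graphic_processor(body: bytes) -> str:
    return "vector-graphic processor"


def browser(body: bytes) -> str:
    return "browser"


def order_processor(body: bytes) -> str:
    return "order-processing application"


def generic_xml_processor(body: bytes) -> str:
    return "generic XML processor"


# Assumed media type names, invented for this sketch only.
HANDLERS: Dict[str, Handler] = {
    "text/x-example-vector": vector_graphic_processor,
    "text/x-example-xhtml": browser,
    "text/x-example-order": order_processor,
}


def dispatch(content_type: str, body: bytes) -> str:
    """Route a body by its declared type without parsing the XML first."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    handler = HANDLERS.get(media_type, generic_xml_processor)
    return handler(body)


routed = dispatch("text/x-example-vector; charset=utf-8", b"<chart/>")
fallback = dispatch("text/xml", b"<anything/>")
```

The point of the design is that `dispatch` never looks inside `body`; all the information it needs arrives outside the document.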
I can imagine many circumstances where parsing the XML first to figure out what it is could be useful, but if it is already possible to know the type, then doing so is very wasteful. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From keshlam at us.ibm.com Thu Feb 25 14:31:35 1999 From: keshlam at us.ibm.com (keshlam@us.ibm.com) Date: Mon Jun 7 17:09:28 2004 Subject: Streams, protocols, documents and fragments Message-ID: <85256723.004F9142.00@D51MTA03.pok.ibm.com> Personal opinion: The right way out of the "never-ending document" problem is to declare that the stream is a stream of transaction documents, NOT a single huge document in its own right. Not everything that can (theoretically) be an XML document should be. Some should be fragments, some should be aggregations. ______________________________________ Joe Kesselman / IBM Research Unless stated otherwise, all opinions are solely those of the author.
From keshlam at us.ibm.com Thu Feb 25 14:43:08 1999 From: keshlam at us.ibm.com (keshlam@us.ibm.com) Date: Mon Jun 7 17:09:28 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) Message-ID: <85256723.004FC8EB.00@D51MTA03.pok.ibm.com> The DOM spec says that the DOM is only an API for viewing a document. The underlying object structure may or may not bear any resemblance to the DOM, but as long as the user is talking DOM they can't tell the difference and shouldn't care what the back end looks like. ______________________________________ Joe Kesselman / IBM Research Unless stated otherwise, all opinions are solely those of the author. From david at megginson.com Thu Feb 25 14:57:32 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:28 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. )) In-Reply-To: <36D21E69.B78311D5@jfinity.com> References: <001301be5ea2$74d3c0e0$c9a8a8c0@thing2> <36D21E69.B78311D5@jfinity.com> Message-ID: <14037.25528.148470.652277@localhost.localdomain> Gabe Beged-Dov writes: > SAX is described as an event-based API. IMO, it is a callback based > API.
I am guessing that many will find this either a debatable > distinction or one not worth dwelling on. I feel it is worth > distinguishing. This is separate from the issue of whether it is > worthwhile to either replace the callback API with an event API, or > layer an event API on top of the callback API. Event-based programming existed before people started encapsulating events in structures or objects. I'd define SAX as an event-based API that reports events using callbacks. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ From begeddov at jfinity.com Thu Feb 25 16:04:54 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:09:28 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. )) References: <001301be5ea2$74d3c0e0$c9a8a8c0@thing2> <36D21E69.B78311D5@jfinity.com> <14037.25528.148470.652277@localhost.localdomain> Message-ID: <36D574DD.75EE2BF1@jfinity.com> David Megginson wrote: > Gabe Beged-Dov writes: > > > SAX is described as an event-based API. IMO, it is a callback based > > API. > > Event-based programming existed before people started encapsulating > events in structures or objects. I'd define SAX as an event-based API > that reports events using callbacks. > There are various axes to use for terminology (as many of the threads on this list show :-(). To reuse a term from the previous sentence, I would say that the key axis to use to distinguish event-based vs callback-based is "thread" of control. A callback-based API passes the thread of control to the "message" receiver.
An event-based API queues the "message". The receiver uses its own thread of control to process the message. The current crop of parsers and frameworks (SAX, MDSAX) are definitely in the single threaded camp. The discussion of streaming (in other threads :-), has only brought up multiple threads of control tangentially, in one response that discussed wanting to hand off the processing of a "packet" to another process in a high volume server scenario. There is a lot of discussion of data models vs API. I would like to see control flow added to the axes that people argue about :-). Gabe Beged-Dov www.jfinity.com From ricko at allette.com.au Thu Feb 25 17:00:06 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:28 2004 Subject: MIME and X(HT)ML (Re: MIME types vs. DOCTYPE) Message-ID: <00c601be60e0$8d772890$3df96d8c@NT.JELLIFFE.COM.AU> The format of MIME types is given in Freed and Borenstein, Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types (ftp://ftp.isi.edu/in-notes/rfc2046.txt). That RFC allows anyone to use text/X-??? or application/X-???, where ??? is any name you like. For example, text/X-xml-vml or whatever. I think it would be more practical to have the XML MIME media type allow "*/xml-???" for IANA-registered DTDs and "*/xml-X-???" for private use DTDs. That shows the defaulting OK. Maybe RFC 2046 might need to be altered accordingly to allow these kinds of subtyping.
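To make the proposed naming concrete, here is a small sketch of how a receiver might read such a subtype. It assumes the convention exactly as suggested above ("xml-???" for registered DTDs, "xml-X-???" for private-use ones), which is a proposal in this message and not an existing registration; the function name `classify` and the sample type names are invented for illustration.

```python
from typing import Tuple


def classify(media_type: str) -> Tuple[str, str]:
    """Split a media type into (kind, DTD name) under the proposed convention.

    kind is "private", "registered", "generic" (plain xml) or "not-xml".
    """
    top, sep, subtype = media_type.strip().partition("/")
    if not top or not sep or not subtype:
        raise ValueError(f"not a media type: {media_type!r}")
    lowered = subtype.lower()
    # Test the longer private-use prefix first, since "xml-x-" also
    # starts with "xml-".
    if lowered.startswith("xml-x-"):
        return ("private", subtype[len("xml-x-"):])
    if lowered.startswith("xml-"):
        return ("registered", subtype[len("xml-"):])
    if lowered == "xml":
        return ("generic", "")
    return ("not-xml", "")


examples = [classify(t) for t in
            ("text/xml-X-vml", "application/xml-rdf", "text/xml", "text/html")]
```

Software that knows nothing about a particular DTD can still fall back to generic XML handling for anything whose kind is "private" or "registered", which is presumably the defaulting the proposal has in mind.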
There are two interesting documents for backgrounding on MIME types: RFC "2376" XML Media Types (Whitehead & Murata) RFC "1874" SGML Media Types (Levinson) (Refer ftp://ftp.isi.edu/in-notes/ ) The most important thing about the XML types is that they specify parseable entity transport only, *not* documents per se. So a future XHTML MIME type will also have to specify whether parseable entities or documents are being sent. (Actually, it opens up an interesting intermediate prospect: perhaps an XHTML application should accept parseable HTML entities, not complete WF document: so "a link is here" is a WF parseable entity: perhaps it would be nice for XHTML systems to accept those--it would be a little friendlier than full XML. Yikes.) During discussion for the SGML MIME types (1995), debaters split into two irreconcilable camps: * the "SDIF" people, who want to be able to send documents with unreferenced entity declarations removed (i.e. do a transitive closure on the document, and send all entities referenced, perhaps to some entity-resolution depth); and * the RFC people, who wanted to send arbitrary collections of documents, even if redundant or incomplete. Just before the time of the XML MIME type RFC, I revisited the SGML debate, for ISO purposes, to see how much common ground there was and whether XML offered a chance for a new or reconciling approach: I ended up being firmly of the opinion that only an entity-based MIME type and not a document-based MIME type was practical; it gave us everything that HTML had, and would not naturally create divisions or have some developers refusing to implement it. In any case, the text/sgml MIME type continues to exist, and the multipart type also is around tantalisingly. The writers of the XML RFC agreed with this point too, and I think it is working out OK. I wonder whether it is now time to re-open the can of worms, but with a different perspective.
In particular, I think there is a need to agree on a way to specify the name of a document type. For example, allowing a DOCTYPE parameter on the MIME headers for XML and XHTML. This would point to a resource in some schema or declaration format: XML markup declarations, DDML, the new Xschema, DCD or anything. The advantage of having it in the MIME header is that the schema can be fixed, and the prolog would not override it. Some users need this. Personally, I would prefer if the MIME media type could hold all information needed on type: that works with modern browsers. Of course, as with encoding, this kind of "primary metadata" needs to have an inband notation (an extra argument on the XML PI?) as well as a MIME header equivalent. I hope the XSchema group will specify a method ASAP, in the same way that James Clark's stylesheet declarations logically precede the specification of a stylesheet language. Can the XSchema people give this serious consideration? As a further requirement, perhaps the need for bundling parseable entities into streams should also be considered: a form of multipart suitable for open-ended streams. Personally, I think it is a bad idea, because a protocol is two way, and a document is one-way. But HTTP 1.1 provides lots of nice hooks for things, they should be investigated. Rick Jelliffe From rabin at shore.net Thu Feb 25 17:09:47 1999 From: rabin at shore.net (Paul Rabin) Date: Mon Jun 7 17:09:28 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah.
)) In-Reply-To: <36D574DD.75EE2BF1@jfinity.com> References: <001301be5ea2$74d3c0e0$c9a8a8c0@thing2> <36D21E69.B78311D5@jfinity.com> <14037.25528.148470.652277@localhost.localdomain> Message-ID: At 08:05 AM 2/25/99 -0800, Gabe Beged-Dov wrote: >There is a lot of discussion of data models vs API. I would like to see >control flow added to the axes that people argue about :-). Yes. In particular, the current model does not support having a single application (or filter) handle input streams from multiple concurrent parsers (or filters), since the parser owns the thread of control until the parse completes. Rather than changing the callback model, we could implement a compromise solution: - parseInit() // initialize the parse state - nextParseEvent() // cause the next event to be delivered via callback All of the multi-threading logic could be hidden in a layer on top of existing parsers. - Paul Rabin From b.laforge at jxml.com Thu Feb 25 17:16:20 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:28 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. )) Message-ID: <002b01be60e0$dc774f60$c9a8a8c0@thing2> >Event-based programming existed before people started encapsulating >events in structures or objects. I'd define SAX as an event-based API >that reports events using callbacks. But why are we not taking advantage of having the events as objects?
I've tried to second guess why this is so, but I think the arguments in favor of object-based events are stronger: the added overhead is balanced by greater simplicity and subsequently less overhead in other areas; the added flexibility adding additional utility to all conformant code. Take for example the DOMParser. It subclasses InputSource to play some tricks. And because InputSource is an object, this lets us use DOMParser with filters that never ever considered having InputSource be anything other than what was provided with SAX. The same will be true of SAX2, if events are objects. It will dramatically increase the utility of all the conformant code! Bill From roddey at us.ibm.com Thu Feb 25 17:57:48 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:09:28 2004 Subject: Well-formed vs. valid Message-ID: <87256723.006225A9.00@d53mta03h.boulder.ibm.com> >I would propose a type of XML parser that takes a well-formed or valid >document, validates it against a DTD (or any other accepted form of >structure description) of the application's choice, and then issues >streaming events to the application. Consider it a DOM that does a >tree match on an application chosen DTD and then emits SAX calls. The >application would be guaranteed to be receiving valid elements and >thus not need its own data validation code. > >The line between the application and 'XML' is currently viewed as the >application is hooked onto DOM, SAX, or some other XML parser of a >file at the level of elements.
The XML structural description in a DTD >is not used, except if the document (not the application) calls for >validation. This separation is also represented by modeling on the >basis of a file rather than a stream. > FYI, our (IBM's) new version 2 architecture parsers do this. We have a pluggable architecture, and one of the plug ins is a validator. The low level scanner uses this to validate content before it sends it out through the internal event APIs. So, if you are wiring together a SAX style parser, you just wire the internal events to the SAX events and you have a validating SAX parser (actually we have that combination already provided for you as a canned parser, but you can do other variations as well.) From tomh at thinlink.com Thu Feb 25 18:34:11 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:28 2004 Subject: Streams, protocols, documents and fragments References: <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> <36D479F1.28D796D9@thinlink.com> <14037.20555.720649.689770@localhost.localdomain> Message-ID: <36D59762.370372DB@thinlink.com> David Megginson wrote: > It's not necessarily the raw XML that will be delivered to the > application, however How about we create an architecture where it is? I have a sample implementation of XP where an endpoint fires an event when it has received a document. The event wraps a DOM Document, so the XML has been parsed, but any real processing, such as rendering, is still up to the application.
This architecture doesn't really work with SAX, since you might attach long-running code to a handler. In this architecture, events don't occur at the element level; they correspond to the network-level event of receiving a complete document. You have lots of document "packets" flying around. What you do with them is up to you. Tom Harding xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Thu Feb 25 18:39:28 1999 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 17:09:28 2004 Subject: ANNOUNCE: RDF DTD available Message-ID: <001f01be60ee$5a885680$16f96d8c@NT.JELLIFFE.COM.AU> The new RDF specification from W3C at http://www.w3.org/TR/REC-rdf-syntax/ does not include XML markup declarations. I have made up a set and put them on the "XML and SGML Resources" page of the "Chinese XML Now!" website: http://xml.ascc.net/xml/en/utf-8/resource_index.html Rick Jelliffe Academia Sinica Computing Centre ricko@gate.sinica.edu.tw xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tug at wilson.co.uk Thu Feb 25 19:23:51 1999 From: tug at wilson.co.uk (John Wilson) Date: Mon Jun 7 17:09:28 2004 Subject: Proposal to standardise a Java API Message-ID: <006301be60f4$52fd95f0$010a0a0a@home.wilson.co.uk> http://developer.javasoft.com/developer/jcp/jsr_xml.html Sun have just published a proposal to standardise a Java parser API. The proposal, such as it is, is available at the URL above. (You will need to register for their bl**dly silly Java Developer Connection to actually read it - sorry). I'm mildly surprised that nobody from Sun has had the good manners to post an announcement here. You have 7 days to comment. John Wilson The Wilson Partnership 5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK +44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax) Mailto: tug@wilson.co.uk From simonstl at simonstl.com Thu Feb 25 19:29:10 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:28 2004 Subject: Layered Model for XML Message-ID: <199902251928.OAA27754@hesketh.net> I finally revised that rough draft of "Toward A Layered Model for XML", producing a final version that I hope presents some architectures people will like.
It's inspired in large part by SAX filter architectures, but I tried to keep it general enough that it could fit most XML processing scenarios. Anyway, if folks are interested (there are pictures, promise!) it's at: http://www.simonstl.com/articles/layering/layered.htm Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com From simonstl at simonstl.com Thu Feb 25 19:34:36 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:28 2004 Subject: Well-formed vs. valid In-Reply-To: <87256723.006225A9.00@d53mta03h.boulder.ibm.com> Message-ID: <199902251933.OAA27896@hesketh.net> At 10:51 AM 2/25/99 -0700, roddey@us.ibm.com wrote: >FYI, our (IBM's) new version 2 architecture parsers do this. We have a >pluggable architecture, and one of the plug ins is a validator. The low >level scanner uses this to validate content before it sends it out through >the internal event APIs. So, if you are wiring together a SAX style parser, >you just wire the internal events to the SAX events and you have a >validating SAX parser (actually we have that combination already provided >for you as a canned parser, but you can do other variations as well.) Big question: can I plug someone else's SAX parser into your scanner, and then have your validation component work on my SAX events? While it's unlikely that I'd want to plug a different SAX parser in, it's quite possible that I'd want to work with the SAX events (transforming with XT, for instance) before performing validation.
Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Rudyard.Merriam at COMPAQ.com Thu Feb 25 20:10:34 1999 From: Rudyard.Merriam at COMPAQ.com (Merriam, Rudyard) Date: Mon Jun 7 17:09:28 2004 Subject: Architecture docs for SAX Message-ID: Are there architecture documents for SAX or other parsers? Something I could use if I wanted to roll my own in C++? Rud Merriam KD5DTV Compaq, Best Managed 281-514-3252 Platforms and Solutions > rudyard.merriam@compaq.com > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Thu Feb 25 21:19:47 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:28 2004 Subject: Proposal to standardise a Java API References: <006301be60f4$52fd95f0$010a0a0a@home.wilson.co.uk> Message-ID: <36D5BD0C.BBE67878@eng.sun.com> John Wilson wrote: > > http://developer.javasoft.com/developer/jcp/jsr_xml.html > > Sun have just published a proposal to standardise a Java parser API. > The proposal, such as it is, is available at the URL above. The proposal is to start the discussions for what the platform level API should be ... it's not a specific API proposal. 
Though it does sketch an initial set of functionality, scoped by SAX, DOM, and some appropriate extensions. Point being, the open process Sun is following in support of its ISO/PAS process doesn't allow concrete proposals to be published quite yet -- they've got to be made as part of that process. http://developer.javasoft.com/developer/jcp/ > I'm mildly surprised that nobody from Sun has had the good manners > to post an announcement here. You beat me to it, that's all! However, the intention to start this process has been mentioned in the past ... and I told folk to watch the URL above quite recently. Quite a few folk from XML-DEV have asked for such an API standardization process to start. > You have 7 days to comment. - Dave P.S.: Note that due to the current high volume of XML-DEV, I read this list infrequently. If you'd like me to receive comments (and perhaps respond to them) please e-mail me directly. Don't CC the XML-DEV list, my mail filtering configuration doesn't understand such nuances (unfortunately). xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tug at wilson.co.uk Thu Feb 25 21:38:08 1999 From: tug at wilson.co.uk (John Wilson) Date: Mon Jun 7 17:09:28 2004 Subject: Proposal to standardise a Java API Message-ID: <002001be6107$20d8ff70$010a0a0a@home.wilson.co.uk> >> I'm mildly surprised that nobody from Sun has had the good manners >> to post an announcement here. > >You beat me to it, that's all! However, the intention to start this >process has been mentioned in the past ... and I told folk to watch >the URL above quite recently. 
Quite a few folk from XML-DEV have asked >for such an API standardization process to start. The spec was filed on Feb 23rd. It's Feb 25th now (nearly Feb 26th in the UK). Actually we have 4 days to respond, not 7. I agree that such a standardization is way past due - but not by stealth. >P.S.: > >Note that due to the current high volume of XML-DEV, I read this >list infrequently. That much is clear John Wilson The Wilson Partnership 5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK +44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax) Mailto: tug@wilson.co.uk From ralph at fsc.fujitsu.com Fri Feb 26 00:22:53 1999 From: ralph at fsc.fujitsu.com (Ralph Ferris) Date: Mon Jun 7 17:09:29 2004 Subject: ANNOUNCE: "HyBrick" Browser, V0.82 Message-ID: <3.0.5.32.19990225162035.0096a710@pophost.fsc.fujitsu.com> All, The latest version of Fujitsu's "HyBrick" browser, V0.82, with expanded support for XLink/XPointer, is now available from Fujitsu Software Corporation's Web site at: http://www.fsc.fujitsu.com/hybrick/ New XLink/XPointer related features include: - XLink/XPointer error/warning info is shown in the error list dialog. Note: The HyBrick V0.8 XPointer interpreter permitted whitespace to be included in an XPointer string - illegal, according to the XPointer WD. The XPointer interpreter has been fixed accordingly and XPointers containing whitespace will now generate error messages. - A "Document Group" sub-menu has been added in the "XLink/XPointer" menu.
Users can now navigate between inter-linked documents by using Document Groups as well as through individual links. (Right mouse-button click on links to see this menu.) - In the "select link" dialog, link element "role" values are displayed instead of GIs. This feature, as well as the "Document Group" display feature, are particularly useful for creating and navigating "Topic Maps." - The mouse cursor now changes its shape over links. Other new features: - If multiple stylesheet PIs are present, users are presented with a dialog box to select the stylesheet they want to use. This feature, which was demonstrated in the first version of HyBrick, has been re-enabled in this one. - "Reload hubdocument" and "Close window" functions have been added. Installing and Using HyBrick: - HyBrick is supplied as a self-extracting file for Windows 95 and Windows NT. - Once the files are installed, start HyBrick from the bin directory. - Use the "Browse" button to locate the file Samples\XLink-sample\readme.xml or Samples\XLink-sample\leeme.xml (Spanish). Note: Our thanks to Sr. Diego Antona Archilla, UNAM, for providing Spanish-language translations of the HyBrick sample files. Best regards, Ralph E. Ferris Fujitsu Software Corporation xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Fri Feb 26 01:44:10 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:29 2004 Subject: Content-Document-Type: was (Re: MIME types vs. 
DOCTYPE) In-Reply-To: <00c601be60e0$8d772890$3df96d8c@NT.JELLIFFE.COM.AU> Message-ID: <000101be6128$bf4af7a0$d3228018@jabr.ne.mediaone.net> I am proposing that xml continue to use the media type: text/xml or application/xml and that the header Content-Document-Type: be used to denote the particular document type. The value of Content-Document-Type is proposed to be a URI, in the same fashion that a URI specifies an XML namespace. Rick Jelliffe wrote: > > > The format of MIME types is given in Freed and Borenstein > Multipurpose Internet Mail Extensions (MIME) Part Two: > Media Types (ftp://ftp.isi.edu/in-notes/rfc2046.txt) > > That RFC allows anyone to go > text/X-??? > application/X-??? > where ??? is any name you like. For example, text/X-xml-vml or whatever. Yes, so unless registered via the IANA text/xhtml is not legal wrt RFC 2046. Official XML document types will need to be registered with the IANA; the W3 does not control the MIME content-type domain. Overloading Content-Type in such fashion is bad for several reasons: 1) official document types need to be registered via the IANA 2) there exists the same problem with namespace collisions using the unofficial text/x-subtype specification that would exist if XML namespaces were to be defined by the prefix and not the URI. Use of text/x-xml-xxx is not a robust document specification scheme in the situation where thousands or millions of distinct document types may be defined. And Dave Megginson wrote: > But I do think that you should make an exception for every document > type -- text/xml should just be a fallback, when all else has failed. > Why should there be a single processor to handle everything that > happens to be encoded in XML? I don't have a single compiler for > every programming language that happens to use ASCII, or a single > application that processes any data that arrives in a zip file. True, but MIME types denote specific encodings such as application/x-gzip.
>If I have a vector graphic format that happens to use XML, I want to >pass it off to a vector-graphic processor; if I have a browsable >document, I want to pass it off to a browser; if I have a 3D world, I >want to pass it off to a 3D renderer; if I have an e-commerce >transaction, I want to pass it off to my order-processing application; >etc., etc. This is a reasonable request. One could make the argument that standards which happen to be XML ought be described by unique and registered MIME types. The problem is with the proliferation of type names. The other and perhaps more important problem is that no MIME type denotes XML data as opposed to TEXT data, so in the absence of specific knowledge of a particular media type, text/xxx is to be treated as text/plain per RFC 2046. Perhaps xml ought be a top level type, but then ought sgml be as well and other well known text formats? >I can imagine many circumstances where parsing the XML first to figure >out what it is could be useful, but if it is already possible to know >the type, then doing so is very wasteful. This is also reasonable and Content-Document-Type solves this problem. I see no good reason not to use a specific header to solve this problem. Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
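The dispatch argued for in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: Content-Document-Type is a proposal made in this thread, not a registered header, and the class name, handler names and example.org URIs are all hypothetical. The sketch looks up the document-type URI first and falls back to the media type when the URI is missing or unknown.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: Content-Document-Type is only a proposal in this
// thread, and the handler names and URIs below are made up for illustration.
final class DocumentTypeDispatcher {
    private final Map<String, String> handlersByDocTypeUri = new HashMap<>();

    /** Associates a document-type URI with the name of a processor. */
    void register(String docTypeUri, String handlerName) {
        handlersByDocTypeUri.put(docTypeUri, handlerName);
    }

    /** Picks a processor name from message headers; header names are matched case-insensitively. */
    String choose(Map<String, String> headers) {
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);

        // Preferred route: an exact match on the document-type URI.
        String docType = h.get("Content-Document-Type");
        if (docType != null) {
            String handler = handlersByDocTypeUri.get(docType.trim());
            if (handler != null) {
                return handler;
            }
        }

        // Fallback: decide on the media type alone, ignoring parameters such as charset.
        String contentType = h.getOrDefault("Content-Type", "text/plain");
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        if (mediaType.equals("text/xml") || mediaType.equals("application/xml")) {
            return "generic-xml-processor";
        }
        return "plain-text-viewer";
    }

    public static void main(String[] args) {
        DocumentTypeDispatcher dispatcher = new DocumentTypeDispatcher();
        dispatcher.register("http://example.org/doctypes/vector-graphic", "vector-graphic-processor");

        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Type", "text/xml; charset=utf-8");
        headers.put("Content-Document-Type", "http://example.org/doctypes/vector-graphic");
        System.out.println(dispatcher.choose(headers));
    }
}
```

Keying on the full URI rather than on a text/x-... subtype is what avoids the name collisions raised earlier in the message; the generic XML processor remains the fallback that text/xml was meant to be.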
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Fri Feb 26 02:03:04 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:29 2004 Subject: Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999) In-Reply-To: <199902242250.QAA03538@bruno.techno.com> Message-ID: <000201be612b$6097c640$d3228018@jabr.ne.mediaone.net> Steven R. Newcomb wrote: > > In property sets there are never any methods > whatsoever. This point is emphasized by the fact > that, in the grove paradigm, the information > components are called "nodes" rather than > "objects". If you choose to instantiate a grove > as a collection of objects (as many reasonable > people, including those at my own company, > certainly would), that's OK, but the fundamental > abstraction does not have the concept of methods. > ... ... if any of the following statements > are true: > > * the information may outlast existing processing > systems, > > * the information may have unforeseen uses in an > ever-changing world, and > > * the information must be interchanged in an open, > multivendor environment. > > Instead of encapsulating such information in > methods, as objects often do, we need to > encapsulate it in semantics, as XML can be used to > do. Having rendered the information as XML, and > having chosen appropriate semantic-bearing tags > and other attributes for its various components, > we now have the information in a totally useless > but highly interchangeable form that can become > input to any application for any purpose, > including unforeseen purposes. > Well put. 
Several years ago, I thought that the ideal design for a healthcare system was object oriented with strict interfaces between components (nothing works together today without much too much time spent on integration). More recently, I think that it is even more important that the information (e.g. the lifetime medical record) be represented in an SGML/XML format and that this be mandated. Your arguments reinforce these beliefs. It was noted some days ago in a discussion in regard to databases that the success of the relational database was due to the ability to formally describe the system. Perhaps the property set/grove formalism will serve the same purpose for XML. The ability to apply declarative transformations (e.g. XSL) to 'grove like' structures replaces much of the need to define algorithms (i.e. object methods). My experience with programming is that the majority consists of transforming data from the format returned from one API into a format appropriate for another API, so this type of analysis is poised to save us a lot of work! Jonathan Borden http://jabr.ne.mediaone.net From begeddov at jfinity.com Fri Feb 26 04:48:54 1999 From: begeddov at jfinity.com (Gabe Beged-Dov) Date: Mon Jun 7 17:09:29 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. )) References: <002b01be60e0$dc774f60$c9a8a8c0@thing2> Message-ID: <36D627FD.9D725C0F@jfinity.com> Bill la Forge wrote: > >Event-based programming existed before people started encapsulating > >events in structures or objects.
I'd define SAX as an event-based API > >that reports events using callbacks. > > But why are we not taking advantage of having the events as objects? > I've tried to second guess why this is so, but I think the arguments in > favor of object-based events are stronger: the added overhead is balanced > by greater simplicity and subsequently less overhead in other areas; the > added flexibility adding additional utility to all conformant code. I'll play devil's advocate in order to keep this thread going :-). The two reasons I can see for preferring the current API are: 1) Good old installed base argument (doubly so with the SUN XML direction). 2) Efficiency. The first is a classic trade-off of increased flexibility and maintainability in the future vs. transition costs in the present. It can be handled with the usual backwards compatibility approaches. The second can be dealt with by deciding if you want to provide an event object vs an event interface. The event object requires SAX to do a deep copy of the event information in order to decouple it from the parser. If you just want the benefits that Bill is advocating, you can go with the event interface and allow the same tricks that are currently used. You could then support the deep copy in the helpers package. The next step would be to use the Event interface at the "raw" API level in order to obtain the extensibility without sacrificing efficiency and support an event queuing layer above it where streaming application integration is available. Flexible control flow integration is very hard when everybody thinks they're in charge. An event interface (with an event object layer above it) makes this integration easier. > The same will be true of SAX2, if events are objects. It will dramatically increase > the utility of all the conformant code! I agree. If you document that SAX owns the event and provide convenience routines to do a deep copy, it seems like you can have the best of both worlds.
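A minimal sketch of the two layers described above, assuming nothing about any real SAX or SAX2 API: every type name here (ParseEvent, ReusableEvent, EventSnapshot, QueueingHandler) is hypothetical. The stand-in parser owns one mutable event and reuses it for each callback, which is the cheap "event interface" case; a helper copies each event into an immutable snapshot and queues it, which is the "event object" layer a consumer could drain in its own control flow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout: none of these types are part of SAX.

/** Read-only view of an event. The parser owns the instance and may reuse it. */
interface ParseEvent {
    String kind();   // e.g. "startElement", "characters", "endElement"
    String data();   // element name or character data
}

/** Mutable event that the stand-in parser reuses for every callback. */
final class ReusableEvent implements ParseEvent {
    private String kind;
    private String data;

    void set(String kind, String data) {
        this.kind = kind;
        this.data = data;
    }

    public String kind() { return kind; }
    public String data() { return data; }
}

/** Immutable copy of an event; Java strings are immutable, so copying the references is a full copy here. */
final class EventSnapshot implements ParseEvent {
    private final String kind;
    private final String data;

    EventSnapshot(ParseEvent source) {
        this.kind = source.kind();
        this.data = source.data();
    }

    public String kind() { return kind; }
    public String data() { return data; }
}

/** Callback through which the parser delivers events. */
interface EventHandler {
    void handle(ParseEvent event);
}

/** Helper layer: copies each event so it outlives the callback, then queues it. */
final class QueueingHandler implements EventHandler {
    private final List<ParseEvent> queue = new ArrayList<>();

    public void handle(ParseEvent event) {
        queue.add(new EventSnapshot(event));
    }

    List<ParseEvent> events() { return queue; }
}

final class EventCopyDemo {
    /** Stand-in for a parser: one event instance, overwritten before each callback. */
    static void fakeParse(EventHandler handler) {
        ReusableEvent event = new ReusableEvent();
        event.set("startElement", "doc");
        handler.handle(event);
        event.set("characters", "hello");
        handler.handle(event);
        event.set("endElement", "doc");
        handler.handle(event);
    }

    public static void main(String[] args) {
        QueueingHandler queue = new QueueingHandler();
        fakeParse(queue);
        for (ParseEvent e : queue.events()) {
            System.out.println(e.kind() + " " + e.data());
        }
    }
}
```

A handler that only inspects the event during the callback never pays for the copy; one that stores the ReusableEvent itself instead of a snapshot would see every queued entry change as the parser moves on, which is why ownership of the event has to be documented.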
Gabe Beged-Dov www.jfinity.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From db at Eng.Sun.COM Fri Feb 26 06:48:40 1999 From: db at Eng.Sun.COM (David Brownell) Date: Mon Jun 7 17:09:29 2004 Subject: Well-formed vs. valid References: <199902251933.OAA27896@hesketh.net> Message-ID: <36D64352.3227CDBE@Eng.Sun.COM> Simon St.Laurent wrote: > > Big question: can I plug someone else's SAX parser into your scanner, and > then have your validation component work on my SAX events? While it's > unlikely that I'd want to plug a different SAX parser in, it's quite > possible that I'd want to work with the SAX events (transforming with XT, > for instance) before performing validation. Well, Sun's current package (released today with source :-) has some infrastructure along those lines ... but it's a bit more integrated with the DOM than with the SAX bits. That's to keep apps from needing to do the sort of tree structure tracking that DOM already does. Since it's in the DOM side, you can plug any SAX parser in to it. Specifically, when DOM nodes are plugged into the tree there are certain clearly defined callbacks which are made: the "XmlReadable" interface. When those callbacks are made, the "parent" context is visible as a DOM tree that doesn't fully correspond to the input document. It seems that one of the more popular methods is doneParse(), called when the last of an element's children is parsed. Also startParse() gets used to do resolution of relative URLs (e.g. in external entities). Those callbacks are intended for performing various application level work such as integrity checks. 
Applications can provide diagnostics with the same level of specificity (line, file, etc) as the parser itself; in some cases such diagnostics will be reported directly through the SAX error handler used by the parser. - Dave xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From chtino at hnc.co.kr Fri Feb 26 10:02:52 1999 From: chtino at hnc.co.kr (Chung, Byung Hee) Date: Mon Jun 7 17:09:29 2004 Subject: XML with CSS in IE5 Message-ID: <36D670BB.E3CCD199@hnc.co.kr> Hello, world IE5 supports XML. In IE5, XML document without stylesheet is displayed as tree. I want to see XML document with CSS stylesheet in IE5. How can I link CSS information to XML document ? Thank you. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Feb 26 11:55:13 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:29 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. )) In-Reply-To: <002b01be60e0$dc774f60$c9a8a8c0@thing2> References: <002b01be60e0$dc774f60$c9a8a8c0@thing2> Message-ID: <14038.34584.490425.124743@localhost.localdomain> Bill la Forge writes: > >Event-based programming existed before people started > >encapsulating events in structures or objects. 
I'd define SAX as > >an event-based API that reports events using callbacks. > > But why are we not taking advantage of having the events as > objects? I've tried to second guess why this is so, but I think > the arguments in favor of object-based events are stronger: the > added overhead is balanced by greater simplicity and subsequently > less overhead in other areas; the added flexibility adding > additional utility to all conformant code. From a design-pattern perspective, Bill is absolutely right: encapsulation and abstraction are big winners, and the way that he suggests is the proven method for building a robust system. Rightly or wrongly, however, SAX 1.0 decided to err on the side of simplicity and small size: while I agree that event objects could be implemented to run nearly as fast as the current scheme (so close that the difference would be negligible), there would have been other hidden costs: 1. The byte-size of the SAX interface JAR itself would have been much larger, maybe twice or three times as large because of all the additional *.class files. This may not seem to matter for most of the computing applications that are using SAX right now, but SAX is intended to be used eventually in low-bandwidth applications (e.g. over a wireless modem) and low-resource applications (e.g. on a palmtop), where every byte counts. When a programmer is optimising for size, using XML at all is a hard sell if the XML parser adds 25-500K to the application size; even though it's small, SAX adds even more, and I wanted to give it the best chance of slipping through. 2. Event objects greatly increase the initial learning curve, though they make it easier later on -- that's why so few people in the SGML world ever learned SP's (quite good) interfaces.
When we were designing SAX, we wanted to give it every chance we could -- it was not obvious at the time that SAX would be successful, so we had to make certain that coders without deep OO experience could learn it quickly, at a single glance, before they lost interest and moved on to something else. 3. Event objects greatly increase the burden of documentation -- chapters on SAX in XML books would have to be much longer, since every event object would need its own section; likewise, people would have to do more clicking around in the JavaDoc pages (from DocumentHandler to StartElementEvent back to DocumentHandler to EndElementEvent back to DocumentHandler to CharactersEvent etc.). In other words, SAX is a classic example of the worse-is-better approach to software design, the kind that almost always wins, much to the annoyance of people like me who really do like good design patterns and robust, well-abstracted systems. In two or three years, I fully expect to see people (unknowingly) using SAX on the next-generation of palmtops with wireless Internet access, choosing a new sweater to buy from Eddie Bauer while they're riding the bus back to campus -- at least, I hope to see that, as long as we're smart enough not to kill off SAX's original advantages in round two. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Feb 26 11:58:27 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:29 2004 Subject: Streams, protocols, documents and fragments In-Reply-To: <36D59762.370372DB@thinlink.com> References: <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> <36D479F1.28D796D9@thinlink.com> <14037.20555.720649.689770@localhost.localdomain> <36D59762.370372DB@thinlink.com> Message-ID: <14038.35650.792155.191827@localhost.localdomain> Tom Harding writes: > David Megginson wrote: > > > It's not necessarily the raw XML that will be delivered to the > > application, however > > How about we create an architecture where it is? I have a sample > implementation of XP where an endpoint fires an event when it has > received a document. The event wraps a DOM Document, so the XML > has been parsed, but any real processing, such as rendering, is > still up to the application. It all depends on your layering approach: personally, if typing information is available, I'd rather use that to build optimised internal representations first and hand those off to the application -- a general-purpose DOM would be *extremely* inefficient for handling things like vector graphics or 3D worlds (to name only two), though it is always possible to expose their optimised object models through a DOM interface later if necessary. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simonstl at simonstl.com Fri Feb 26 13:16:04 1999 From: simonstl at simonstl.com (Simon St.Laurent) Date: Mon Jun 7 17:09:29 2004 Subject: XML with CSS in IE5 In-Reply-To: <36D670BB.E3CCD199@hnc.co.kr> Message-ID: <199902261314.IAA11113@hesketh.net> At 07:00 PM 2/26/99 +0900, Chung, Byung Hee wrote: >IE5 supports XML. >In IE5, XML document without stylesheet is displayed as tree. >I want to see XML document with CSS stylesheet in IE5. >How can I link CSS information to XML document ? You need to use a processing instruction like the one below: Where the value of the href is a URI pointing to your stylesheet. It goes in the prolog (after the declaration but before the first element). You can also add a title and rel pseudo-attributes to identify further style sheet properties, though applications don't seem to have caught on to that. See the proposed recommendation (still subject to change) at: http://www.w3.org/TR/PR-xml-stylesheet It works fine with both IE5b2 and Netscape's Gecko. Simon St.Laurent XML: A Primer / Building XML Applications (April) Sharing Bandwidth / Cookies http://www.simonstl.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Fri Feb 26 13:21:39 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:09:29 2004 Subject: recursing root element Message-ID: <007701be618a$c02cdfa0$aaaddccf@ix.netcom.com> A quick question: is <xdoc><xdoc>some text</xdoc></xdoc> i.e. a recursing root element, a legal XML document? From my reading of section 2.1 it isn't, but the MSXML parser lets it stand, so I'm probably wrong. Frank Frank Boumphrey XML and style sheet info at Http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (March) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From bckman at ix.netcom.com Fri Feb 26 13:22:29 1999 From: bckman at ix.netcom.com (Frank Boumphrey) Date: Mon Jun 7 17:09:29 2004 Subject: XML with CSS in IE5 Message-ID: <007601be618a$bf41e360$aaaddccf@ix.netcom.com> >How can I link CSS information to XML document ? 
See http://www.w3.org/TR/PR-xml-stylesheet Frank Frank Boumphrey XML and style sheet info at Http://www.hypermedic.com/style/index.htm Author: - Professional Style Sheets for HTML and XML http://www.wrox.com CoAuthor: XML applications from Wrox Press, www.wrox.com Author: Using XML on the Web (March) ----- Original Message ----- From: Chung, Byung Hee To: Sent: Friday, February 26, 1999 5:00 AM Subject: XML with CSS in IE5 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From crism at oreilly.com Fri Feb 26 13:23:05 1999 From: crism at oreilly.com (Chris Maden) Date: Mon Jun 7 17:09:30 2004 Subject: XML with CSS in IE5 In-Reply-To: <36D670BB.E3CCD199@hnc.co.kr> (chtino@hnc.co.kr) Message-ID: <199902261320.IAA05100@ruby.ora.com> [Chung, Byung Hee] > IE5 supports XML. > In IE5, XML document without stylesheet is displayed as tree. I > want to see XML document with CSS stylesheet in IE5. How can I link > CSS information to XML document ? See . Generally, xml-dev is for developers of XML software or applications; for content and other general questions, XML-L is preferred (send e-mail to listserv@listserv.heanet.ie or your nearest local LISTSERV with no subject and body "subscribe xml-l My Full Name"). -Chris -- http://www.oreilly.com/people/staff/crism/ +1.617.499.7487 90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek> xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From murata at apsdc.ksp.fujixerox.co.jp Fri Feb 26 13:25:33 1999 From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto) Date: Mon Jun 7 17:09:30 2004 Subject: Content-Document-Type: was (Re: MIME types vs. DOCTYPE) Message-ID: <199902261322.AA03654@murata.apsdc.ksp.fujixerox.co.jp> I am a co-author of RFC 2376 (XML media types). I am attaching two of my e-mails about text/xml and application/xml. I am quite sympathetic to Jonathan, but I do not think that the URI of the DTD is always appropriate. Tim's suggestion (a namespace URI plus the root element type) sounds very interesting. Since so many people (MIME, HTTP, XHTML, XML, ...) are very much interested, I am not sure if xml-dev is the right place to make the final call. I still do not know where the right place is. But I promise to read the discussion on xml-dev carefully. Cheers, Fuji Xerox Information Systems Makoto ---------------------------------------------------------------------- Mail 1. Regarding additional parameters for text/xml and application/xml, there has been some private discussion between Paul Hoffman, Frank Dawson and me (see Mail 2). I have come to like the idea of adding two optional parameters to text/xml and application/xml. They are "profile" and "map". The "profile" parameter specifies a URI, which indicates the profile (e.g., XHTML, SMIL, and MathML) of the XML document. This URI may coincide with the URI of the DTD but may differ. By introducing this parameter, W3C can avoid the task of registering many MIME subtypes. The "map" parameter specifies the conversion table from the specified charset to Unicode. 
It appears that most existing charsets have more than one possible conversion table and it is a good idea to solve this conversion issue in one place. Issue 1: A long time ago, Michael said that having special types (e.g., text/xhtml) helps negotiation. Does the "profile" parameter make such negotiation impossible? Issue 2: Some classes of XML applications might require more additional information. (Another parameter was proposed by Frank Dawson.) Issue 3: Confusion between the namespace URI, schema URI, DTD URI, and this URI. Issue 4: Adding the "map" pseudo-attribute to encoding declarations. Issue 5: Do we need another parameter whose value is either "external dtd subset", "external parameter entity", "external parsed entity", or "document entity"? (Note that some MIME entities are an external parsed entity AND a document entity at the same time. Some external DTD subsets can also be used as external parameter entities.) Cheers, Makoto ------------------------------------------------------------------------- Mail 2 Needs more information in the MIME header! MURATA Makoto, Paul Hoffman, Frank Dawson, Jim Whitehead 1. Problem statement We would like the MIME parser to be able to dispatch different sorts of XML documents to different applications, such as specialized programs that handle just one type of XML document. Because MIME parsers do not look inside the MIME parts, identifying the sort of document must be done in the MIME headers. However, neither text/xml nor application/xml allows such information. 2. Possible solutions Three approaches have been proposed. They are (1) specialized media types such as text/calendar, (2) a top-level media type xml and its subtypes such as xml/calendar, and (3) a new parameter "externalid" of text/xml and application/xml. (1) Specialized mime types For each specialized application of XML, we introduce a new subtype. It may introduce more parameters and might even have some added security consideration. 
This is the approach that has been assumed by the authors of RFC 2376. Pros: Each application will be documented by some RFC. Cons: When the MIME parser does not know of such a subtype, the only available fallback is text/plain or application/octet-stream. That is, the MIME parser cannot invoke generic XML parsers/viewers, but has to display the document as a plain text file or save the document in a file. Each specialized application will require a new subtype registration, which takes a lot of time and therefore can have long delays. (2) New top-level media type xml and its subtypes Pros: Fallback to "xml/plain" allows the use of generic XML parsers/viewers. We can also lift the line termination rule of the top-level media type "text". Cons: It is extremely difficult to register a new top-level media type and therefore can have long delays (practically, who wants to do this?). The default behavior is probably not a good enough reason. Each specialized application will require a new subtype registration, which takes a lot of time and therefore can have long delays. (3) A new parameter "externalid" for text/xml and application/xml This parameter specifies the externalID from the DOCTYPE of the XML document (if the DOCTYPE is present). Examples would be: Content-type: text/xml; externalid="http://www.foo.com/whizzy.dtd" or Content-type: application/xml; charset="utf-16be"; externalid="-//IETF//DTD vCard v3.0//EN" Pros: This is probably the easiest solution which also provides XML-specific fallback. Requires no registration with a central authority. Other parameters can be added in the future when we have new schemata or when we find new usage patterns for DTDs. The definitions for those parameters can define which sets of parameters can appear together. Cons: DTD's do not necessarily exist. For example, RDF metadata do not have DTD's. The use of DTD's to choose applications might be an abuse of DTD's. 
Moreover, some DTD's might be handled by many different programs on a system, such as by a specialized processor and one or more XML browsers. On the other hand, some applications (such as XML browsers) handle a variety of DTD's. There will be new schemata that will probably overshadow DTD's, and these schemata may not use externalIDs the same way they are used in today's DTDs. (4) Yet another parameter "optinfo" or "ADD-PARAM" On top of (3), provide yet another parameter "optinfo" (list of name-value pairs) or "ADD-PARAM" (plain text) for additional information. Pros: Some applications of XML require even more application-specific information so as to launch appropriate software tools. For example, if iCalendar information is captured by an XML DTD, the text/xml or application/xml MIME header has to mimic "method", "component", and "optinfo" of text/icalendar. (The latest internet draft for text/icalendar is available at: ftp://ftp.isi.edu/internet-drafts/draft-ietf-calsch-ical-12.txt) If this parameter is not available, people will abuse the parameter "externalid" by providing different names for the same DTD so as to express more information. This parameter stops such abuse. Cons: The "optinfo" parameter makes it difficult for a simple MIME parser to know what to expect in the parameter. The "ADD-PARAM" parameter does not have this problem, but does not have enough expressiveness. Note: None of the above proposals handle non-monolithic XML documents very well, since different islands of non-monolithic XML documents belong to different namespaces and thus different schemata. For example, the MIME parser cannot invoke vCard applications if the vCard is embedded by the namespace mechanism. xml-dev: A list for W3C XML Developers. 
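The dispatch-with-fallback idea behind option (3) in Mail 2 can be sketched roughly as follows. This is only an illustration in Python using the standard email package: the "externalid" parameter was a proposal in this thread and is not a registered parameter of text/xml or application/xml, and the handler names and the HANDLERS table are hypothetical.

```python
from email.message import Message

# Hypothetical table mapping DOCTYPE external identifiers to local handlers.
HANDLERS = {
    "-//IETF//DTD vCard v3.0//EN": "vcard-app",
}

def choose_handler(content_type_header):
    """Pick a handler name from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    if msg.get_content_type() not in ("text/xml", "application/xml"):
        return "not-xml"
    # "externalid" is the parameter proposed in the thread (an assumption here).
    external_id = msg.get_param("externalid")
    # Unknown or missing identifiers fall back to a generic XML viewer,
    # which is the XML-specific fallback the proposal is after.
    return HANDLERS.get(external_id, "generic-xml-viewer")
```

Calling choose_handler with the second example header from Mail 2 would select the hypothetical vCard handler, while an unrecognised identifier still reaches a generic XML tool instead of falling back to text/plain.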
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From b.laforge at jxml.com Fri Feb 26 13:56:54 1999 From: b.laforge at jxml.com (Bill la Forge) Date: Mon Jun 7 17:09:30 2004 Subject: events vs callbacks Message-ID: <00f301be618f$1dcacc40$c9a8a8c0@thing2> From: David Megginson >In two or three years, I fully expect to see people (unknowingly) >using SAX on the next-generation of palmtops with wireless Internet >access, choosing a new sweater to buy from Eddie Bauer while they're >riding the bus back to campus -- at least, I hope to see that, as long >as we're smart enough not to kill off SAX's original advantages in >round two. David, It is hard to make a definitive case when there are so many right answers. And with monolithic parsers, you are right. But with event objects, we can go with a layered approach and achieve a better integration between applications and parsers. And in palmtops, you know those applications will need that close integration just to keep the footprint small. So for the very applications you are concerned with, the larger jar file needed by an event-object approach could be offset by that tighter integration. I hate bloat, but there's more than one way it creeps into the code. MDSAX would be smaller with event objects. Total application bloat is significantly reduced when you can increase the utility of the components--you have just cut out a lot of duplication and work-arounds! And with the continued pace of hardware, the real issue isn't the size of a resident jar file, but the size of the jar files that need to be downloaded. 
If we can increase the utility of our components, then we will likely have those components resident on the palmtop. And the applications being downloaded will be ever so much smaller. Bill xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simpson at polaris.net Fri Feb 26 13:59:00 1999 From: simpson at polaris.net (John E. Simpson) Date: Mon Jun 7 17:09:30 2004 Subject: XML with CSS in IE5 Message-ID: <3.0.32.19990226085416.007e32c0@polaris.net> At 07:00 PM 2/26/99 +0900, Chung, Byung Hee wrote: >In IE5, XML document without stylesheet is displayed as tree. >I want to see XML document with CSS stylesheet in IE5. >How can I link CSS information to XML document ? Include the following line at the head of your document: replacing "http://your.css.URI/here.css" with, of course, the URI of the stylesheet. [Warning: IE5's support for CSS is still rather selective. A list of features supported and non-supported is at http://www.microsoft.com/workshop/author/css/reference/attributes.asp, but the list of non-supported ones doesn't tell the whole story -- at least if you want to use CSS2 rather than CSS1.] ============================================================= John E. Simpson | It's no disgrace t'be poor, simpson@polaris.net | but it might as well be. | -- "Kin" Hubbard xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rbourret at ito.tu-darmstadt.de Fri Feb 26 14:11:52 1999 From: rbourret at ito.tu-darmstadt.de (Ronald Bourret) Date: Mon Jun 7 17:09:30 2004 Subject: recursing root element Message-ID: <01BE6199.72E121C0@grappa.ito.tu-darmstadt.de> Frank Boumphrey wrote: > is > some text > > i.e. a recursing root element, a legal XML document, > > >From my reading of section 2.1 it isn't but the MSXML parser lets it stand, > so I'm probabably wrong. It is well-formed and therefore legal. With a DTD of: ]> it would also be valid. Note that there is a difference between a root element and a root element type. XML documents have both. The root element is the outermost element and is discussed in section 2.1. In this case, it is the outer element. The root element type is the type of the root element and is declared in a DOCTYPE statement; it is discussed in section 2.8. Your example did not specify a root element type; the DOCTYPE statement I added declares it to be xdoc. There is nothing to stop elements with the root element type from occurring elsewhere in the document. -- Ron Bourret xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jborden at mediaone.net Fri Feb 26 14:29:33 1999 From: jborden at mediaone.net (Jonathan Borden) Date: Mon Jun 7 17:09:30 2004 Subject: Content-Document-Type: was (Re: MIME types vs. DOCTYPE) In-Reply-To: <199902261322.AA03654@murata.apsdc.ksp.fujixerox.co.jp> Message-ID: <000001be6193$89002f10$d3228018@jabr.ne.mediaone.net> MURATA Makoto wrote: > > > I am a co-author of RFC 2376 (XML media types). I am attaching two > of my e-mails about text/xml and application/xml. > > I am quite sympathetic to Jonathan, but I do not think that the URI > of the DTD is always appropiate. Tim's suggestion (a namespace URI plus > the root element type) sounds very interesting. I was not suggesting that the URI be the URI of the DTD, rather that this be a 'standalone' URI in the same fashion of the namespace URI. This provides the same type of unique document type identification as does the namespace URI without predicating the existence of a DTD. In some cases the DTD URI might be appropriate, in other cases a schema URI might be appropriate, in other cases a uuid. In your attached e-mails there are several excellent solutions to this problem in a similar vein. Use of a Content-Type parameter for text/xml and application/xml has the same expressiveness as a distinct header, and since we are in reality subtyping the content type this is perhaps more appropriate, and would easily fit into RFC 2376. I assume that this parameter could be used with other content-types such as text/sgml? Jonathan Borden http://jabr.ne.mediaone.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jtauber at jtauber.com Fri Feb 26 14:31:00 1999 From: jtauber at jtauber.com (James Tauber) Date: Mon Jun 7 17:09:30 2004 Subject: recursing root element Message-ID: <00e101be6194$4812d380$0300000a@othniel.cygnus.uwa.edu.au> >Aquick question > >is >some text > >i.e. a recursing root element, a legal XML document, > >>From my reading of section 2.1 it isn't but the MSXML parser lets it stand, >so I'm probabably wrong. Yes, it's a legal (well-formed) XML document. I think the reason 2.1 might have confused you is that you are taking "element" as if it were "element type". Section 2.1 says there is only one *element* that is the root, not that there is only one *element type*. The one *element* that is the root can be of the same type as other elements in the document (as in your example). Hope this helps. James -- James Tauber / jtauber@jtauber.com / www.jtauber.com Associate Researcher, Electronic Commerce Network Curtin University of Technology, Perth, Western Australia Full-day XML Tutorial @ WWW8 : http://www8.org/ Maintainer of : www.xmlinfo.com, www.xmlsoftware.com and www.schema.net xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jonathan at texcel.no Fri Feb 26 14:56:51 1999 From: jonathan at texcel.no (Jonathan Robie) Date: Mon Jun 7 17:09:30 2004 Subject: XML with CSS in IE5 In-Reply-To: <199902261314.IAA11113@hesketh.net> References: <36D670BB.E3CCD199@hnc.co.kr> Message-ID: <3.0.3.32.19990226095617.00b69690@pop.mindspring.com> At 08:17 AM 2/26/99 -0500, Simon St.Laurent wrote: >At 07:00 PM 2/26/99 +0900, Chung, Byung Hee wrote: >>IE5 supports XML. >>In IE5, XML document without stylesheet is displayed as tree. >>I want to see XML document with CSS stylesheet in IE5. >>How can I link CSS information to XML document ? > >You need to use a processing instruction like the one below: > Now suppose I want to use XSL to do an initial transformation, then format the results with CSS (until we get XSL formatting implemented in our web browsers). How would I do that? Jonathan jonathan@texcel.no Texcel Research http://www.texcel.no xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tomh at thinlink.com Fri Feb 26 16:06:37 1999 From: tomh at thinlink.com (Tom Harding) Date: Mon Jun 7 17:09:30 2004 Subject: Streams, protocols, documents and fragments References: <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> <36D479F1.28D796D9@thinlink.com> <14037.20555.720649.689770@localhost.localdomain> <36D59762.370372DB@thinlink.com> <14038.35650.792155.191827@localhost.localdomain> Message-ID: <36D6C618.D44846B6@thinlink.com> David Megginson wrote: > -- a general-purpose DOM would be *extremely* inefficient for handling > things like vector graphics or 3D worlds (to name only two), though it > is always possible to expose their optimised object models through a > DOM interface later if necessary. In lots of applications, the data can't stay in an XML representation for very long anyway, because of what you're integrating it with/displaying it on/routing it through/converting it to/storing it in/etc... I view the DOM as a standard, OO way of manipulating the contents of a document. It lets applications get work done, even without taking an end-to-end OO approach. Perhaps I'm showing my bias here ;D xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From david at megginson.com Fri Feb 26 17:09:46 1999 From: david at megginson.com (David Megginson) Date: Mon Jun 7 17:09:30 2004 Subject: Architecture docs for SAX In-Reply-To: References: Message-ID: <14038.54457.783374.782124@localhost.localdomain> Merriam, Rudyard writes: > Are there architecture documents for SAX or other parsers? Something I could > use if I wanted to roll my own in C++? Not yet -- unfortunately, you have to reverse-engineer them from the Javadoc. I'd like to fix that for ModSAX. All the best, David -- David Megginson david@megginson.com http://www.megginson.com/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ken at bitsko.slc.ut.us Fri Feb 26 17:26:22 1999 From: ken at bitsko.slc.ut.us (Ken MacLeod) Date: Mon Jun 7 17:09:30 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah. )) In-Reply-To: David Megginson's message of "Fri, 26 Feb 1999 06:53:04 -0500 (EST)" References: <002b01be60e0$dc774f60$c9a8a8c0@thing2> <14038.34584.490425.124743@localhost.localdomain> Message-ID: >>>>> "David" == David Megginson writes: > Bill la Forge writes: >> >Event-based programming existed before people started >> >encapsulating events in structures or objects. 
I'd define SAX >> as >an event-based API that reports events using callbacks. >> >> But why are we not taking advantage of having the events as >> objects? I've tried to second guess why this is so, but I >> think the arguments in favor of object-based events are >> stronger: the added overhead is balanced by greater simplicity >> and subsequently less overhead in other areas; the added >> flexibility adding additional utility to all conformant code. David> From a design-pattern perspective, Bill is absolutely right: David> encapsulation and abstraction are big winners, and the way David> that he suggests is the proven method for building a robust David> system. David goes on to say that using event objects will increase the size of the implementation, increase the learning curve, and increase the burden of documentation. These statements seem to be based on the _unstated_ assumption that unique event classes will be used for every type of event. Why is it necessary to use unique event classes for each type of event? Why not use a NamedNodeList or simple dictionary type that can hold the properties of the event? If a simple dictionary type is used, then the implementation will increase only by the size of one class (assuming a built-in class isn't used), the documentation can be written in terms of the XML Information Set (which XML application developers will need to be familiar with anyway), and it would still provide all the advantages Bill la Forge is describing. I used this design pattern in writing the Perl binding to SAX and, although it hasn't been time-tested yet, the initial implementations show a lot of promise and flexibility. The draft for Perl SAX is available at: -- Ken MacLeod ken@bitsko.slc.ut.us xml-dev: A list for W3C XML Developers. 
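The dictionary-as-event idea Ken describes can be illustrated with a small sketch. It uses Python's standard xml.sax module rather than the Perl SAX binding mentioned in the message, and the class name DictEventAdapter, the collect_events helper, and the dictionary keys ("type", "name", "attributes", "data") are assumptions made up for this example, not part of any SAX specification.

```python
import xml.sax

class DictEventAdapter(xml.sax.ContentHandler):
    """Turn SAX callbacks into plain dictionaries handed to one sink."""

    def __init__(self, sink):
        super().__init__()
        self.sink = sink  # any callable that accepts one event dictionary

    def startElement(self, name, attrs):
        self.sink({"type": "start_element", "name": name,
                   "attributes": dict(attrs.items())})

    def endElement(self, name):
        self.sink({"type": "end_element", "name": name})

    def characters(self, content):
        # A parser may deliver text in several chunks, so there can be
        # more than one "characters" event per run of text.
        self.sink({"type": "characters", "data": content})

def collect_events(xml_bytes):
    """Parse a document and return the list of event dictionaries."""
    events = []
    xml.sax.parseString(xml_bytes, DictEventAdapter(events.append))
    return events
```

Only one small adapter class is added, and a new property can be carried by adding a key to the dictionary instead of defining a new event class, which is the trade-off the message argues for.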
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From RDaniel at DATAFUSION.net Fri Feb 26 17:39:29 1999 From: RDaniel at DATAFUSION.net (Ron Daniel) Date: Mon Jun 7 17:09:30 2004 Subject: events vs callbacks (was Re: SAX2 (was Re: DOM vs. SAX??? Nah . )) Message-ID: <0D611E39F997D0119F9100A0C931315C2E0FD7@datafusionnt1> Dave Megginson said: [Several good things, ending with ...] > as long > as we're smart enough not to kill off SAX's original advantages in > round two. > [Ron Daniel] So now for a couple of comments from the peanut gallery. The first is the standard warning about second system syndrome. Its a syndrome because its a mistake that keeps getting made. So, while I was starting to get pretty interested in the discussion around what seem to be fairly large changes to SAX for version 2, Dave's sentence caused me to stop and ask myself what gripes I have with SAX now. I've only used SAX for four parsers, so I can't claim extensive experience. But from the experience I do have, I'm actually pretty happy with the overall structure. I particularly like the way I can use several document handlers to modularize the parsing of documents with complex DTDs. But what I find myself really wishing SAX had is a DTDHandler with methods like startElementDeclaration(), startAttlistDeclaration(), ... It should also have methods that pull apart the content model for me. Other features like control of parsing options (validating or not, control over namespace expansion, ...) will also be needed. But I don't currently feel a need to change the basics of the DocumentHandler interface. Later, Ron Daniel Jr. DATAFUSION, Inc. 
139 Townsend Street, Suite 100 San Francisco, CA 94107 415.222.0100 fax 415.222.0150 rdaniel@datafusion.net xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From wunder at infoseek.com Fri Feb 26 17:51:39 1999 From: wunder at infoseek.com (Walter Underwood) Date: Mon Jun 7 17:09:30 2004 Subject: Content-Document-Type: was (Re: MIME types vs. DOCTYPE) In-Reply-To: <199902261322.AA03654@murata.apsdc.ksp.fujixerox.co.jp> Message-ID: <3.0.5.32.19990226094226.03c98770@corp> I feel that Content-document-type is a poor idea. It put something specific to XML in a generic header. Not all clients care about the doctype. Some (like search engines) don't need to render it. Others may just need to cache it. If a document is XHTML, and should have a default rendering, I'd call that a processing instruction: or whatever does the job. The objection about thin clients or palmtops not wanting to download large files doesn't really hold water. XML will generally be the smallest files. Mine are almost always smaller than the corresponding HTML. Powerpoint, PDF, JPEG -- those are big files. Adding an XML-specific HTTP header line makes HTTP 1.1 more complex (shudder), and imposes an extra coding and testing burden on HTTP implementations. Also, it does nothing for XHTML over other transports, like SMTP or FTP. Essentially, this is document information, not protocol information. It belongs in the document. To describe the document out-of-line, use RDF, not HTTP headers. Pragmatically, HTTP Content-type isn't even reliable. Somebody will decide that Excel and XML are the same thing, and start serving spreadsheets as text/xml. 
Cell phones have to deal with that world, and adding things to the HTTP spec doesn't fix ignorant sysadmins. And lots of web servers serve application-specific files (MS Word, Powerpoint, Excel) as application/octet-stream in order to force the browser to put up a save box rather than display them in the frame. We see this sort of stuff all the time with the search engine. XHTML Spec comment: the spec doesn't mention application/xml. It should. If application/xml is never appropriate for XHTML (say, the UTF-16 encoding is forbidden), then say so. XHTML Spec comment: Are the Strict, Transitional, and Frameset DTDs subsets or extensions? Or neither? Is one a subset of another? These intentions should be spelled out in the spec so that future versions won't break them. wunder -- Walter R. Underwood wunder@infoseek.com wunder@best.com (home) http://software.infoseek.com/cce/ (my product) http://www.best.com/~wunder/ 1-408-543-6946 xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1 To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From roddey at us.ibm.com Fri Feb 26 18:49:40 1999 From: roddey at us.ibm.com (roddey@us.ibm.com) Date: Mon Jun 7 17:09:30 2004 Subject: Well-formed vs. valid Message-ID: <87256724.0067440A.00@d53mta03h.boulder.ibm.com> >>FYI, our (IBM's) new version 2 architecture parsers do this. We have a >>pluggable architecture, and one of the plug ins is a validator. The low >>level scanner uses this to validate content before it sends it out through >>the internal event APIs. 
>>So, if you are wiring together a SAX style parser,
>>you just wire the internal events to the SAX events and you have a
>>validating SAX parser (actually we have that combination already provided
>>for you as a canned parser, but you can do other variations as well.)
>
>Big question: can I plug someone else's SAX parser into your scanner, and
>then have your validation component work on my SAX events? While it's
>unlikely that I'd want to plug a different SAX parser in, it's quite
>possible that I'd want to work with the SAX events (transforming with XT,
>for instance) before performing validation.
>

You can, it's just less efficient. The validators have to support 're-validation' or 'after the fact' validation, whatever you want to call it (e.g. revalidating a modified DOM tree.) It's just that, internally and in a DOM that we write for our parser specifically, we can take advantage of info that will significantly speed up the process. Once it's passed through to the outside world (via some general API that cannot pass on our information) and hence only the element names exist, the validator has more work to do to do the validation, but it does work.

For an event API, you will have to maintain an 'element stack' in order to gather up the info required to do the revalidation (a DOM tree already inherently represents that.) It's just a simple push down stack of the elements along the current nesting hierarchy, and the children of those elements. When you get to the end of an element, call the validator with the child list, then pop that top element off and go back to working on the previous one.

The low level scanner (while parsing) maintains a stack like this for validation, though it only has to maintain numbers, not names. If you use our internal event API, you can also store numbers for revalidation (as would a DOM written specifically for our system.) As you can imagine, just doing number comparisons is much faster than doing hashed string comparisons.
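The element stack described above can be sketched roughly as follows. This is only an illustration of the bookkeeping, not IBM's API: `ElementStack` and `ChildListValidator` are hypothetical names, and the stack stores element names (the slower, generic case) rather than the parser-internal numbers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical callback standing in for a content-model validator. */
interface ChildListValidator {
    boolean validate(String element, List<String> children);
}

/** Push-down stack of the open elements, each with the children seen so far. */
class ElementStack {
    private static final class Frame {
        final String name;
        final List<String> children = new ArrayList<>();
        Frame(String name) { this.name = name; }
    }

    private final Deque<Frame> stack = new ArrayDeque<>();
    private final ChildListValidator validator;

    ElementStack(ChildListValidator validator) { this.validator = validator; }

    /** Call from the start-element event: record the child, then descend. */
    void startElement(String name) {
        Frame parent = stack.peek();
        if (parent != null) parent.children.add(name);
        stack.push(new Frame(name));
    }

    /** Call from the end-element event: validate the child list, then pop. */
    boolean endElement() {
        Frame done = stack.pop();
        return validator.validate(done.name, done.children);
    }
}
```

Wiring `startElement` and `endElement` to the corresponding SAX events gives after-the-fact validation one element at a time.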
Of course that's not to say that you cannot maintain a string pool yourself and really only store numbers in your stack (for speed) and then just get the element name text references when it's time to validate. But that's still not as fast as using our numbers, since they already exist and the validator knows the element content models in terms of those numbers.

From cowan at locke.ccil.org Fri Feb 26 19:01:59 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:31 2004
Subject: recursing root element
References: <007701be618a$c02cdfa0$aaaddccf@ix.netcom.com>
Message-ID: <36D6EF3E.CCD5FD76@locke.ccil.org>

Frank Boumphrey wrote:

> From my reading of section 2.1 it isn't but the MSXML parser lets it stand,
> so I'm probably wrong.

It's legal. 2.1 says not that there can be only one element with the same element type as the root element, but that there can be only one element in root position. In other words, these are illegal:

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
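As a small check of the distinction drawn in the "recursing root element" reply, here is a sketch that asks a parser whether two documents are well-formed. It assumes the JAXP SAX API from the Java standard library, which postdates this thread; `RootCheck` and the sample documents are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class RootCheck {
    /** Returns true if the parser accepts the text as a well-formed document. */
    static boolean isWellFormed(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Well-formedness errors surface as fatal errors (SAXParseException).
            return false;
        }
    }
}
```

With this helper, `RootCheck.isWellFormed("<doc><doc/></doc>")` should be true (the root element type recurs inside the root), while `RootCheck.isWellFormed("<doc/><doc/>")` should be false (two elements in root position).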
From ralph at fsc.fujitsu.com Fri Feb 26 19:55:59 1999
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:09:31 2004
Subject: HyBrick Download Problem Fixed
Message-ID: <3.0.5.32.19990226115346.00a52210@pophost.fsc.fujitsu.com>

To the several people who reported problems with the latest HyBrick self-extracting file:

Thanks for contacting me and my regrets on the inconvenience. The problem has now been fixed. Once again, the latest version of HyBrick, V0.82, is available at:

http://www.fsc.fujitsu.com/hybrick/

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation

From cowan at locke.ccil.org Fri Feb 26 20:08:30 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:31 2004
Subject: Comments on WD-html-in-xml-19990224
Message-ID: <36D6FEF2.9E92495E@locke.ccil.org>

1) I believe that the introduction of a media type "text/xhtml" is a mistake. Instead, it would be better to attach a media-type attribute specifying the formal public identifier of the DTD. This would be allowed on either "text/xml" or "text/html", by the appropriate IETF process.
Here's an example:

    text/xml; dtd="-//W3C//DTD XHTML 1.0 Strict//EN"

This would distinguish XHTML from HTML, different versions of HTML from each other, different DTDs of HTML from each other, and would continue to be useful in future for new XML document types.

2) The border attribute of the img element is declared %Length; in the Transitional DTD, but %Pixels; in the Frameset DTD. These are both aliases for CDATA, but post-XML validation may need to distinguish them.

3) There is no need to have fully separate DTDs for Frameset and Transitional; they can share using conditional sections, just as in HTML 4.0.

4) There are a variety of useless differences between the Frameset and Transitional DTDs, mostly involving whitespace. If the two DTDs are kept separate, these should be ironed out.

5) The LanguageCode parameter entity is defined as "NAME" in HTML 4.0, but "CDATA" in XHTML. The strictest equivalent of NAME in XML is NMTOKEN, which should be used.

6) SGML rules remove up to one line boundary at the start and/or the end of an element's content. Equivalently, up to one line boundary each is removed after a start-tag and before an end-tag. This rule affects the style, script, and pre elements, especially pre, and should be stated in the main document, as XML-based systems will have to emulate it.

7) In the comment preceding the OLStyle parameter entity: for "arablic" read "arabic".

8) Since XML is case-sensitive, OLStyle can be explicitly defined as "(1|a|A|i|I)". That would not work in HTML 4.0 because a and A, and i and I, would look the same to SGML. Consequently, LIStyle can be explicitly defined as "(%ULStyle;|%OLStyle;)" and the corresponding comment removed.

9) The content model of table does not match HTML 4.0, which requires the TBODY element but allows both start-tag and end-tag to be omitted. The precedent established by XHTML for the head and body elements is that such elements must appear explicitly.
The table element, however, allows either tbody or tr+ after the optional thead and tfoot. This should be changed to just tbody. If it is decided not to do this (on HTML 3.2 compatibility grounds or otherwise), the design decision should be documented.

I have not reviewed the XHTML Strict DTD at this time.

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From cowan at locke.ccil.org Fri Feb 26 20:25:51 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun 7 17:09:31 2004
Subject: Are notations dead, or just pining for the fjords? (was Re: SAX and delayed entity loading)
References:
Message-ID: <36D702FB.EE81A590@locke.ccil.org>

[This is an old e-mail that I wrote in early December, but for some reason never posted.]

Liam R. E. Quin wrote:

> There are several fairly big problems with notations as defined.
> (1) the suggestion that one use the system identifier as a program to
> run makes them a major security hole.

Indeed. Wherefore that should not be done. Clause 4.7 of the XML Rec was worded poorly (too subtly):

# [...] an external identifier for the notation which may allow an XML
# processor or its client application to locate a helper application
# capable of processing data in the given notation.

That does not mean, in general, that the external ID should point directly to the helper application, since the application is inevitably system-dependent and the document should not be.
(Tim, I've cc'ed you because I think making this clear(er) would be a useful addition to your XML annotations.)

> (2) the idea that you know the format in advance of images or other
> referenced objects and hard-wire it into your document does not fit
> the web model of content negotiation, in which the client sends
> a list of formats, in order of preference, and the server sends
> back the best available format, converting if necessary.

This point and the following one apply to the use of notations with unparsed entities, which I am not now trying to defend. Notations can be applied to elements as well through NOTATION attributes, and that use I believe to be valuable.

> (4) there is no way to give a notation for XML, since, by definition,
> any external entity with an associated notation is an unparsed entity!
> The distinction between parsed/unparsed should be nothing to do with
> the format at all.

No, there is no way to give a notation for XML and expect it to be parsed automatically --- that's not the same thing at all. If you want to reference a subdocument and do *not* want it parsed, then XML unparsed entities are plausible. An example would be an XML TOC, which can be validated, rendered, etc. without incorporating the individual chapters (also in XML) directly into it, as a parsed entity would do.

> The word "entity" confuses many people who come to XML (and SGML!) for
> the first time. For one thing, it's already used in the relational
> database world to mean something entirely different.

    Terminological buccaneering is unavoidable.
        -- Northrop Frye

> For another
> thing, XML has at least five meanings for the word entity -- and even
> the XML specification doesn't always say which kind it means at any
> given point.

No, there are five kinds of entities. (It's true that CDATA has two meanings, however, which is regrettable; in SGML it has even more.)

> Frankly, the term "file" would be better for an external entity.
An external entity can be the result of a query as well, notably if its system id contains a "?".

--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)

From Michael.Orr at Design-Intelligence.com Fri Feb 26 22:19:31 1999
From: Michael.Orr at Design-Intelligence.com (Michael.Orr@Design-Intelligence.com)
Date: Mon Jun 7 17:09:31 2004
Subject: Streams, protocols, documents and fragments
Message-ID:

> -----Original Message-----
> From: keshlam@us.ibm.com [mailto:keshlam@us.ibm.com]
> Subject: RE: Streams, protocols, documents and fragments
>
> Personal opinion: The right way out of the "never-ending document" problem
> is to declare that the stream is a stream of transaction documents, NOT a
> single huge document in its own right.

Exactly. In many cases -- I suspect the vast majority -- the XML DTD or schema, parse bite size, and transaction considerations all map together very effectively at a relatively fine granularity. The remaining questions, at the level of the containing interchange stream(s), can be approached as protocol design and separated by layering from the document considerations. Using document modeling to describe the stream would simultaneously overkill the structure side and fail to engage with protocol state issues.

This is not to deny the requirement for document scalability -- there are very good reasons to make sure that XML tools are prepared to cope with huge documents.
But: a huge document is not a good implementation for a transaction stream.

Regards,
Mike

----------------------------------------
Michael Orr, CTO, VP R&D
Design Intelligence Inc, Seattle WA USA
http://www.design-intelligence.com
pager:888-688-4609 fax:206-343-7750

From martind at netfolder.com Fri Feb 26 23:49:05 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:09:31 2004
Subject: Streams, protocols, documents and fragments
In-Reply-To:
Message-ID:

Hi

I thought this would be of interest to the people in the XML Streams, protocols, documents and fragments thread.

Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

---------------- Forwarded Message ----------------------
Date: Fri, 26 Feb 1999 10:30:26 PST
From: Chris Newman
Subject: 44th IETF-MINNEAPOLIS, MN: APPLCORE
To: ietf-applcore@imc.org

Welcome to the ietf-applcore mailing list. We now have a tentative BOF slot in Minneapolis:

APPLCORE has tentatively been scheduled as follows:

Monday, March 15 at 1300-1500
other groups scheduled at that time: ion, rmonmib, idr, ipsec, megaco, ippm, uswg

-----

I'm working on a detailed agenda for the BOF; feel free to post suggested topics. Here's the proposed charter I drafted, which will be discussed at the BOF:

-----

Application core protocol WG (APPLCORE)

The IETF has traditionally developed application protocols directly on top of a raw TCP stream. However, there is a growing set of problems which many application protocols have to solve regardless of what the protocols do.
This WG will identify the common problems that deployed IETF protocols have solved, identify the successes and failures that deployed IETF protocols made when addressing these problems and design a simple core protocol to address these problems. This core protocol may then be used by future application protocols to simplify both the process of protocol design and the complexity of implementing multi-protocol clients or servers.

In order to keep the WG in focus, the following items are explicitly out-of-scope:

* Backwards compatibility with existing application protocols

  Backwards compatibility often compromises correct design. If this WG is successful it will impact a great number of future protocols, and thus the design errors which backwards compatibility might dictate must be avoided.

* Transport layers other than TCP/IP

  This has been a rathole in too many other WGs.

* Protocol models outside the traditional IETF client-server TCP application protocol model.

  The IETF doesn't have sufficient past experience in these areas.

* New features

  If a problem hasn't been solved in at least two deployed IETF application protocols, then it is out-of-scope for the base core protocol spec. This does not preclude individuals or other groups from doing extensions to the core protocol which might be used by multiple future application protocols; it just limits the scope of the core spec.

* Normative references to other application protocols or non-public specs

  The core protocol has to stand by itself. It may reference protocol building blocks that have been used by several other application protocols such as ABNF, language tags, UTF-8, domain names, URLs, MIME, SASL, GSSAPI and TLS. It must avoid normative references to full application protocols such as ACAP, HTTP, IMAP, LDAP, and SMTP. It must avoid normative references to any document which is not freely and publicly available on the Internet.
The WG will produce the following output:

* An RFC documenting the problems identified to solve, and giving examples of existing deployed IETF protocols which succeeded or made mistakes when solving those problems. A starting list of problems for the WG to discuss (the WG may choose not to address some of these) follows:

  * connection user authentication and privacy (e.g., SASL and STARTTLS)
  * server capability/extension announcement (e.g., SMTP EHLO)
  * extensible command/response syntax and structure
  * error status tokens and human readable error text issues
  * syntax for transfer of large (multi-line) objects (e.g., dot-stuffing, length counting, chunking)
  * multiple commands in progress at the same time (command ids or tags)
  * unsolicited server messages
  * command pipelining (sending multiple commands without waiting for responses)
  * Structured data representation (e.g., RFC 822-style AV pairs, IMAP s-expressions, LDAP ASN.1, XDR, etc) in command/response syntax.
  * low bandwidth support (e.g., compression layer or packed binary protocol encoding)
  * connection shutdown (QUIT/LOGOUT command)

* A simplicity litmus test to determine if a proposal is acceptably simple. The initial litmus test will be: core protocol spec is less than 25 pages.

* A standards track core application protocol specification which uses the lessons learned from the informational document and fits the litmus test above.

An open source implementation of the complete core protocol must exist prior to IETF last call. The problem identification draft (above) must be completed prior to IETF last call.

The WG may solicit strawmen for the core application protocol from multiple document editors and select the one which is technically best and fits this charter.

The WG may choose to do additional standards track documents which extend the core protocol as long as they are not new features by the above definition.
The WG may choose to do one or more APIs for using the core protocol and adding commands/extensions to it. These might be informational or standards track as deemed appropriate.

---------------- End of Forwarded Message ---------------

From roddey at us.ibm.com Sat Feb 27 00:05:08 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun 7 17:09:31 2004
Subject: Yet another niggling XML syntax question
Message-ID: <87256725.0000562C.00@d53mta03h.boulder.ibm.com>

Does the following violate the 'partial markup in entity' rule of XML?

">
%Whole;

By the time the entity Whole is actually used, the content that it contains has been fully conglomerated into a single entity, therefore it does contain the whole markup inside it, though the parts that made up the entity itself do not.

I sincerely hope that this is not illegal because it would involve never conglomerating the actual text of such built up PEs, because only then would the parser find out what happened. So I'm assuming that this is ok, that the prohibition against partial markup refers to the eventual use of the entity, not to the definition thereof?
From ralph at fsc.fujitsu.com Sat Feb 27 02:32:03 1999
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun 7 17:09:31 2004
Subject: Another Step Toward XML Content
Message-ID: <3.0.5.32.19990226183011.009c07c0@pophost.fsc.fujitsu.com>

In yesterday's announcement (see ANNOUNCE: "HyBrick" Browser, V0.82) I said that a "Document Group" sub-menu has been added in the "XLink/XPointer" menu. I've now installed some files on our Web server that help demonstrate the significance of this feature (that is, in addition to the "Topic Map" sample provided with the distribution).

The "content" file is James Clark's paper on "XML Namespaces." This paper is identical in content to James' original paper. James has been kind enough to let me use it for the demo. The only modifications are to the links.

Running the Demo

If your network connections allow it (and again I have to emphasize that HyBrick doesn't yet support proxy servers, so this demo won't work for everyone), download the file by pointing HyBrick at:

http://www.fsc.fujitsu.com/hybrick/html/xmlns.xml

If all goes well, the file will come up with two areas highlighted in blue:

- the title "XML Namespaces"
- the phrase "XML Namespaces Recommendation" in the first sentence

Click on either of these. A dialog box will then appear, asking you for the URL of the style sheet to use. Enter:

http://www.fsc.fujitsu.com/hybrick/html/styles/HTML32.dsl

Again, if all goes well, the file will start to be displayed, but an error box will appear.
Minus the preamble, these error messages will say:

E: invalid value for "entity-system-id" characteristic
E: 1st argument for primitive "notation-system-id" of wrong type: "#f" not a string

Click OK. You will see the W3C "Namespaces in XML" recommendation paper displayed. The W3C's logo won't appear though. Why? That's what the error messages were all about. HTML is being processed as XML (using a DSSSL style sheet and an SGML "engine"!). That means NOTATION and ENTITY declarations are needed for the logo graphic. Of course, these aren't in the doc, hence the error messages.

Notice that this doc doesn't contain any back links to the original document. Also, "HyBrick" doesn't provide a "back" button. So, short of starting all over again, how do you get back to the original doc (James' paper)? Answer:

- Right click *anywhere* in the document. A menu will appear with the choices "ShowXPointer" and "Document Group."
- Select "Document Group"

A menu will appear with three choices:

1) xmlns.xml
2) Full URL to xmlns.xml
3) Full URL of the current document

Selecting either 1) or 2) will return you to the original document.

The observant will find a number of issues here. For example, the links in the XML Namespaces Recommendation aren't "live." (Reason: the HREF attributes are in uppercase.) I'm sure others will occur to people. These issues can be pursued in future discussion. Meanwhile, I hope a significant number of people will be able to run the demo and see it as another step toward bringing XML content to the Web.

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation
From richard at goon.stg.brown.edu Sat Feb 27 03:11:03 1999
From: richard at goon.stg.brown.edu (Richard Goerwitz)
Date: Mon Jun 7 17:09:31 2004
Subject: Yet another niggling XML syntax question
References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com>
Message-ID: <36D761F7.782F39E7@goon.stg.brown.edu>

roddey@us.ibm.com wrote:

> Does the following violate the 'partial markup in entity' rule of XML?
>
>
> ">
>
> %Whole;

The sentence you are looking for in the standard occurs in section 4.3.2:

    All internal parameter entities are well-formed by definition.

As you gathered, you need the trailing space after !

">
%Whole;
]>

--
Richard Goerwitz
PGP key fingerprint: C1 3E F4 23 7C 33 51 8D 3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.): finger richard@goon.stg.brown.edu
From tomh at thinlink.com Sat Feb 27 16:50:42 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun 7 17:09:31 2004
Subject: Streaming XML and SAX
References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain>
Message-ID: <36D82244.DB014ECE@thinlink.com>

David Megginson wrote:

> Tom Harding writes:
>
> > What bit sequence would you use as a separator and how would you
> > ensure that no conceivable encoding would produce it spuriously?
>
> I'm talking about characters, not bit sequences. For a simple
> solution, you should provide the entire stream in the same character
> encoding (remember that a transport protocol is allowed to override
> the encoding in the XML declaration or encoding declaration).
> Otherwise, the packets will need to be escaped somehow.

How? You would doubtless agree that mandating a specific encoding for all streams sidesteps one of the major benefits of XML. Introducing an encoding declaration mechanism into the transport protocol, as HTTP does, would duplicate the function of the XML processor. But if you have a specific solution that I've missed then I'm sure I can appreciate it.

> No, it still looks like a messy architecture to me, because the
> transport layer has to know about the packets -- it has to parse the
> XML about to get information about what it's looking at, and that adds
> complexity and inefficiency.
> A clean architecture should separate the
> layers completely, and use XML only where it has an obvious advantage
> over other approaches.

It's amazing how two people can see things so differently. I think it's supremely elegant that only the XML processor needs to look at data coming off the wire. It's also as efficient as it gets. Of course the software architecture that handles the documents emitted must be modular and extensible, but the task of parsing is done.

Tom Harding

From david at megginson.com Sat Feb 27 19:03:58 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun 7 17:09:31 2004
Subject: Streaming XML and SAX
In-Reply-To: <36D82244.DB014ECE@thinlink.com>
References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com>
Message-ID: <14040.15519.978518.333845@localhost.localdomain>

Tom Harding writes:

> How? You would doubtless agree that mandating a specific encoding
> for all streams sidesteps one of the major benefits of XML.
> Introducing an encoding declaration mechanism into the transport
> protocol, as HTTP does, would duplicate the function of the XML
> processor.

Here's a short excerpt from the non-normative Appendix F of the XML 1.0 Recommendation:

    The second possible case occurs when the XML entity is accompanied by encoding information, as in some file systems and some network protocols.
    When multiple sources of information are available, their relative priority and the preferred method of handling conflict should be specified as part of the higher-level protocol used to deliver XML. Rules for the relative priority of the internal label and the MIME-type label in an external header, for example, should be part of the RFC document defining the text/xml and application/xml MIME types. In the interests of interoperability, however, the following rules are recommended.

    - If an XML entity is in a file, the Byte-Order Mark and encoding-declaration PI are used (if present) to determine the character encoding. All other heuristics and sources of information are solely for error recovery.

    - If an XML entity is delivered with a MIME type of text/xml, then the charset parameter on the MIME type determines the character encoding method; all other heuristics and sources of information are solely for error recovery.

    - If an XML entity is delivered with a MIME type of application/xml, then the Byte-Order Mark and encoding-declaration PI are used (if present) to determine the character encoding. All other heuristics and sources of information are solely for error recovery.

    These rules apply only in the absence of protocol-level documentation; in particular, when the MIME types text/xml and application/xml are defined, the recommendations of the relevant RFC will supersede these rules.

If I were defining a streaming protocol for e-commerce, news, financial markets, etc., I probably would mandate a single encoding for all packets (UTF-8 or UTF-16), just to keep things simple. As you can see in the above excerpt, the character-set discovery heuristics in XML are intended for use only in the absence of protocol-specific encoding information.

> It's amazing how two people can see things so differently. I think
> it's supremely elegant that only the XML processor needs to look at
> data coming off the wire. It's also as efficient as it gets.
It is efficient only if you know for certain that you need to use a single object model for all of the XML information that you're receiving; otherwise, you'll end up building a generic object model (like a DOM), then tearing it down to build an optimised domain-specific one (such as a vector graphic or a financial-transaction object), and that process would be painful.

> Of course the software architecture that handles the documents emitted
> must be modular and extensible, but the task of parsing is done.

Parsing is relatively easy (though it's wasteful to do it twice); building an object model from the parsing is time- and resource-consuming. For example, imagine that I have a Java class like this:

    public class Purchase {
        public int seqno;
        public int customerId;
        public int vendorId;
        public int invoiceId;
        public float total;
    }

In XML, an instance of this information might look like this:

    12345678
    87654321
    18273645
    81726354
    92674.12

Based on my (limited) understanding of the Java VM, the Java versions of Purchase objects will require 24 bytes of storage each; I'd guess that even a heavily-optimised generic DOM implementation would require at least 5-10 times as much storage (I'll welcome corrections from any DOM implementors on this list).

In other words, if I go straight from the XML to my own object model, I can store 100,000 purchases in 2,400,000 bytes of storage; if I go from XML to a generic DOM object model, I will require between 12,000,000 and 24,000,000 (or more) bytes to store the same information, and then I will *still* have to build my own object model afterwards.

All the best,

David

--
David Megginson david@megginson.com
http://www.megginson.com/
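Going "straight from the XML to my own object model" could look roughly like the sketch below, which fills Purchase objects from SAX events without building a DOM. The element names (purchase, seqno, customerId, vendorId, invoiceId, total) are assumptions that mirror the Java field names, `PurchaseHandler` is a hypothetical name, and the JAXP bootstrap postdates this thread.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class Purchase {
    int seqno, customerId, vendorId, invoiceId;
    float total;
}

/** Builds Purchase objects directly from SAX events; no generic tree is kept. */
class PurchaseHandler extends DefaultHandler {
    final List<Purchase> purchases = new ArrayList<>();
    private Purchase current;                        // purchase being filled, if any
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        text.setLength(0);                           // start collecting this element's text
        if (qName.equals("purchase")) current = new Purchase();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);              // text may arrive in several chunks
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (current == null) return;                 // outside any purchase element
        String value = text.toString().trim();
        switch (qName) {
            case "seqno":      current.seqno = Integer.parseInt(value); break;
            case "customerId": current.customerId = Integer.parseInt(value); break;
            case "vendorId":   current.vendorId = Integer.parseInt(value); break;
            case "invoiceId":  current.invoiceId = Integer.parseInt(value); break;
            case "total":      current.total = Float.parseFloat(value); break;
            case "purchase":   purchases.add(current); current = null; break;
            default: break;                          // ignore unknown children
        }
        text.setLength(0);
    }

    static List<Purchase> parse(String xml) throws Exception {
        PurchaseHandler handler = new PurchaseHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.purchases;
    }
}
```

The only per-purchase allocation that survives the parse is the Purchase object itself, which is the storage argument made above.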

From nate at valleytel.net Sat Feb 27 22:37:24 1999
From: nate at valleytel.net (Nathan Kurz)
Date: Mon Jun 7 17:09:31 2004
Subject: Streaming XML and SAX
Message-ID: <199902272237.QAA10846@trinkpad.valleytel.net>

David Megginson writes:
> No, it still looks like a messy architecture to me, because the
> transport layer has to know about the packets -- it has to parse the
> XML about to get information about what it's looking at, and that adds
> complexity and inefficiency. A clean architecture should separate the
> layers completely, and use XML only where it has an obvious advantage
> over other approaches.
>
> Tom Harding replies:
> > It's amazing how two people can see things so differently. I think
> > it's supremely elegant that only the XML processor needs to look at
> > data coming off the wire. It's also as efficient as it gets.
>
> David Megginson counters:
> It is efficient only if you know for certain that you need to use a
> single object model for all of the XML information that you're
> receiving; otherwise, you'll end up building a generic object model
> (like a DOM), then tearing it down to build an optimised
> domain-specific one (such as a vector graphic or a
> financial-transaction object), and that process would be painful.

Building a DOM every time is inefficient, but I have to agree with
Tom that having XML act as the protocol as well is quite elegant. Why
presume that the XML processor capable of handling the protocol layer
would have to build a _generic_ object model? And why presume that an
XML processor has to build a _single_ object from all the information?
>
>   12345678
>   87654321
>   18273645
>   81726354
>   92674.12
>

It seems like parsers could be made a whole lot more configurable than
they currently are. If more configurable, the top level XML processor
could build the domain-specific objects itself. Continuing with your
example, I can envision a processing model like this:

Parser sees: <purchase>
Checks: Is a 'purchase' parser registered?
  Yes: Pass control to it, 'purchase' parser reads until </purchase>,
       then returns control to top level parser.
  or
  Yes: Slurp text until </purchase>, pass "..." (unparsed) to a
       'purchase' parser running under another thread
  or
  Yes: Slurp text until </purchase> and store it (unparsed) in the DOM
       to be handled on a later pass.
  No:  keep parsing text and adding nodes to the DOM.
  or
  No:  Throw away text (unparsed) up until </purchase>

It would then be up to the subparser to build its own objects which
could be used later. Or the subparser could return an already
processed node to be inserted into the generic object model (or DOM).
Is this model possible with any existing parsers?

> Parsing is relatively easy (though it's wasteful to do it twice);
> building an object model from the parsing is time- and
> resource-consuming.

Building the object model is probably the more expensive part, but in
many cases multiple selective parsing passes (skimming) would be more
efficient than parsing everything completely the first time through.
It seems that all current parsers assume that their duty is always to
create a faithful model of the entire document they are presented
with, and thus parse the entire document in a single pass with a
single thread of control. Why this assumption?

nathan kurz
nate@valleytel.net
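One way to approximate the "is a 'purchase' parser registered?" check
with an ordinary event-based parser is to route parse events rather
than raw text. The sketch below is only that: it covers the first
"Yes" branch (pass control until the end tag, then return it) and the
"throw away" branch for unregistered elements. The underlying parser
still tokenizes every character, so it does not give the unparsed
"slurp" behaviour. RoutingHandler and LeafTextCollector are
hypothetical names, not part of SAX or of any parser mentioned in this
thread.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Top-level handler that hands control to a registered sub-handler. */
class RoutingHandler extends DefaultHandler {
    private final Map<String, ContentHandler> registry = new HashMap<>();
    private ContentHandler active; // sub-handler currently in control, or null
    private int depth;             // open elements passed to the active sub-handler

    void register(String elementName, ContentHandler subHandler) {
        registry.put(elementName, subHandler);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (active == null) {
            active = registry.get(qName); // "Is a parser registered?"
            depth = 0;
        }
        if (active != null) {
            depth++;
            active.startElement(uri, localName, qName, atts);
        }
        // Nothing registered: this sketch simply drops the event.
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        if (active != null) {
            active.characters(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (active != null) {
            active.endElement(uri, localName, qName);
            if (--depth == 0) {
                active = null; // matching end tag: control returns to the top level
            }
        }
    }

    /** Parses a document held in a string through the given router. */
    static void run(String xml, RoutingHandler router) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), router);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}

/** Hypothetical sub-handler: records the non-blank text found before each end tag. */
class LeafTextCollector extends DefaultHandler {
    final List<String> values = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.add(value);
        }
        text.setLength(0);
    }
}
```

Usage would be along the lines of router.register("purchase", new
LeafTextCollector()) followed by RoutingHandler.run(xml, router). The
"keep building the DOM" branch could be had by registering a
DOM-building handler as a fallback instead of dropping events.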

From tomh at thinlink.com Sun Feb 28 03:29:32 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun 7 17:09:31 2004
Subject: Streaming XML and SAX
References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <14040.15519.978518.333845@localhost.localdomain>
Message-ID: <36D8B7E9.66C87AE0@thinlink.com>

David Megginson wrote:
> ...As you
> can see in the above excerpt, the character-set discover heuristics in
> XML are intended for use only in the absence of protocol-specific
> encoding information.

I suspect those lengthy notes were written to explain exactly how
developers were to reconcile the fact that an external way of
declaring the encoding already existed in HTTP, which it would have
been rather unkind to ignore. Tim Bray's annotations to the spec seem
to confirm this. But since we're designing a protocol independent of
HTTP, we ought to let the XML encoding declaration do its job.
> For example, imagine that I have a Java class
> like this:
>
>   public class Purchase {
>     public int seqno;
>     public int customerId;
>     public int vendorId;
>     public int invoiceId;
>     public float total;
>   }
>
> In XML, an instance of this information might look like this:
>
>   12345678
>   87654321
>   18273645
>   81726354
>   92674.12
>
> Based on my (limited) understanding of the Java VM, the Java versions
> of a Purchase objects will require 24 bytes of storage each; I'd guess
> that even a heavily-optimised generic DOM implementation would require
> at least 5-10 times as much storage (I'll welcome corrections from any
> DOM implementors on this list).
>
> In other words, if I go straight from the XML to my own object model,
> I can store 100,000 purchases in 2,400,000 bytes of storage; if I go
> from XML to a generic DOM object model, I will require between
> 12,000,000 and 24,000,000 (or more) bytes to store the same
> information, and then I will *still* have to build my own object model
> afterwards.

Multiplying your numbers by 100,000 is a little gratuitous, since it
would be lousy application design to force all 100,000 objects to be
stored in DOM format at the same time (say, by cramming them all into
some super-document).

I will be the first to admit that it takes resources to parse XML out
to a standard memory representation, but I see no reason why those
resources shouldn't be in line with the work accomplished, which is
mostly converting markup to memory structures.

And actually, you should be comparing it with the storage required by
unparsed XML, not your application object. That's how you would need
to store it if you chopped up the stream into chunks to be passed off
to separate threads or boxes as you suggest.

Tom Harding

From martind at netfolder.com Sun Feb 28 17:04:10 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun 7 17:09:32 2004
Subject: Streaming XML and SAX
In-Reply-To: <199902272237.QAA10846@trinkpad.valleytel.net>
Message-ID:

Hi Nathan,

Building a DOM everytime is inefficient, but I have to agree with Tom
that having XML act as the protocol as well is quite elegant. Why
presume that the XML processor capable of handling the protocol layer
would have to build a _generic_ object model? And why presume that an
XML processor has to build a _single_ object from all the information?

>
>   12345678
>   87654321
>   18273645
>   81726354
>   92674.12
>

It seems like parsers could be made a whole lot more configurable than
they currently are. If more configurable, the top level XML processor
could build the domain-specific objects itself. Continuing with your
example, I can envision a processing model like this:

Parser sees: <purchase>
Checks: Is a 'purchase' parser registered?
  Yes: Pass control to it, 'purchase' parser reads until </purchase>,
       then returns control to top level parser.
  or
  Yes: Slurp text until </purchase>, pass "..." (unparsed) to a
       'purchase' parser running under another thread
  or
  Yes: Slurp text until </purchase> and store it (unparsed) in the DOM
       to be handled on a later pass.
  No:  keep parsing text and adding nodes to the DOM.
  or
  No:  Throw away text (unparsed) up until </purchase>

It would then be up to the subparser to build its own objects which
could be used later. Or the subparser could return an already
processed node to be inserted into the generic object model (or DOM).
Is this model possible with any existing parsers?
This architecture brings more work than required. Another way to do it
would be the following (in fact we are already doing that with our
DSSSL and XSL interpreters):

a) parse the document or the stream
b) an interpreter router checks for certain Gi or Pi. On matching one,
   it loads the appropriate interpreter
c) the interpreter interprets parsed GIs until the end of the document
   (in your example: </purchase>)
d) When the end of the document is reached, the router goes back to
   listen mode for this multiplexed channel (a channel is a multiplexed
   stream within a session) and the interpreter is unloaded

For document based parsing, as usual, we use file protocols. For
streaming parsing, we are using HTTP-NG or MEMUX techniques. MEMUX is a
work in progress but basically, this is multiplexing on a single
session. Because this protocol level takes care of the multiplexing,
the parser does not have to care about mixing streams, and its universe
is only a single stream with documents organized in strict sequence. In
a multiplexed stream all documents are in a row and follow a strict
sequence. However, globally, on a single session, several documents are
sent simultaneously.

Thus, this architecture has several layers:

  interpreters
  -----------
  Interpreter router
  -----------
  SGML/XML parser
  -----------
  MEMUX
  -----------
  Transports

For file based or blob based documents, replace the first two layers by
the file protocol (file, http, ftp, etc...)

An SGML/XML document without an interpreter is like a sleeping beauty
:-). To transform an XML document into something useful, you not only
have to parse it but also to interpret what you will do with each GI.

Actually, because MEMUX is still a moving target, we implemented our
own version of it until we get a consensus around a new spec which
should be the conclusion of the newly created IETF MEMUX workgroup.

Building the object model is probably the more expensive part, but in
many cases multiple selective parsing passes (skimming) would be more
efficient than parsing everything completely the first time through.
It seems that all current parsers assume that their duty is always to
create a faithful model of all of the entire document they are
presented with, and thus parse the entire document in a single pass
with a single thread of control. Why this assumption?

Not all parsers make this assumption :-) In our case, our parser either
does event based processing or builds a grove or a DOM. In fact, for a
DOM-like interface, we prefer a new model we use internally which is
based on generalized property sets. This kind of interface can deal
with either directory service objects or document objects. We merged
both worlds because, when you look at them through the perspective of
property sets, both are very similar. Then, with a property set based
model, an interpreter supports an interface based on the composite
pattern (ref: "Patterns" - Gamma & al.). It can _do_ something with
directory service objects, relational database rows or document
elements alike. This abstraction sets apart the interpretation and the
parsing operations.

What is a property set based API then? Imagine this: a hierarchy of
objects, where each object has a property set attached to it. An object
can contain other objects (i.e. the composite pattern). Thus, if each
object is a collection of objects and each member of the collection is
classified with an associative array (i.e. a map, B+ tree, etc.), then,
because an object can contain another object, you obtain a tree.

A) the object has members to manipulate the objects collection, i.e.
   collection manipulation members like:
     add
     remove
     update
     get/find
     get enumerator

B) a property set is also a collection and the property set interface
   has the same members:
     add
     remove
     update
     get/find
     get enumerator

C) an enumerator can be implemented as:
     next
     previous
     ResetTo

Thus, if this is implemented with object languages or object
middleware (java, DCOM, CORBA, ILU, etc...), an interpreter has just to
get/find the object, enumerate its content and for each object get its
properties. In the case of a document object, one of the properties is
the GI content. For instance, to get a property from the GI we call
property->Get("Content", Content) or, with an interpreted language:
Content = Get("Content").

Remark that with a composite pattern interface we don't need to know in
advance all property names, nor do we have to know every object's type.
Therefore, the interface is more lightweight and we don't have to
create a new interface with each new object. The interface is general
enough to process a lot of whole-part structures as long as each member
can be associated with a name.

If we don't want a property set based interface because of memory
footprint or other constraints, the interpreter is event based and
implements an object event handler which receives a property set
enumerator as parameter, like:

  On_Object(PROPERTYSETENUM propertySetEnum)
  {
    for (.....){
      propertySetEnum.next......
    }
  }

The interpreter just enumerates the property set and does something
with it. Because the interpreter knows only certain keywords, it will
process only these keywords. But all interpreters, being event based in
this case, use the same interface with parsers or with something else
like a directory service, etc...

To replicate DSSSL or XSL mechanisms, each event handler can have the
GI name. For instance, On_Vendor-id corresponds to the element with
that GI, On_Customer likewise, etc...
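The event-based variant just described (a handler per GI, each handed
an enumerator over a property set) might be sketched as follows. This
is an illustration only, not the interface of the DSSSL/XSL
interpreters mentioned in the message: PropertySet and GiInterpreter
are hypothetical names, Java's Iterator stands in for the
next/previous/ResetTo enumerator, and only the "Content" property name
comes from the message.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** A property set: a named collection with add/remove/update/get/enumerator. */
class PropertySet {
    private final Map<String, Object> members = new LinkedHashMap<>();

    void add(String name, Object value)    { members.putIfAbsent(name, value); }
    void remove(String name)               { members.remove(name); }
    void update(String name, Object value) { members.replace(name, value); }
    Object get(String name)                { return members.get(name); }

    /** Enumerates (name, value) pairs in insertion order. */
    Iterator<Map.Entry<String, Object>> enumerator() {
        return members.entrySet().iterator();
    }
}

/** Event-based interpreter: one handler per GI, in the spirit of On_purchase(). */
class GiInterpreter {
    private final Map<String, Consumer<Iterator<Map.Entry<String, Object>>>> handlers =
            new HashMap<>();

    /** Registers the handler to run for elements with the given GI. */
    void on(String gi, Consumer<Iterator<Map.Entry<String, Object>>> handler) {
        handlers.put(gi, handler);
    }

    /**
     * Called by whatever produces the events (a parser, a directory
     * service, ...). Returns true if a handler was registered for the GI;
     * GIs the interpreter does not know are ignored.
     */
    boolean dispatch(String gi, PropertySet properties) {
        Consumer<Iterator<Map.Entry<String, Object>>> handler = handlers.get(gi);
        if (handler == null) {
            return false;
        }
        handler.accept(properties.enumerator());
        return true;
    }
}
```

A handler registered with on("purchase", ...) then plays the role of
On_purchase(): it enumerates whatever properties it is given and acts
only on the names it knows.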
This way, each event handler can process property sets differently.
This last mechanism replicates in a certain way the pattern match
mechanism found in style languages or transformation languages. Thus,
to process your document, we would have:

  function On_purchase()
  {
    // enumerate all properties with enumerator
    // and _do_ something
  }

  function On_Customer()
  {
  }

  etc...

So, not all architectures are primitive :-).

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

From mintert at irb.informatik.uni-dortmund.de Sun Feb 28 17:25:45 1999
From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert)
Date: Mon Jun 7 17:09:32 2004
Subject: W3C spec.dtd
Message-ID: <199902281725.SAA02098@brown.informatik.uni-dortmund.de>

Hi,

there are some technical reports on W3C's website that are marked up
according to a DTD called 'spec.dtd' (that's the system identifier).
Is this DTD publicly available? Does a DSSSL style sheet exist that
renders instances of this DTD to HTML?

I couldn't find answers to these questions on W3C's site.

Thanks in advance!

Bye,
Stefan.

+-----------------------------------------------------------+
Stefan Mintert
UniDo: mintert@irb.informatik.uni-dortmund.de
private: stefan@mintert.com
+-----------------------------------------------------------+
"let the music keep our spirits high..." (Jackson Browne)

From MikeDacon at aol.com Sun Feb 28 18:29:02 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun 7 17:09:32 2004
Subject: XLINK role attribute for simple links?
Message-ID: <81566e8d.36d98a90@aol.com>

Hi Everyone,

I am working my way through the XLink standard (WD 3-Mar-98) and am now
confused about the role attribute. Under "Link semantics" the standard
says you can specify the role attribute to describe the type of
relationship the link describes. However, it qualifies this by saying
that simple links "have an attribute called role that has a different
function, they cannot have a role attribute for link semantics."

Unfortunately, I have not been able to find in the specification the
discussion of the role attribute for a simple link, unless the standard
is talking about the "content-role" attribute, but that of course is a
different attribute.

Also, why use the attribute "role" for link semantics and "role" for
remote resource semantics but then shift gears to "content-role" for
local resource semantics? Wouldn't the role of a remote resource also
be a "content-role"?

One last comment: does anyone else get the feeling that there must be a
better way to describe the combinations of legal attributes for the
various types of links? URLs to tutorials or good papers on this would
be appreciated.

Thanks,

- Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com

From b.laforge at jxml.com Sun Feb 28 18:37:12 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun 7 17:09:32 2004
Subject: Streaming XML and SAX
Message-ID: <004901be6348$9c8af4a0$c9a8a8c0@thing2>

From: Didier PH Martin
>This architecture brings more work than required. An other way to do it
>would be (in fact we are already doing that with our DSSSL,XSL
>interpreters).
>
>a) parse the document or the stream
>b) a interpreter router check for certain Gi or Pi. On matching one, load
>the appropriate interpreter
>c) the interpreter interprets parsed GIs until the end of the document (in
>your example: </purchase>)
>d) When the end of the document is reached, the router goes back to listen
>mode for this multiplexed channel (a channel is a multiplexed stream within
>a session) and the interpreter is unloaded

It seems like something is backwards here! If an application is
processing a series of documents, once it has a universal type name
for that document (root element name + namespace), it knows how it
wants to process the document and doesn't need a Pi. (What's a Gi? Is
that XML?)

Also, you should be able to use the same parser for all document types
and then do the routing on the parse events, saving you from having to
do a "pre-parse" to determine the universal type name.

Bill

From paul at prescod.net Sun Feb 28 20:04:22 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun 7 17:09:32 2004
Subject: W3C-transformation language petition
Message-ID: <36D96F14.FDED8DA1@prescod.net>

Please submit your opinion on the following to:

  http://www.prescod.net/xsl/petition/

I would like to get a large sample of the XML/SGML-using community.
Readers may also redistribute this and repost it wherever they feel it
is appropriate.

----

A proposal for the creation of a W3C-recommended transformation
language

    XSL does not require result trees to use the formatting vocabulary
    and thus can be used for general XML transformations.
        - the XSL specification

Although it is billed as a stylesheet language, the Extensible
Stylesheet Language fits the definition of a transformation language.
As described above, XSL can be used for XML to XML transformations. In
fact, most XSL implementations only support the transformational
feature. This situation is very confusing for many people. Many
potential users want to know exactly what XSL really is.

XSL has this dual nature because of the organization of its
specification. The transformation part of the language is separate
from the part that is specific to stylesheet application. They are
equally important but they are separate. In effect XSL describes
essentially two languages, not one. The first is a transformation
language and the second is a formatting object vocabulary that should
be implemented by all renderers.

There are many people who have legitimate reasons to use the
transformation part independently of the style part.
We expect the transformation language to become an important enabler
of electronic commerce, electronic data interchange and metadata
interchange. We do not feel that the engines that support this
interchange should be considered incomplete XSL engines. Instead we
would like the transformational part of XSL to be specified as a
separate entity.

We believe that this would strengthen both the transformational and
formatting parts of what we now call XSL. The transformation language
could be stronger because conformance could be formally specified and
tested in areas unrelated to stylesheet application. If it is
appropriately named then people looking for a transformation language
would more easily be able to find it. This transformation language
could also be included by reference into other specifications
unrelated to style.

XSL would essentially become the application of the transformation
language to input documents where the result is required to conform to
a formatting object vocabulary. The XSL specification would reference
the transformation language specification and define formatting
objects. Those using XSL for style application could do so in exactly
the same manner that they do today.

The formatting part of XSL would also be strengthened by the fact that
conformance testing would be clearer and simpler. These formatting
objects could also be referenced by other specifications (e.g. CSS3)
and even used on their own as a formatting-based word processor
interchange language.

Due to the current organization of XSL there are many "XSL
implementations" that have nothing to do with formatting. Currently
there is nothing the W3C can do to discourage these
"half-implementations" without also discouraging the use of the
transformation language as a basis for electronic commerce and data
interchange. This muddy definition of "XSL implementation" is
dangerous.
Even when the XSL specification is complete, a browser vendor could
conceivably implement only the transformational part of XSL. When
challenged, they could point to these other half-implementations as
evidence that this was a valid choice to make.

If the languages were separate then it would become clear which
browsers truly implemented XSL and which only implemented the
transformation language. In fact, the W3C could use its copyright to
prevent the name XSL from being applied to implementations that do not
support the formatting objects. We believe that this branding would be
a very effective tool in defending the true purpose of XSL:
interoperable stylesheets.

We do not propose any particular organization of the specification or
specifications. Our requirements are functional:

1. It should be possible to implement only the transformation language
   and have the implementation conform to some W3C named,
   formally-defined language.

2. It must not be possible to create an implementation of a
   W3C-defined language called XSL unless the implementation supports
   formatting objects.

The signatories to this document do not herein propose any change to
the specification-making process. Opinions vary widely on the best way
to create technical specifications. We do all agree that it is the
user community's right to complain when technology creators do not
meet their needs. We invoke that right in the issue of the XSL
language.

----

Please submit your opinion to:

  http://www.prescod.net/xsl/petition/

--
Paul Prescod - ISOGEN Consulting Engineer speaking for only himself
http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come
equipped with adjustable pedals to fit smaller drivers and sensor
devices that warn the driver when he or she is about to back into a
Toyota or some other object." -- Dallas Morning News

From jarle.stabell at dokpro.uio.no Sun Feb 28 20:55:49 1999
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun 7 17:09:32 2004
Subject: W3C spec.dtd
Message-ID: <01BE6366.0E5EF230.jarle.stabell@dokpro.uio.no>

There's a very nice document at:

  http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm

Cheers,
Jarle Stabell

On 28 February 1999 18:26, Stefan Mintert
[SMTP:mintert@irb.informatik.uni-dortmund.de] wrote:
>
> Hi,
>
> there are some technical reports on W3Cs website that are marked up according
> to a DTD called 'spec.dtd' (that's the system identifier). Is this DTD
> publically available?

From bmhughes at ozemail.com.au Sun Feb 28 22:28:01 1999
From: bmhughes at ozemail.com.au (Baden Hughes)
Date: Mon Jun 7 17:09:32 2004
Subject: XML and special Characters : unicode v3.0 ?
Message-ID: <000301be6361$272d2480$5ffa6ccb@baden>

Hi,

I know that XML 1.0 allows you to use 'special' characters as included
in the Unicode 2.0 specification. With the upcoming release of Unicode
3.0, how will we be able to refer to characters in 3.0 which were not
in 2.0?
The same way (meaning the actual version of the Unicode spec is
irrelevant as long as the method used is included in XML) or some new
way?

For instance, the Sinhala character set was not in Unicode 2.0 but
will be in 3.0. How do I get one of those characters into an XML
document? Or is that inconsequential to the document per se, as it is
simply a reference and it's really up to the application to render it
correctly?

Baden