From andrewl at microsoft.com Wed Oct 1 00:12:51 1997
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 16:58:31 2004
Subject: revised Animal-friends implemented as a pattern (Re: XML-Data :advantages over DTD syntax?)
Message-ID: <7BB61B44F197D011892800805FD4F79201800635@RED-03-MSG.dns.microsoft.com>

I have been staying out of this discussion, despite its interest to me, in order to concentrate on the vital issue of namespaces. However, I assure you that there was no attempt in XML-Data to make "any" differ from its current use in DTDs. If it has flaws in that area, I am interested in fixing them.

--Andrew Layman
AndrewL@microsoft.com

> -----Original Message-----
> From: Rick Jelliffe [SMTP:ricko@allette.com.au]
> Sent: Monday, September 29, 1997 9:51 PM
> To: Jonathan Robie
> Cc: xml-dev@ic.ac.uk
> Subject: Re: revised Animal-friends implemented as a pattern (Re:
> XML-Data:advantages over DTD syntax?)
>
> Someone has pointed out that the colonized syntax would be
> appropriate and clearer. Here it is again (sorry!) with
> colons. (I have also cleaned up the inheritance to bundle
> things more, so please delete the previous version.)
>
> Actually, the following fragment is illegal, because
> you cannot use ANY inside a content model. I am not sure how
> to read the XML-data format here, but I think this exposes
> a flaw in their example: if pet can contain any subelements,
> what use is it to say it can also contain a kitten subelement?
> Duplicate paths are a little worrying, if that is what they
> have done.
>
> If it were desired to use ANY in this way (i.e. different
> to how SGML uses it), then it could be coped with by
> parameterising includes and excludes in a similar fashion.
> (Again I can provide an example if needed, but I hope not.)
>
> ----------
> > From: Jonathan Robie
> > To: ricko@allette.com.au
>
> > At 05:02 AM 9/30/97 +1000, Rick Jelliffe wrote:
> >
> > >If you want multiple inheritance, then you can just
> > >define a different suffix, and search through attributes
> > >based on that to collect the inheritance tree. I can
> > >provide an example if anyone is interested.
> >
> > Please!
>
> Here is a version which allows multiple inheritance.
> (Some parenthesis problems fixed too.)
> I have put in even empty attribute values, to make
> the pattern uniform in every case, so please do not
> confuse this simplicity for elaborateness!
>
> To extract the inheritance tree, collect all attributes
> with the ":inherit" suffix. I think the only novel thing
> is that people are not used to wildcard searches on
> attribute names, but this is only prejudice.
>
> Also, I think because some tools require precompiled
> DTDs, there is a general view in some circles that
> DTDs are always compiled, and always made prior
> to the generation of the instance. But that is
> not intrinsic to SGML.
>
> The PATTERN
> -----------
>
> This pattern reserves the suffixes:
>    contents    for a parameter entity with the
>                element type's contents
>    attributes  for a parameter entity with the
>                element type's attributes
>    inherit     for a fixed attribute with the
>                element type's immediate inheritance
>
> The pattern is
>    <!ENTITY % {GI}:contents
>       " {CONTENT-MODEL}
>         {INHERITED-CONTENT-MODELS} ">
>    <!ENTITY % {GI}:attributes
>       " {ATTRIBUTE-DECLARATIONS}
>         {INHERITED-ATTRIBUTE-DECLARATIONS}
>         {GI}:inherit CDATA #FIXED '' ">
>    <!ELEMENT {GI}
>       ( %{GI}:contents; ) >
>    <!ATTLIST {GI}
>       %{GI}:attributes; >
>
> Where the delimiters {} indicate parameters of the template
> which you or your application edit in.
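
The ":inherit" lookup described above can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the original posting: the function names `inheritance_tree` and `ancestors`, the dictionary layout, and the reading of a fixed value as a whitespace-separated list of parent names are all assumptions made for the example.

```python
from typing import Dict, List

INHERIT_SUFFIX = ":inherit"


def inheritance_tree(fixed_attrs: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each element type to its immediate parents.

    `fixed_attrs` maps an element type name to the #FIXED attributes an
    instance of it carries (attribute name -> value). Assumption: a value
    is a whitespace-separated list of parent element type names.
    """
    tree: Dict[str, List[str]] = {}
    for gi, attrs in fixed_attrs.items():
        parents: List[str] = []
        for name, value in attrs.items():
            # The "wildcard search" on attribute names: match by suffix.
            if name.endswith(INHERIT_SUFFIX):
                for parent in value.split():
                    if parent not in parents:
                        parents.append(parent)
        tree[gi] = parents
    return tree


def ancestors(tree: Dict[str, List[str]], gi: str) -> List[str]:
    """Return all direct and indirect parents of `gi`, nearest first."""
    seen: List[str] = []
    queue = list(tree.get(gi, []))
    while queue:
        parent = queue.pop(0)
        if parent not in seen:
            seen.append(parent)
            queue.extend(tree.get(parent, []))
    return seen


# Fixed attributes as they would appear for the animal-friends example:
# cat and dog pick up pet:inherit through the pet attribute entity.
FIXED_ATTRS = {
    "animal-friends": {"animal-friends:inherit": ""},
    "pet": {"pet:inherit": ""},
    "cat": {"pet:inherit": "", "cat:inherit": "pet"},
    "dog": {"pet:inherit": "", "dog:inherit": "pet"},
}

if __name__ == "__main__":
    tree = inheritance_tree(FIXED_ATTRS)
    print(ancestors(tree, "cat"))  # ['pet']
```

Empty fixed values contribute no parents, which is why the uniform "even empty attribute values" convention costs nothing in the lookup.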
>
> The EXAMPLE
> -----------
>
> <!DOCTYPE animal-friends
> [
> <!ENTITY % animal-friends:contents
>    " ( pet | cat | dog )* " >
> <!ENTITY % animal-friends:attributes
>    " animal-friends:inherit CDATA #FIXED '' ">
> <!ELEMENT animal-friends
>    ( %animal-friends:contents; )>
> <!ATTLIST animal-friends
>    %animal-friends:attributes; >
>
> <!ENTITY % pet:contents
>    " ANY " >
> <!ENTITY % pet:attributes
>    " name ID #IMPLIED
>      owner ID #IMPLIED
>      pet:inherit CDATA #FIXED '' " >
> <!ELEMENT pet
>    %pet:contents; >
> <!ATTLIST pet
>    %pet:attributes; >
>
> <!ENTITY % cat:contents
>    " ( %pet:contents;, kittens)? " >
> <!ENTITY % cat:attributes
>    " lives NMTOKEN #IMPLIED
>      %pet:attributes;
>      cat:inherit CDATA #FIXED 'pet' ">
> <!ELEMENT cat
>    ( %cat:contents; ) >
> <!ATTLIST cat
>    %cat:attributes; >
>
> <!ENTITY % dog:contents
>    " ( %pet:contents;, puppies?) " >
> <!ENTITY % dog:attributes
>    " breed CDATA #IMPLIED
>      %pet:attributes;
>      dog:inherit CDATA #FIXED 'pet' ">
> <!ELEMENT dog
>    ( %dog:contents; ) >
> <!ATTLIST dog
>    %dog:attributes; >
> ]>
>
> Please note that I am not saying that this form is always
> preferable to using AFs or XML-data. But it can be done
> in XML as it stands now, keeping valid SGML declarations.
> And, as has been mentioned, there should be interconversion
> possible between the three forms, since they give the
> same information. If XML-data requires the use of specialist
> tools to manipulate, since it is so verbose, then this pattern
> cannot be regarded as excessively verbose either,
> since the same kind of tools can be constructed to simplify
> creating new objects.
>
>
> Rick Jelliffe
>
> xml-dev: A list for W3C XML Developers. To post,
> mailto:xml-dev@ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following
> message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From ricko at allette.com.au Wed Oct 1 00:36:28 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:31 2004
Subject: revised Animal-friends implemented as a pattern (Re: XML-Data:advantages over DTD syntax?)
Message-ID: <199709302240.IAA07282@jawa.chilli.net.au>

I am trying to figure out where the two things fit. It seems to me that the SGML content model system is primarily concerned with describing the future of a document (what can go where). The XML-data schema system seems more concerned with the 'present' of a document. So SGML is concerned with information management, while XML-data is concerned with information retrieval. Does that seem right?

AFs seem to sit in between these two extremes, being able to constrain futures and give more sophisticated labelling for data retrieval.

-ricko

From cbullard at hiwaay.net Wed Oct 1 01:41:21 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:58:31 2004
Subject: XML-Data: advantages over DTD syntax?
References:
Message-ID: <34318DEB.276B@hiwaay.net>

Rob McDougall wrote:
>
> If I remember correctly, the advantages are listed in the spec.
> The main advantage being that you can include the XML-Data definition
> within the XML file itself, so that you now can send a completely
> self-describing file that can be read by a single (XML) parser.
>
> Rob
>

Umm... You can include an SGML Declaration, DTD and instance in a single file and it is a completely self-describing file read by a single parser. I don't think that is the advantage of the Microsoft approach. I think you can do it with a simpler parser; however, the tradeoff is a harder-to-read schema, so a single parser writer (or several) gains a few hours of effort but the user of the schema loses. If that *is* the tradeoff, it isn't much of a bargain.

len bullard
ips

From cbullard at hiwaay.net Wed Oct 1 05:03:04 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:58:31 2004
Subject: XML-Data: advantages over DTD syntax?
References: <199710010000.RAA09002@mehitabel.eng.sun.com>
Message-ID: <3431BCED.3DCA@hiwaay.net>

Murray Altheim wrote:

> I've been (honestly) trying to give this XML markup schema idea a fair shake,
> and while I wouldn't enjoy writing the DTD (based on the examples, I think
> the syntax is verbose and ugly) I do think this does come down to an
> advantage, at least as far as my own XML parser experiments go.

I agree. It isn't that a single parser processes it, it is what can be done with it.

UP FRONT: DTDs or Schemas. I think these are different animals to do different kinds of design. Past experiences with design efforts like HyTime, MID, etc.
left me and still leave me puzzling about what should be defined in the markup language(s) and what is best done in the objects.

> My parser builds a Java Vector array of what I call 'ContentObjects', and
> I've enumerated types for the various types of content objects. Using the
> same parser, I would simply add several more enumerated types (for element
> declaration, attlist declaration, etc.) to the list and let the thing
> attack a DTD'ed document instance. That would be obviously easier than
> writing the parser to understand SGML markup declarations.

Ok. Having never written a parser, I believe you. Creating objects using the XML/SGML markup as an interpreted source for properties is what XML should *standardize* IMO. What would be very interesting to hear are opinions on how much and which parts of the document framework properties should be expressible in the XML, and what parts should be in other notations. For example, we have to look at how scripting is to be done since, despite SGML's resistance to procedural languages in SGML, internal scripting is a part of the modern document instance. Of course, the contenders are ECMAScript and Java (IMHO) because other notations within the framework support those languages as internal nodes.

> But from that point on, figuring out how the document model is structured
> seems pretty much the same, just a different approach on getting to the
> declarations into the Vector array. I don't see any other particular
> advantages to the syntax, and as I said earlier, it seems harder to read
> (to me) and certainly more verbose.

Also agreed. I guess I have problems with the idea of using the instance syntax because I think of that as data (old SGML habit) and I think of a DTD as expressing automata. I understand how they have adapted that as attributes, but I don't like that model. To me, the element types are active. I am comfortable with the current DTD syntax.

> By and large though I agree with you.
> DTDs are hard enough to read now;
> adding all that extra markup cruft seems a step backward. It requires the
> reader to compose the content model in their head based on interpretation
> of the schema markup, which relies a great deal on whitespace (!) IMO.

That's the difference, perhaps. In some sense, the schema approach is an exercise in entity/attribute modeling a la a relational background. A *mythical* SGML designer sees a document from which he is abstracting and to which he is adding structure in terms of Type/attribute(s)OfType that occur at some frequency and in some order. While some have certainly been able to express that relationally, it isn't the conceptual model I prefer for human-digestible text.

That said, if it is enabling a more object-oriented capable syntax, then these are two ways to create markup for the same information. I have no problems with that.

len

From ht at cogsci.ed.ac.uk Wed Oct 1 11:13:16 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:58:31 2004
Subject: XML-Data: advantages over DTD syntax?
In-Reply-To: len bullard's message of Tue, 30 Sep 1997 22:01:01 -0500
References: <199710010000.RAA09002@mehitabel.eng.sun.com> <3431BCED.3DCA@hiwaay.net>
Message-ID: <1029.199710010913@grogan.cogsci.ed.ac.uk>

Without responding to the details of Murray and Len's exchange, two points:

1) Murray is of course right that the logic of content models is not made any easier by changing their notation;

2) Complexity is in the eye of the beholder, and the advantage to the user of being able to use the SAME graphical UI to construct both instance and schema might be taken to operate in favour of the schema approach in this area.

ht

----
Note: I am not now nor have I ever been a Microsoft employee, nor is Steve De Rose. Microsoft paid for my trip to Redmond during which the foundations for the XML-Data document were laid, but they don't own or operate me.

I've learned by working with our co-authors at Microsoft and their colleagues that there are people in Redmond who lack horns and are both technically competent and genuinely interested in standards.

Accordingly, I'd be grateful if those commenting on this document could avoid the cheap rhetorical device of referring to "the Microsoft approach", implying thereby that it shouldn't be taken seriously because we all know that Microsoft are {only in it for the money, aiming at world domination, fundamentally duplicitous, . . .}, and concentrate on technical issues.

ht

From jwrobie at mindspring.com Wed Oct 1 13:27:44 1997
From: jwrobie at mindspring.com (Jonathan Robie)
Date: Mon Jun 7 16:58:31 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <1.5.4.32.19971031122700.009dcdb4@pop.mindspring.com>

At 10:13 AM 10/1/97 BST, Henry S. Thompson wrote:
>Without responding to the details of Murray and Len's exchange, two
>points:
>
>1) Murray is of course right that the logic of content models is not
>made any easier by changing their notation;
>
>2) Complexity is in the eye of the beholder, and the advantage to the
>user of being able to use the SAME graphical UI to construct both
>instance and schema might be taken to operate in favour of the schema
>approach in this area.

Of course, if such an editor were designed using a Design Patterns approach, with a common abstract base class for the methods used to construct a DTD or a document, you could have the same thing. However, I'm not sure that I would *want* to construct the XML-Data schemas the same way I create XML documents - it looks like there would still be a fair amount of typing, more than I need to create a DTD.

Of course, it *would* be nice to have graphical tools for creating DTDs which explicitly show the inheritance relationships. But I don't see how an alternative syntax helps me to create such tools.

>Note: I am not now nor have I ever been a Microsoft employee, nor is
>Steve De Rose. Microsoft paid for my trip to Redmond during which
>the foundations for the XML-Data document were laid, but they don't
>own or operate me.
>
>I've learned by working with our co-authors at Microsoft and their
>colleagues that there are people in Redmond who lack horns and are
>both technically competent and genuinely interested in standards.

I do not believe that Microsoft owns you or Steve De Rose, nor that Microsoft employees are inherently evil.
>Accordingly, I'd be grateful if those commenting on this document
>could avoid the cheap rhetorical device of referring to "the Microsoft
>approach", implying thereby that it shouldn't be taken seriously
>because we all know that Microsoft are {only in it for the money,
>aiming at world domination, fundamentally duplicitous, . . .}, and
>concentrate on technical issues.

I find it interesting how much you read into the phrase "the Microsoft approach" - after all, it *is* an approach advocated by Microsoft, as opposed to the approach which has been accepted by standardization committees. I really do see great value in having a standard approach, and one which is supported by tools like SP, Jade, etc. The standardization committees are there to ensure these standard approaches.

Now if I look at the XML White Paper from Microsoft's "Standards" page for information on XML, it does not even mention the industry-standard DTDs; instead, it tells me:

"Microsoft has proposed a 'Document Type Definition' (DTD) syntax for expressing the schema for an XML document directly within XML itself, allowing XML data to describe its own structure. Expressing schemata within XML adds great power to the XML format because it makes it possible for software examining certain data to understand its structure without earlier knowledge about the data or its meaning."

To me, it looks as though they are championing this as an alternative to DTDs. In fact, someone who did not already know about DTDs would have no idea that they exist in XML, or that the term 'Document Type Definition' was not invented by Microsoft.

I know that you do not represent Microsoft, but I do think that Microsoft's marketing materials indicate a desire to support XML-Data instead of DTDs, and I am concerned about the possibility of a split in the industry similar to the Java wars that we are now experiencing.
I would like to see a lot of the functionality of Architectural Forms or XML-Data in SGML/XML, and I would like to see it supported by standards. Does XML-Data do anything that XML with HyTime would not do? Is it only an alternative syntax?

Jonathan

***************************************************************************
Jonathan Robie jwrobie@mindspring.com http://www.mindspring.com/~jwrobie
POET Software, 3207 Gibson Road, Durham, N.C., 27703 http://www.poet.com
***************************************************************************

From ricko at allette.com.au Wed Oct 1 13:30:05 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:31 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <199710011134.VAA01551@jawa.chilli.net.au>

> From: len bullard

> That said, if it is enabling a more object-oriented capable
> syntax, then these are two ways to create markup for the
> same information. I have no problems with that.

One difference is also that XML does now include DTD declarations & PEs. These will not go away from XML. So any XML-data software that is also actually real XML *must* be more complicated than XML software that uses just the template method. So I think the argument for XML-data to ease the poor programmer's lot is, to a certain extent, based on a false choice.

The template method may be a little ugly (though to me it is less ugly than XML-data, but that is purely a matter of familiarity), but such ugliness can be dealt with by building nice DTD management interfaces.
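
As a rough illustration of the kind of tool meant here, the following Python sketch generates the template-method declarations for one element type from a few parameters. It is a minimal sketch under stated assumptions, not an implementation from this thread: the function name `declare` and its parameters are hypothetical, the four-declaration layout (`:contents` entity, `:attributes` entity, element declaration, attribute-list declaration) is one reading of the pattern with the `:contents`, `:attributes` and `:inherit` suffixes, and the caller is expected to write any inherited content model into `content_model` by hand.

```python
from typing import Sequence


def declare(gi: str, content_model: str, attributes: str = "",
            parents: Sequence[str] = ()) -> str:
    """Return the template's four declarations for element type `gi`."""
    # Attribute part: own declarations, then each parent's attribute
    # entity, then the fixed {GI}:inherit attribute naming the parents.
    attr_parts = [attributes] if attributes else []
    attr_parts += [f"%{p}:attributes;" for p in parents]
    parent_names = " ".join(parents)
    attr_parts.append(f"{gi}:inherit CDATA #FIXED '{parent_names}'")
    attrs = "\n     ".join(attr_parts)

    # ANY and EMPTY are keywords rather than content models, so they
    # must not be wrapped in parentheses in the element declaration.
    if content_model.strip() in ("ANY", "EMPTY"):
        element_body = f"%{gi}:contents;"
    else:
        element_body = f"( %{gi}:contents; )"

    return "\n".join([
        f'<!ENTITY % {gi}:contents\n   " {content_model} ">',
        f'<!ENTITY % {gi}:attributes\n   " {attrs} ">',
        f"<!ELEMENT {gi} {element_body} >",
        f"<!ATTLIST {gi} %{gi}:attributes; >",
    ])


if __name__ == "__main__":
    # "puppies?" is a simplified, hypothetical content model for dog.
    print(declare("pet", "ANY", "name ID #IMPLIED\n     owner ID #IMPLIED"))
    print(declare("dog", "puppies?", "breed CDATA #IMPLIED", parents=["pet"]))
```

Because XML only permits parameter entity references inside declarations in the external subset, output of this shape is meant for an external DTD file rather than an internal subset.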
Many of SGML's complications are because it has so many features to make data entry look nicer, but which do not improve the expressiveness of the SGML: minimisation, short-references, datatag, rank, etc. I tend to think XML-data is in fact the same kind of thing: it makes some things easier to type but does not improve the expressiveness of XML.

I want to see things that XML-data can do that SGML cannot now do (I am certainly not ruling it out--indeed I welcome it, it is an exciting prospect), either by templates or by architectural forms. (But, of course, I anticipate any great discoveries in this area will find their way into standard XML/SGML syntax sooner or later.)

All three forms have three stages:

STAGE templates AF XML-data object (text template) Already in SGML element primitive AF system types types schema SGML element types SGML element SGML elements definition types schema SGML elements SGML elements XML-data on-the-fly elements elements

By the way, I also want to note that the Japanese Hiyama and Tsuchiya also have a paper on these kinds of issues, which includes an interesting method for restricting possible values of inherited attributes as well.

Rick Jelliffe

From papresco at csclub.uwaterloo.ca Wed Oct 1 14:31:23 1997
From: papresco at csclub.uwaterloo.ca (Paul Prescod)
Date: Mon Jun 7 16:58:32 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <199710011231.IAA00207@calum.csclub.uwaterloo.ca>

Henry S.
Thompson wrote:
> 2) Complexity is in the eye of the beholder, and the advantage to the
> user of being able to use the SAME graphical UI to construct both
> instance and schema might be taken to operate in favour of the schema
> approach in this area.

User interface designers should know how to present data from widely varying sources in a uniform manner. That is why the same class browser (e.g.) can work for C++ and Java, and the same file browser can work for local files and for file systems accessed via FTP (which are syntactically very different from local function calls!). Standard wordprocessors can wordprocess files in a variety of different formats.

Valid reasons for standardizing on a single syntax for any of these things are:

#1. to protect the user's data from control by a single vendor (obviously not an issue in this case)
#2. to reduce the workload for implementors (which I suspect is the real issue in this case) and
#3. to reduce the cost of entry to the market (another issue in this case).

In other words, as Len has pointed out, the DTDs-as-instances efforts are primarily focused on making life easier for vendors, perhaps at the cost of simplicity for users. If this results in more tools, then it might be a net win, but we don't know yet.

Of course particular DTDs-as-instances proposals may have other benefits, but those benefits could as easily be added to DTD syntax as to some new syntax.

Paul Prescod

From ht at cogsci.ed.ac.uk Wed Oct 1 16:50:01 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:58:32 2004
Subject: Animal-friends implemented as a pattern (Re: XML-Data:advantages over DTD syntax?)
In-Reply-To: C M Sperberg-McQueen's message of Tue, 30 Sep 1997 17:17:28 -0500
References: <199709302217.RAA119476@tigger.cc.uic.edu>
Message-ID: <1176.199710011449@grogan.cogsci.ed.ac.uk>

Michael said:

> > " ( pet | cat | dog %extra-animal-friends-content )* "
> >
> >so that you could do
> >
> >
> >[Note this is not valid XML, I don't think]
>
> I believe it is valid XML, at least according to the spec of early
> August. The internal subset cannot have parameter entity references,
> but there's no rule against parameter entity declarations in the
> internal subset.

Michael is absolutely correct -- I was confusing ENTITY definitions with CONTENT model definitions. One of the non-obvious consequences of the current (8879-inherited) situation is that the above is OK, but the following isn't:

ht

From papresco at csclub.uwaterloo.ca Wed Oct 1 17:08:41 1997
From: papresco at csclub.uwaterloo.ca (Paul Prescod)
Date: Mon Jun 7 16:58:32 2004
Subject: XML-Data: advantages over DTD syntax?
In-Reply-To: <199710011443.JAA40502@tigger.cc.uic.edu> from "C M Sperberg-McQueen" at Oct 1, 97 09:43:44 am
Message-ID: <199710011508.LAA06081@calum.csclub.uwaterloo.ca>

> From your mouth to the vendors' ears. If you ever see an SGML
> editor that supports DTD editing with more than an ASCII editor and
> a help screen saying "Gee, you're on your own here, kid", then let
> me know.
> The only DTD editors I know of are specialized tools which
> do not provide any document-instance interface at all.

That's because no user interface designer with a modicum of pride would argue that documents and instances *should* use the same graphical interface. This is in fact a huge step backwards. Editor user interfaces should be getting more and more customized towards particular SGML applications, not more and more generalized, trying to be all things to all people. We could encode programming languages and user interfaces in SGML too, but I'm not going to give up my UI builder or IDE!

> Well, my reason for thinking it makes sense is not any of the
> above. I tell people every day that SGML is a good syntax for
> structured information of all kinds. There are some who seem to do
> this just for laughs -- me, I believe it.

Do you think SGML is a good syntax for Scheme programs? For C++ programs? For regular expressions? For BNF grammars? I don't -- not by a long shot. Trying to shove everything into SGML may be convenient from a processing point of view, but it would be a nightmare for those that must work with the actual textual representation of those formats.

> So it seems natural to me
> to ask why I shouldn't use SGML syntax for structured data about
> document types. The answers I've heard so far (because 8879 blessed
> this other syntax; because SGML isn't that good after all; because it
> is heresy) are all laughable.

SGML *is* that good after all -- for what it was designed to do. It is NOT that good for washing windows, fixing cars or describing content models. Maybe SGML should be extended so that it is friendly for representing arbitrary non-document data, but right now, it is pretty much a crap shoot whether a particular non-document domain will find a convenient representation in SGML. If you don't intend to edit the data directly, then it doesn't matter, but if you do, then it certainly does.

> Yes.
> And it seems far more likely that we'll get some useful
> new thinking about document grammar and validation if we do the
> preliminary thinking in a notation that doesn't steer our minds
> straight back to clause 11 of 8879.

That's true. And perhaps once the ideas are hashed out they can be re-expressed in a format that is type-able (SGML DTD syntax or not). SGML *is* very convenient for prototyping formats that you have not written a parser for yet.

Paul Prescod

From jarle.stabell at dokpro.uio.no Wed Oct 1 19:02:37 1997
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun 7 16:58:32 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
Message-ID: <01BCCE9C.8EF45AD0@xyplex39.uio.no>

Paul Prescod wrote:
-------
"In other words, as Len has pointed out, the DTDs-as-instances efforts are primarily focused on making life easier for vendors, perhaps at the cost of simplicity for users. If this results in more tools, then it might be a net win, but we don't know yet."

I think the DTDs-as-instances also benefits new users, why should they have to learn two syntaxes instead of one?

One of the first questions which entered my mind when seeing a DTD for the first time was: "Why didn't they code this using SGML, why a completely separate syntax for this?". (Persons who already know SGML may of course object to DTD-syntax not being SGML-syntax, but I think this will be even more the "feeling" among XML users)

I assume most people today won't edit DTDs (either today's version or XML-Data or similar versions) in the "raw" text format.
They will of course use tools, visualizing the hierarchy (XML-Data's extends/implements), selecting values from comboboxes etc.

I think some advanced functionality is very difficult with today's DTDs, as (to my very limited SGML-knowledge) many things are "simulated/hacked" by using parameter entities. I think this parameter entity (macro) approach is much less "semantic", and is much more difficult for a tool to handle.

Mr. Prescod also wrote:
-------
"Of course particular DTDs-as-instances proposals may have other benefits, but those benefits could as easily be added to DTD syntax as to some new syntax."

Adding new constructions to DTD syntax would force parser builders to update the "lower parts" of their parsers/lexers, but in a DTD-as-instances version the upgrade would only affect the "semantic" part of the engine.

And more importantly, it would be easier to communicate to users that "now this (DTD)element has gotten this new attribute, which means X etc", instead of having to introduce the new syntax for DTD-encoding and then explaining its semantics. (This is why we like SGML/XML in the first place, not needing to use more or less nonstandard syntactic encodings)

I don't view XML-Data as the new syntax, quite the opposite, this I find completely "XML syntax". I view the DTD syntax as another "non-XML" syntax (although this is of course technically incorrect according to the draft).

A few XML wishes:

1. Please incorporate the tag, it would take a parser-writer 5 minutes to implement it, as well as save bandwidth, diskspace, typing and in some cases ease reading. (It could also be used to write hard-to-understand/maintain documents, but that's up to the user)

2. Allow non-quoted attribute values. I guess support for this is also a 5 minutes project for the parser-writer.

3. Add a paragraph to the XML standard document explaining why character references should be resolved before storing the string as the value of the entity.
Is it to allow useful tricks like the example in the "C. Expansion of Entity and Character References" section, as well as making it rocket science to use character references in entity declarations? (or only for compatibility with SGML?)

I don't know if this is "theoretically" possible, but it could save weeks of implementation time if all entity declarations could be parsed locally, without forcing expansion and reparsing at every occurrence. If there are no essential idioms getting lost in such a simplification, I definitely think such a simpler model would make life simpler for end-users as well, not only for the software vendors. (Perhaps people wouldn't need to play parsers (and know many detailed rules) to read/write/debug their documents?)

Cheers,
Jarle Stabell

From papresco at csclub.uwaterloo.ca Wed Oct 1 19:35:41 1997
From: papresco at csclub.uwaterloo.ca (Paul Prescod)
Date: Mon Jun 7 16:58:32 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
In-Reply-To: <01BCCE9C.8EF45AD0@xyplex39.uio.no> from "Jarle Stabell" at Oct 1, 97 07:02:18 pm
Message-ID: <199710011735.NAA13283@calum.csclub.uwaterloo.ca>

> I think the DTDs-as-instances also benefits new users, why should they
> have to learn two syntaxes instead of one?

Because if the two syntaxes are well chosen for their particular data types then learning two will be faster than learning one. Let me suggest an analogy: "Why should I learn these mathematical symbols when this can all be described in English?" Because you only benefit from the English description *in the very short term*.
> One of the first questions which entered my mind when seeing a DTD for
> the first time was:
> "Why didn't they code this using SGML, why a completely separate syntax
> for this?".

I asked this myself. It is a natural question. I also asked this the first time I saw a regular expression. "Why does this bit of code look so different from the rest of the program?" It turns out that there was a good reason. I also asked that the first time I saw a functional language. It turns out that there was a good reason there too. We should fight forever to avoid difference for difference's sake, but not difference for readability's sake.

> I assume most people today won't edit DTDs (either today's version or
> XML-Data or similar versions) in the "raw" text format. They will of
> course use tools, visualizing the hierarchy (XML-Data's
> extends/implements), selecting values from comboboxes etc.

If this is the case then the syntax is irrelevant for those people and they are thus not relevant to the discussion of syntax.

> I think some advanced functionality is very difficult with todays DTD's,
> as (to my very limited SGML-knowledge) many things are
> "simulated/hacked" by using parameter entities.
> I think this parameter entity (macro) approach is much less "semantic",
> and is much more difficult for a tool to handle.

No argument. But now you are complaining about the features available in DTDs and not about the syntax. You must invent the features either way.

> Mr. Prescod also wrote:
> -------
> "Of course particular DTDs-as-instances proposals may have other
> benefits,
> but those benefits could as easily be added to DTD syntax as to some
> new syntax."
>
> Adding new constructions to DTD syntax would force parser builders to
> update the "lower parts" of their parsers/lexers, but in a
> DTD-as-instances version the upgrade would only affect the "semantic"
> part of the engine.
I haven't disputed the argument that DTDs as instances make life easier for implementors. I just find specious the argument that it makes life directly easier for users. It will only (perhaps) make users' lives easier if more implementors adopt it because it makes their lives easier. But I do not find it better pedagogically or "type-o-graphically". I am already dreading my future classes when I will have to teach this to people who are already inclined to confuse levels (mix up elements and element types etc.)

> And more importantly, it would be easier to communicate to users that
> "now this (DTD)element has gotten this new attribute, which means X
> etc", instead of having to introduce the new syntax for DTD-encoding and
> then explaining it's semantics. (This is why we like SGML/XML in the
> first place, not needing to use more or less unstandard syntactic
> encodings)

And now we must explain to users that when we say "this element has got a new attribute", we don't actually mean *this element*, but the element type described by this element.

Paul Prescod

From jarle.stabell at dokpro.uio.no Wed Oct 1 22:23:46 1997
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun 7 16:58:32 2004
Subject: XML Wishes (</>, quotes and entity resolution)
Message-ID: <01BCCEB8.B4745B40@xyplex39.uio.no>

Jarle Stabell wrote:

> 1. Please incorporate the </> tag, it would take a parser-writer 5 minutes to implement it, as well as save bandwidth, diskspace, typing and in some cases ease reading.
> (It could also be used to write hard-to-understand/maintain documents, but that's up to the user)

Murray Altheim writes:

As both a document type designer, a parser writer, and a document author, I think one of the main advantages to XML is the requirement of explicitly-named end tags.

[JS] Agree.

[MA] The save-typing argument is moot in that most people will probably not hand-edit tags.

[JS] Maybe. But I know I will. Therefore I would like it. :-) I really think Doe John are faster/easier to read than Doe John and I keep seeing lots of things like this. Having this possibility would perhaps also prevent people from using "cryptic" abbreviations as element type names/IDs. I agree that closing an element having subelements with a </> would be a "bad thing" for a document writer to do.

[MA] For those that do, having the explicit end tags is probably a Very Good Thing, in that it saves confusion. And while it maybe only takes '5 minutes' (NOTHING takes five minutes) to add in a parser, suddenly a simple parser must build a document tree in order to know which element is being closed by '</>', which makes simple parsers into much more complicated ones. This is not a benefit.

[JS] OK, I didn't think of the possibility of anyone building XML parsers without building the document tree. (I won't disclose any estimate for building the document tree... :-) )

> [JS] 2. Allow non-quoted attribute values. I guess support for this is also a 5 minutes project for the parser-writer.

[MA] We're up to ten minutes. Actually, this makes the parser more complicated, since knowing that attribute values are delimited allows a simple 'scan-literal' approach, i.e., if the first character after the equals sign is a single quote, one scans to the next single quote. If a double, scan to the next double. If they are optional, things get much more complicated, and we now must care about what type of characters are in the content of the literal.
Options and minimization features generally add a lot of work for parser writers.

[JS] I think the complexity this adds for the parser writers is negligible; it's a very local thing, typically confined to a single method/routine. If having the possibility of omitting the quotes would benefit users, perhaps by making it more SGML compatible, I definitely think one should allow this. I've already seen documents on the web stated as being XML documents without the quotes. If some parsers allow it (I don't know!), then the other parsers would seem unnecessarily "stubborn" from a user's perspective.

> [JS] 3. Add a paragraph to the XML standard document explaining why character references should be resolved before storing the string as the value of the entity.

[MA] I believe we would lose an enormous amount of expressive power and put unnecessary restrictions.

[JS] This may very well be true. I'm not an SGML expert. I'd love to see an example of this. I think a good example of this would make XML parser writers much more motivated when implementing it! :-)

[MA] Recursive entity resolution is not programmatically that much extra work

[JS] Perhaps not the resolution itself. But making it possible to give the user good error messages (and displaying the location(s) where the error takes place) is, I assume, quite a lot of work. Perhaps not so much the direct coding as coming up with the necessary architecture/design. I also think this simpler model would make for simpler APIs for tool builders, at least for tools needing to have info about where entities were invoked in the original document. (For instance, tools which update/synchronize documents need this info, in order not to "flatten it out".)

[MA] and allows for various important SGML facilities. And remember that one of the explicit goals for XML is SGML compatibility.

[JS] Yes. But it would be very sad if this made XML substantially more complex (without any other benefit than compatibility), both for users and tool vendors.
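
To make the quoted-versus-unquoted point above concrete, here is a minimal sketch in Python. It is only an illustration, not code from any existing XML parser: the function name scan_attribute_value is made up, and the rule used for where an unquoted value ends (whitespace or '>') is an assumption, since the thread does not say what the terminators would be. The quoted branch is the 'scan-literal' approach described by [MA]; the unquoted branch shows the extra decisions about character classes that optional quotes bring in.

```python
def scan_attribute_value(text, pos):
    """Scan one attribute value starting at text[pos], just after '='.

    Returns (value, next_pos). The unquoted branch is a hypothetical
    extension; its terminator rule is an assumption for illustration.
    """
    if pos >= len(text):
        raise ValueError("attribute value expected at end of input")

    quote = text[pos]
    if quote in ('"', "'"):
        # Scan-literal: jump straight to the matching closing quote.
        end = text.find(quote, pos + 1)
        if end == -1:
            raise ValueError("unterminated attribute literal")
        return text[pos + 1:end], end + 1

    # Unquoted value: the scanner now has to look at every character
    # and decide whether it may appear inside the value.
    end = pos
    while end < len(text):
        ch = text[end]
        if ch.isspace() or ch == ">":
            break
        if ch in "\"'<=":
            raise ValueError("character %r not allowed in unquoted value" % ch)
        end += 1
    if end == pos:
        raise ValueError("empty unquoted attribute value")
    return text[pos:end], end
```

Both positions can be argued from this sketch: the change is local to one routine, as [JS] says, but the unquoted branch needs a rule for every character, which is the complication [MA] describes.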
Cheers,
Jarle Stabell

From dak at sq.com Wed Oct 1 23:54:47 1997
From: dak at sq.com (dak@sq.com)
Date: Mon Jun 7 16:58:32 2004
Subject: XML Wishes (</>, quotes and entity resolution)
In-Reply-To: Your message of "Wed, 01 Oct 1997 22:23:37 +0200." <01BCCEB8.B4745B40@xyplex39.uio.no>
Message-ID:

| [MA] For those that do, having the explicit end tags is probably a
| Very Good Thing, in that it saves confusion. And while it maybe only
| takes '5 minutes' (NOTHING takes five minutes) to add in a parser,
| suddenly a simple parser must build a document tree in order to know
| which element is being closed by '</>', which makes simple parsers
| into much more complicated ones. This is not a benefit.
|
| [JS] Ok, I didn't think of the possibility of anyone building XML
| parsers without building the document tree. (I won't disclose any
| estimate for building the document tree...:-) )

Hmm. It seems to me that this only requires a stack, not the full tree, but... it is highly error prone, and having Yet Another Variant Form isn't a very good thing.

Languages need the notion of helpers for "logical parenthesis" matching, and preferably some redundancy, in order to help the user get this right, and to help the processing system detect when it is wrong. (Indentation is another possibility, but that isn't available here.)

Best,
Dak
--
David A. 'Dak' Keldsen
Software Development Manager
SoftQuad Inc.

From mrc at allette.com.au Thu Oct 2 01:34:40 1997
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 16:58:32 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
References: <01BCCE9C.8EF45AD0@xyplex39.uio.no>
Message-ID: <3432DDE7.48DEC226@allette.com.au>

Jarle Stabell wrote:

> I assume most people today won't edit DTDs (either today's version or XML-Data or similar versions) in the "raw" text format. They will of course use tools, visualizing the hierarchy (XML-Data's extends/implements), selecting values from comboboxes etc.

Not me. Data analysis takes ten times as long as typing up the DTD. Anyone using a methodical approach to DTD design will tell you that typing it up is only the culmination of a much more extensive process involving much diagramming and arguing with colleagues. I'm not a fan of the tools - I don't want to construct a letter by picking sentences out of a word processor either.

> I don't view XML-Data as the new syntax, quite the opposite, this I find completely "XML syntax". I view the DTD syntax as another "non-XML" syntax (although this is of course technically uncorrect according to the draft).

Although the syntax may not be new, the imposition or reservation of element types to describe the schema is a long way from what XML is trying to achieve. Also, let's face facts, this is intended as a replacement for DTDs, not a complementary mechanism. The logistics of parsing with either and/or both structures, as well as the overhead involved with describing the schema as an optional element in the DTD (presumably this would be required?), would quickly put everyone off.
I'm not suggesting that there's no requirement for extension to what we currently have, but if as you say, people will use pretty tools to create DTDs, I think the method outlined by Rick Jelliffe is preferable and will presumably be invisible. I don't think an XML syntax that doesn't even fit properly with XML design goals is the answer.

--
Regards

Marcus Carr                      email: mrc@allette.com.au
_______________________________________________________________
Allette Systems (Australia)      email: info@allette.com.au
Level 10, 91 York Street         www: http://www.allette.com.au
Sydney 2000 NSW Australia        phone: +61 2 9262 4777
                                 fax: +61 2 9262 4774
_______________________________________________________________

From peter at ursus.demon.co.uk Thu Oct 2 01:38:35 1997
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:32 2004
Subject: AFs and the DPH (Was Animal Friends, etc.)
In-Reply-To: <1.5.4.32.19970930111016.009ead94@pop.mindspring.com>
Message-ID: <3.0.1.16.19971002003147.2dcf0a06@pop3.demon.co.uk>

At 07:10 30/09/97 -0400, Jonathan Robie wrote:
>At 09:35 AM 9/30/97 BST, Henry S. Thompson wrote:
>
>So now we have all the players!
>

Just one more player, please - the DPH (desperate Perl Hacker)... whom I shall try to represent :-) [For those not on xml-sig, this mythical character is assumed to be an eager customer for XML, and ignorant of SGML. Everyone on XML-SIG used to agree that XML must be accessible to the DPH - I hope that's still true. If it's not - and that XML is a commercial-only enterprise - then I'll rethink things.]
As a hacker I know very little about AFs, but have been trying to find out more from the SG/XML community. Many people (e.g. Eliot Kimber and James Clark) have been very helpful but I suffer from not having:
 - a beginner's introduction to AFs on the WWW
 - simple free working software to see how they work
[If I'm wrong in this, I'll be delighted.]

It is clear that AFs have passionate supporters in the *ML community, but their message isn't easily spread beyond it. My summary, gleaned over the last year, goes something like this:
 - AFs allow (partial) mapping of one data structure (in SGML) to another. There are restrictions on mapping inconsistent content models, for example.
 - among the benefits of AFs are that you can manage different 'namespaces' and can alias elements or attributes
 - AFs are defined in standard SGML and parsers parse them 'without realising what they are'
 - an AF expert can do very elegant things with AFs.

However, all the benefits from AFs have to be realised by having an AF-aware processor (I differentiate parser from processor - an ESIS stream could be input into an AF-aware processor). My understanding is that:
 - there are no freely available AF processors
 - generic AF processors are beyond the ability (or at least the time) for a DPH to write from scratch

Therefore AFs are only available to largish groups with time and/or money... This rules out the DPH.

I have the impression that AFs play the same sort of meta-role as interfaces in Java. They organise things for you but don't actually write any code for you! So a pre-requisite is an AF-engine.

These general problems come up with the XLL spec, where there is mapping of attributes and elements. Maybe an AF engine makes implementing XLL easier - certainly there is quite a lot of implied architecture in the spec.
I have consistently been asking whether it makes sense to build some generic processing tools for XLL but the feedback that I seem to have got is that this is application-dependent (i.e. different processors need to be written for different DTDs). If so, this is a major deterrent to the use of XLL and AFs.

So - is this analysis anywhere near true? And if so, is the XML community going to develop freely available tools for this type of requirement :-)?

P.

Peter Murray-Rust, Director VSMS, domestic net connection
Virtual Hyperglossary specialities

From jwrobie at mindspring.com Thu Oct 2 02:18:28 1997
From: jwrobie at mindspring.com (Jonathan Robie)
Date: Mon Jun 7 16:58:32 2004
Subject: AFs and the DPH (Was Animal Friends, etc.)
Message-ID: <1.5.4.32.19971002001807.00ac7cb0@pop.mindspring.com>

At 12:31 AM 10/2/97, Peter Murray-Rust wrote:
>At 07:10 30/09/97 -0400, Jonathan Robie wrote:
>>At 09:35 AM 9/30/97 BST, Henry S. Thompson wrote:
>>
>>So now we have all the players!

>Many people (e.g. Eliot Kimber and James
>Clark) have been very helpful but I suffer from not having:
> - a beginner's introduction to AFs on the WWW

Yes, I have found a few things on the web, but nothing as clear and well-written as Microsoft's paper on XML-Data. If someone would just follow the organization of Microsoft's paper to present the same information on Architectural Forms, that would be a great way to demonstrate that Architectural Forms have the same power, and would make it much easier to make fair comparisons.

> - simple free working software to see how they work
>[If I'm wrong in this, I'll be delighted.]
Well, SP supports AFs, and you can use an option (I think the -A option) to strip out the part of a document that belongs to a particular architectural form. I'm not sure what other software does.

> - AFs allow (partial) mapping of one data structure (in SGML) to another.
>There are restrictions on mapping inconsistent content models, for example.
> - among the benefits of AFs are that you can manage different 'namespaces'
>and can alias elements or attributes
> - AFs are defined in standard SGML and parsers parse them 'without
>realising what they are'
> - an AF expert can do very elegant things with AFs.

One cool thing about AFs is that if you parse documents created with them into an SGML repository, using fixed attributes to identify the forms, then it is easy to do database searches for the information that your particular group is interested in by searching on the attributes. This really impressed me at the Kona mixer.

>However, all the benefits from AFs have to be realised by having an
>AF-aware processor (I differentiate parser from processor - an ESIS stream
>could be input into an AF-aware processor). My understanding is that:
> - there are no freely available AF processors
> - generic AF processors are beyond the ability (or at least the time) for
>a DPH to write from scratch
>
>Therefore AFs are only available to largish groups with time and/or
>money... This rules out the DPH.

I suspect the DPH could do some interesting things to support queries for information contained in architectural forms.

I am still a rank beginner at AFs, but I had a real breakthrough when I got my hands on some real data, the documents from the Kona proposal, which were created using architectural forms. These documents can be found at "http://www.mcis.duke.edu/standards/HL7/committees/sgml/kona.htm". The example that helped me the most was the Surgical Pathology example.
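
The attribute-search idea mentioned above can be sketched roughly as follows, in Python. This is a minimal sketch and not an AF engine: it assumes the architectural attribute is already present on each element (as it would be once a parser has applied fixed attribute defaults from a DTD), and the attribute name HL7, the element names and the sample data are all invented for illustration; they are not taken from the Kona documents.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: each local element type is free to use its own
# name, while the (invented) HL7 attribute says which architectural
# form the element conforms to.
SAMPLE = """\
<report>
  <patient-name HL7="name">Jane Doe</patient-name>
  <finding HL7="observation">benign</finding>
  <note>not mapped to the architecture</note>
  <dx HL7="observation">no malignancy</dx>
</report>
"""


def elements_with_form(root, arch_attr, form):
    """Return every element whose architectural attribute names `form`.

    The element's own name (its GI) is ignored; only the attribute
    value is used to select it.
    """
    return [el for el in root.iter() if el.get(arch_attr) == form]


root = ET.fromstring(SAMPLE)
for el in elements_with_form(root, "HL7", "observation"):
    print(el.tag, "->", el.text)
```

The query finds both the finding and the dx elements even though their names differ, which is the kind of search a group interested in one architecture could run without knowing each document's local DTD.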
Jonathan

***************************************************************************
Jonathan Robie  jwrobie@mindspring.com  http://www.mindspring.com/~jwrobie
POET Software, 3207 Gibson Road, Durham, N.C., 27703  http://www.poet.com
***************************************************************************

From papresco at technologist.com Thu Oct 2 03:36:08 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:32 2004
Subject: AFs and the DPH (Was Animal Friends, etc.)
References: <3.0.1.16.19971002003147.2dcf0a06@pop3.demon.co.uk>
Message-ID: <3432FA9D.1E38B608@technologist.com>

Peter Murray-Rust wrote:
> Just one more player, please - the DPH (desperate Perl Hacker)... whom I
> shall try to represent :-) [For those not on xml-sig, this mythical
> character is assumed to be an eager customer for XML, and ignorant of SGML.
> Everyone on XML-SIG used to agree that XML must be accessible to the DPH -
> I hope that's still true. If it's not - and that XML is a commercial-only
> enterprise - then I'll rethink things.]

XML must be useful to the DPH for simple processing. Nobody has ever claimed that (s)he would be able to do anything that XML could do. That's why we are even still *considering* parameter entities. Nobody expects the DPH to handle them. They should just ignore them and process the content of the document to do whatever massaging or transformation they wanted to do.

> As a hacker I know very little about AFs, but have been trying to find out
> more from the SG/XML community. Many people (e.g.
> Eliot Kimber and James
> Clark) have been very helpful but I suffer from not having:
> - a beginner's introduction to AFs on the WWW
> - simple free working software to see how they work
> [If I'm wrong in this, I'll be delighted.]

nsgmls is simple, free working software that supports AFs nicely. I'm sure Eliot has written a hundred tutorials. The reason that AFs are hard to learn (in my opinion) is that they mix concepts that in most people's experience are separate: interfaces and transformations. If you think of them only as interface inheritance, you miss half of it. If you think of them only as transformations you also miss some of it.

> - AFs allow (partial) mapping of one data structure (in SGML) to another.
> There are restrictions on mapping inconsistent content models, for example.

As with any non-Turing complete transformation technology. (although you could imagine a transformation technology that is more powerful than AFs but not Turing complete)

> - among the benefits of AFs are that you can manage different 'namespaces'
> and can alias elements or attributes

Yes and no. Imagine you use AFs to map from a CML document to an HTML document (if their models are close enough). Then obviously the element type names in the latter are in a different namespace (the HTML DTD) than the names in the former (the TEI DTD). Usually when you think about "managing namespaces" you are looking for some mechanism of *combining them* and making a single, combined namespace, not merely providing alternate names for every element, one in one namespace and another in another namespace.

> - AFs are defined in standard SGML and parsers parse them 'without
> realising what they are'

That's true. The parser just passes them to the AF engine as attributes.

> - an AF expert can do very elegant things with AFs.

Elegant? I can't say, it's too subjective. Powerful and convenient? Yes.
> However, all the benefits from AFs have to be realised by having an
> AF-aware processor (I differentiate parser from processor - an ESIS stream
> could be input into an AF-aware processor). My understanding is that:
> - there are no freely available AF processors

NSGMLS (and thus Jade) embeds one.

> - generic AF processors are beyond the ability (or at least the time) for
> a DPH to write from scratch

I dunno. Since only a few people have done it I don't know much about the algorithms that are necessary. A simple AF subset would probably be pretty easy. It would be a simple 1 to 1 mapping from elements in one DTD to elements in another. What's so difficult about that?

> Therefore AFs are only available to largish groups with time and/or
> money... This rules out the DPH.

Not true. I use them by myself for tiny little projects. The smaller the project the more I benefit from them. Writing a full transformation program in Python or Perl is cost effective for large projects with many documents conforming to a single DTD. But if I write a DTD for one document then I don't want to go to that hassle so I just use AFs to do the mapping for me.

> I have the impression that AFs play the same sort of meta-role as
> interfaces in Java. They organise things for you but don't actually write
> any code for you! So a pre-requisite is an AF-engine.

I'm not sure what you mean by that. Of course you need an AF-smart processor to use them just as you need an XSL processor to use XSL.

> These general problems come up with the XLL spec, where there is mapping of
> attributes and elements. Maybe an AF engine makes implementing XLL easier -
> certainly there is quite a lot of implied architecture in the spec.

XLL *is* an architecture. If you understand XLL then you understand architectures. Other concepts are just more subtle variations on the same concept. The XLL standard defines some elements that describe an "interface".
You can declare that elements conform to that interface through an attribute. An XLL-smart processor ignores the GI and works with the element through the interface -- as if it were really an instance of the element type defined in the XLL standard. That's an architecture. The more subtle variations allow mapping attributes to elements and vice versa and other things like that.

Paul Prescod

From cbullard at hiwaay.net Thu Oct 2 03:48:10 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:58:32 2004
Subject: XML-Data: advantages over DTD syntax?
References: <199710010000.RAA09002@mehitabel.eng.sun.com> <3431BCED.3DCA@hiwaay.net> <1029.199710010913@grogan.cogsci.ed.ac.uk>
Message-ID: <3432FD25.5680@hiwaay.net>

Henry S. Thompson wrote:
>
> Without responding to the details of Murray and Len's exchange, two
> points:
>
> 1) Murray is of course right that the logic of content models is not
> made any easier by changing their notation;

Nor is it made harder.

> 2) Complexity is in the eye of the beholder, and the advantage to the
> user of being able to use the SAME graphical UI to construct both
> instance and schema might be taken to operate in favour of the schema
> approach in this area.

Complexity is in the work required to create the artifact. One might also argue that if it is harder to enter the artifact by hand or read it with that same eye, and that a graphical UI is required to enter it, it is more complex and the advantage of the UI is the crutch to support the complexity.
Anyway, without responding to the details of our exchange, you miss the point: Does XML-Data offer more functionality? If so, then good. If not, then why bother? I thought it did, but I could be wrong.

> Note: I am not now nor have I ever been a Microsoft employee, nor is
> Steve De Rose. Microsoft paid for my trip to Redmond during which
> the foundations for the XML-Data document were laid, but they don't
> own or operate me.

Good to know. I guess I should formally swear my lack of initiation in the Hermetic Order of the Golden Dawn as long as we are forswearing organizations out to conquer Middle Earth. ;-)

On with the show.

len

From cbullard at hiwaay.net Thu Oct 2 04:13:03 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:58:32 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
References: <199710011735.NAA13283@calum.csclub.uwaterloo.ca>
Message-ID: <343302F8.39E4@hiwaay.net>

Paul Prescod wrote:
>
>> Jarle Stabell wrote:
> > I assume most people today won't edit DTDs (either today's version or
> > XML-Data or similar versions) in the "raw" text format. They will of
> > course use tools, visualizing the hierarchy (XML-Data's
> > extends/implements), selecting values from comboboxes etc.
>
> If this is the case then the syntax is irrelevant for those people and
> they are thus not relevant to the discussion of syntax.

I also have to add that in all of the years I have done this sort of work, the argument that "they won't edit this by hand" is the first one to fall apart as soon as the spec is released.
Editors follow slowly and even when they do, the ability to "hack the ASCII" is a capability you should defend with your last breath. This was an argument presented for VRML as well (by Gavin Bell, as a matter of fact). Truth is, we use the editors for construction of complex objects, yes, but we typically debug in ASCII with line numbers.

They will edit the DTDs raw. Count on it. It's actually rather easy to do and that is from some years of doing it with the Bad, Hard, We Hate The Syntax but Love the Concepts parent: SGML. Frequency, occurrence and membership just aren't that hard to grasp.

len

From cbullard at hiwaay.net Thu Oct 2 04:25:47 1997
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun 7 16:58:32 2004
Subject: AFs and the DPH (Was Animal Friends, etc.)
References: <3.0.1.16.19971002003147.2dcf0a06@pop3.demon.co.uk>
Message-ID: <343305EF.4DEE@hiwaay.net>

Peter Murray-Rust wrote:

> As a hacker I know very little about AFs, but have been trying to find out
> more from the SG/XML community. Many people (e.g. Eliot Kimber and James
> Clark) have been very helpful but I suffer from not having:
> - a beginner's introduction to AFs on the WWW
> - simple free working software to see how they work

Check the www.techno.com page and see if the article by Ralph Ferris and Victoria Newcomb is still there. If that isn't available, send email to Steve Newcomb or Ralph and see if it is still around. Folks have been publishing beginner's articles on AFs for some time (including Tag several years ago). Now, of course, AFs were a moving target like XML for that period too.
len

From peter at ursus.demon.co.uk Thu Oct 2 11:05:41 1997
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:32 2004
Subject: AFs and the DPH
In-Reply-To: <1.5.4.32.19971002001807.00ac7cb0@pop.mindspring.com>
Message-ID: <3.0.1.16.19971002094027.325769ee@pop3.demon.co.uk>

At 20:18 01/10/97 -0400, Jonathan Robie wrote:
>At 12:31 AM 10/2/97, Peter Murray-Rust wrote:
[...]
>
>>Many people (e.g. Eliot Kimber and James
>>Clark) have been very helpful but I suffer from not having:
>> - a beginner's introduction to AFs on the WWW
>
>Yes, I have found a few things on the web, but nothing as clear and
>well-written as Microsoft's paper on XML-Data. If someone would just follow
>the organization of Microsoft's paper to present the same information on
>Architectural Forms, that would be a great way to demonstrate that
>Architectural Forms have the same power, and would make it much easier to
>make fair comparisons.
>

Thanks to all who have replied so far. From what has been said it sounds as if AFs are easier to implement and use than appears. Given that XLL could be implemented using an AF mechanism and that many people seem to think that AFs will be important in the future of X*L, is this a useful time to think about what is required for their implementation (ideally for me in Java)? If it really is within DPH ability, this shouldn't be a major job.

May I reduce my ignorance further by asking some simple questions:
 - must a DTD (or at least an ATTLIST) always be provided with the document instance?
- if so, how is this information going to be transmitted to the AF-aware processor. Will Xapi-J do this?
- is any other information required, or can the processor deduce from the values transmitted that this is an AF?
- if other information is required, how is it to be included (does SP require additional arguments/input, for example)
- is the working of the processor completely automatic?
- what is the output of the processor? (a grove?). Can it be represented by an ESIS stream?

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From ricko at allette.com.au Thu Oct 2 11:17:39 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:33 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <199710020922.TAA06639@jawa.chilli.net.au>

> From: len bullard
> Anyway, without responding to the details of our exchange, you
> miss the point: Does XML-Data offer more functionality? If
> so, then good. If not, then why bother? I thought it did,
> but I could be wrong.

Someone asked me to state more clearly some deficiencies I see in XML-data. Let me again preface by saying that this is only with the current version of XML-data, not with its intent. However, let no-one be confused that we will be left with a whole additional standard that in a large part replicates SGML, and for which particular gurus will be needed, if they do find things that XML-data can do that SGML cannot.

1) Non-standardness: XML-data is proprietary.
2) Not generalized: Presumably XML-data is being developed to solve some real problem (they wouldn't be paying someone to come up with good ideas that just sound good, would they?). Since we do not have access to the problems that XML-data is supposed to solve, we have no way of testing whether it does indeed provide a good generalized approach that is significantly better or more flexible than SGML.

(Reading between the lines, I think XML-data may be targeted at retrofitting slack HTML documents with inline generalized markup. In other words, it probably has the assumption that we cannot use DTDs, because the target data is so slack that no DTD is possible. So they need something that can just mark up parts of the data. This is an interesting problem, and one that ISO 8879 clearly does not address, except by external HyTime/XLL pointers into data, I guess. Can anyone in the XML-data conspiracy confirm this? It is a wild guess :-)

3) Not markup: The approach of not separating out what can be known about data ahead of time (or for every instance) from the details of the particular instance is justified under the slogan "all metadata is data", which avoids the question "should data that has different significance be marked-up differently?". XML-data removes this difference between declarations and markup (so everything is a declaration, in a way). This has a major impact on what can be known about a document type for system builders: it prevents "precompiled" applications. If the XML-data declaration-elements get a flag or a something to say "this can be pre-compiled" then you just return to having a special markup convention, just like ISO 8879 declarations.

4) Not a human-readable language. Content models are simple, terse and convenient. Of course, diagrams etc are simpler. But XML-data's verbosity will make reading any kind of lengthy DTD more complicated.

For some other particular technical issues:

a) ANY is not defined. Is it the same as SGML's ANY?
Is it an error if you extend the content model of an ANY base element?

b) Order of extended element content models is not clear. If I derive a "cat" element type from "animal" element type and say it can also contain a "purr-volume" element type, is there any way of constraining where this element can go? Ordering information in content models is vital for many processing applications, and for integrity. Can such additional element types go anywhere, or only at the end, or where?

c) Is there any way of preventing derived elements from adding additional tags in particular places? I think one weakness in current SGML is the lack of a #ANY keyword for use inside content models, e.g. I do think this is a much better solution than anything XML-data currently proposes. XML-data really does not seem to have thought of content models as being a tool to manage information (i.e. to frustrate well-intentioned users from adding inappropriate elements in mission-critical places), just to describe it.

Rick Jelliffe

From ak117 at freenet.carleton.ca Thu Oct 2 15:40:08 1997
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:58:33 2004
Subject: AFs and the DPH (Was Animal Friends, etc.)
In-Reply-To: <3.0.1.16.19971002003147.2dcf0a06@pop3.demon.co.uk>
References: <1.5.4.32.19970930111016.009ead94@pop.mindspring.com> <3.0.1.16.19971002003147.2dcf0a06@pop3.demon.co.uk>
Message-ID: <199710021338.JAA00476@unready.microstar.com>

Peter Murray-Rust writes:

> However, all the benefits from AFs have to be realised by having an
> AF-aware processor (I differentiate parser from processor - an ESIS stream
> could be input into an AF-aware processor). My understanding is that:
> - there are no freely available AF processors
> - generic AF processors are beyond the ability (or at least the time) for
> a DPH to write from scratch
>
> Therefore AFs are only available to largish groups with time and/or
> money... This rules out the DPH.

Actually, none of these statements is true. I am working on a very large project where people are using Omnimark and even ACL to work with architectural forms. Furthermore, the SP-family of programs -- nsgmls, jade, sgmlnorm, etc. -- have a full-featured architectural engine built right in.

In the simplest sort of architectural processing, you simply designate an attribute name -- say, Biblio -- and take actions based on the attribute's value for different element types. For example, with ... you would note that the value of the Biblio attribute is "entry" (or, more generally, that "entry" is the architectural form of this element) and process the contents accordingly. With ... you would note that the architectural form is "author", and process accordingly.

In a different document type, you might have ... However, since the value of the Biblio attribute is the same, you should be able to process it with the same code.
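To make the idea concrete, here is a minimal sketch in Python (it is not part of the original post). The element names and the two sample documents are hypothetical; the only things taken from the description above are the Biblio attribute and its "entry" and "author" values. It uses the standard xml.etree.ElementTree module and reads only attributes written explicitly on the elements.

```python
import xml.etree.ElementTree as ET

# Two hypothetical document types: the element names differ,
# but the Biblio attribute carries the same architectural forms.
DOC_A = ('<refs><item Biblio="entry">'
         '<who Biblio="author">David Megginson</who></item></refs>')
DOC_B = ('<bib><citation Biblio="entry">'
         '<writer Biblio="author">David Megginson</writer></citation></bib>')

ARCH_ATTR = "Biblio"  # the designated architectural form attribute


def architectural_view(xml_text, arch_attr=ARCH_ATTR):
    """Return (form, text) pairs for each element that carries arch_attr.

    Elements are visited in document order; element type names are ignored,
    so the same code serves both document types.
    """
    root = ET.fromstring(xml_text)
    view = []
    for elem in root.iter():
        form = elem.get(arch_attr)
        if form is not None:
            view.append((form, (elem.text or "").strip()))
    return view


if __name__ == "__main__":
    for form, text in architectural_view(DOC_A):
        print(form, repr(text))
```

Because the code dispatches on the attribute value rather than on the element name, both sample documents yield the same list of (form, text) pairs.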
A very easy way to set this up is to use #FIXED attributes declared in the DTD: That way, when an author includes David Megginson your software will see David Megginson

Of course, if you are doing DTD-less parsing (ick), you can specify the architectural form attribute values explicitly on the elements, as in the earlier examples.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From ht at cogsci.ed.ac.uk Thu Oct 2 15:54:22 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:58:33 2004
Subject: XML-Data: advantages over DTD syntax?
In-Reply-To: len bullard's message of Wed, 01 Oct 1997 20:47:17 -0500
References: <199710010000.RAA09002@mehitabel.eng.sun.com> <3431BCED.3DCA@hiwaay.net> <1029.199710010913@grogan.cogsci.ed.ac.uk> <3432FD25.5680@hiwaay.net>
Message-ID: <1468.199710021354@grogan.cogsci.ed.ac.uk>

Len writes:

> Henry S. Thompson wrote:
> >
> Anyway, without responding to the details of our exchange, you
> miss the point: Does XML-Data offer more functionality? If
> so, then good. If not, then why bother? I thought it did,
> but I could be wrong.

Why bother is because notation actually matters. I repeat my previous point: why doesn't Java restrict itself to one boolean operator, namely XOR? All logical functions can be expressed with it. Does including AND, OR and NOT offer more functionality? Since not, then why bother? Answer, because matching the notation to the intended use improves understanding, maintainability and ease of realising design goals.
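The notational trade-off is easy to see in code. One caveat, offered as a correction rather than as something in the message: XOR by itself cannot express AND or OR, so the sketch below assumes NAND, the usual single-operator example. The function names are made up for illustration. Everything is expressible, but the derived forms are clearly harder to read than the familiar operators.

```python
from itertools import product


def nand(a: bool, b: bool) -> bool:
    """The single primitive operator assumed for this sketch."""
    return not (a and b)


# Every other connective written using nand alone.
def not_(a):
    return nand(a, a)


def and_(a, b):
    return nand(nand(a, b), nand(a, b))


def or_(a, b):
    return nand(nand(a, a), nand(b, b))


def xor_(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


# Exhaustive check against Python's built-in operators.
for a, b in product([False, True], repeat=2):
    assert not_(a) == (not a)
    assert and_(a, b) == (a and b)
    assert or_(a, b) == (a or b)
    assert xor_(a, b) == (a != b)
```

Nothing is gained in expressive power by adding AND, OR and NOT to such a language; what is gained is that `a and b` says what it means, which is the point being argued for XML-Data's notation.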
In the area of document grammar specification, XML-Data offers no functionality which cannot be duplicated by extensive use of parameter entities, or at worst tedious by-hand expansion. But it provides that functionality in a transparent (in some cases MUCH more transparent) way, and in my view that makes it worth bothering.

>
> > Note: I am not now nor have I ever been a Microsoft employee, nor is
> > Steve De Rose. Microsoft paid for my trip to Redmond during which
> > the foundations for the XML-Data document were laid, but they don't
> > own or operate me.
>
> Good to know. I guess I should formally swear my lack of
> initiation in the Hermetic Order of the Golden Dawn as long
> as we are forswearing organizations out to conquer Middle Earth. ;-)

Yes, but what about the Trans-Bulgarian Women's Quiltmakers' Club? :-)

Honest, in 1967 this appeared along with 100s of other more plausible organisations such as American Friends of the Abraham Lincoln Brigade in the list on the back of the US Gov't Contractors' security clearance form, as something membership in which you had to deny to get the clearance.

ht

From ak117 at freenet.carleton.ca Thu Oct 2 16:54:04 1997
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:58:33 2004
Subject: Proposal: Architectural Forms in XML
Message-ID: <199710021453.KAA00319@unready.microstar.com>

Perhaps it would help along the namespace/XML-data/architectural-form discussion if we had a concrete proposal for architectural forms in XML.
I gave a talk on architectural forms at the XML Dev Day last August, and sent around a handout which included a complete syntax for implementing XML architectural forms. I have done a quick, rough (and manual) conversion of the handout from LaTeX to HTML and published it to the following URL:

http://home.sprynet.com/sprynet/dmeggins/xml-arch.html

This proposal allows for well-formed architectural parsing, avoids the need for data attributes, and provides reasonable defaults for different levels of processing.

That said, this is just a handout, not really a proper spec (and it probably also contains embarrassing typos). Before I invest the time in writing a full spec, I'd like to know what kind of interest there is in this and what reactions people have to the details.

Please note also that I am currently presenting this as a private proposal, and am not acting on behalf of my employer, Microstar, in any official capacity.

Thanks, and all the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From eliot at isogen.com Thu Oct 2 18:14:27 1997
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 16:58:33 2004
Subject: AFs and the DPH
References: <3.0.1.16.19971002094027.325769ee@pop3.demon.co.uk>
Message-ID: <343356A7.76CF@isogen.com>

Peter Murray-Rust wrote:

> May I reduce my ignorance further by asking some simple questions:
> - must a DTD (or at least an ATTLIST) always be provided with the document
> instance?
Only if you want to avoid putting the mapping attributes in start tags (moral equivalent of qualifying names ala colonization) or want to use element types that are different from the architectural names (remembering that by default, if an element has the same GI as a form in the active architecture, it is mapped to it automatically). The architecture mechanism was designed when you always had attribute declarations, so it is optimized for reducing instance syntax by providing attribute list declarations.

NOTE: If the architecture's meta-DTD is identical to what would be the document's DTD if it had one (for documents without DTDs), then all mapping is automatic and there's no need for additional attributes in the instance. In other words, given a document with an explicit DTD, you can remove the DTD, make it an architectural meta-DTD, and get the same processing result. This is why I think architectures are key to the success of XML: it lets you eat the cake of DTD-less documents and still have it (because the architecture processing gives you all the validation and processing you need, but only when you want it and not when you don't).

> - if so, how is this information going to be transmitted to the AF-aware
> processor. Will Xapi-J do this?

I'm not sure what you mean by 'this information'. Do you mean the mapping itself? If the attributes are declared or specified, they're simply part of the properties of the elements and any AF-aware processor can examine the attributes to look to see if there are any it recognizes. Automatic mapping is slightly more work, because you have to know what form you are looking for (either because you have hard-coded it into your processing (e.g., if (gi == 'some-form') {}) or because you are also looking at the meta-DTD). In the simple case, your AF-aware processor is expecting certain element forms and attributes and simply looks for them, rather than trying to do generalized architecture processing.
This is functionally equivalent to having a processor tied to a particular DTD except that you look first for architectural attributes and *then* at GIs, rather than starting with GIs.

Any abstract API (like Xapi-J) can be usefully enhanced to make getting architecture-specific properties easier. For example, in the work I've done with ADEPT*Editor, I created a set of functions to resolve architectural mappings--these functions could easily be provided by ADEPT out of the box. Likewise, any sufficiently complete document API probably provides primitives that can be combined to provide architecture-support functions--you can either do it yourself (as I have for ADEPT and DSSSL) or make them part of the base API. The set of core functions is fairly small:

- ArchFormOf - Returns the form, if any, of an element for a given arch. Applies architectural automapping rules.
- IsArchitectural? - Returns true if the element is architectural for an arch
- LocalAttributeNameFor - Given an architectural attribute name, returns the name of the attribute of the element that is mapped to the architectural (i.e., resolves architectural attribute name remapping).
- ArchAttValue - Returns the value of a given architectural attribute. Returns the architecture-defined default (if known, either because the meta-DTD is available or because knowledge of the architecture is hard-coded somewhere).
- ArchContentOf - Resolves architectural content remapping
- IsArchForm? - Returns true if a given element is of the specified form

With these functions, it's pretty easy to do architecture-aware processing just as you do DTD-aware processing, e.g.:

    # Form names are strings, so compare with eq (== would compare numerically).
    $archform = &ArchFormOf($current_node, 'XML-LINK');
    if ($archform eq 'SIMPLE') {
        print STDERR "Found a simple link element\n";
    } elsif ($archform eq 'EXTENDED') {
        print STDERR "Found an extended link element\n";
    }

Versions of these functions are provided as part of the hy-lib.pl Perl package mentioned below.
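For readers who want to see how little code the hard-wired case needs, here is a rough Python sketch of four of the functions listed above. It is not hy-lib.pl and not any product's API. The Architecture class, the XML-LINK-NAMES renamer attribute, the "architectural-name local-name" order of the renaming pairs, and the SHOW default are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Architecture:
    """Hard-coded knowledge of one architecture (all names are illustrative)."""
    form_attr: str                  # attribute naming the form
    renamer_attr: str               # attribute holding "arch-name local-name" pairs
    forms: frozenset = frozenset()  # form names known from the meta-DTD
    defaults: dict = field(default_factory=dict)  # (form, arch attr) -> default


def arch_form_of(elem, arch):
    """Explicit mapping wins; otherwise automap when the GI is a known form."""
    form = elem.get(arch.form_attr)
    if form is not None:
        return form
    return elem.tag if elem.tag in arch.forms else None


def is_architectural(elem, arch):
    return arch_form_of(elem, arch) is not None


def local_attribute_name_for(elem, arch, arch_att):
    """Resolve attribute renaming; fall back to the architectural name."""
    tokens = elem.get(arch.renamer_attr, "").split()
    pairs = dict(zip(tokens[0::2], tokens[1::2]))
    return pairs.get(arch_att, arch_att)


def arch_att_value(elem, arch, arch_att):
    """Value of an architectural attribute, else the architecture's default."""
    value = elem.get(local_attribute_name_for(elem, arch, arch_att))
    if value is None:
        value = arch.defaults.get((arch_form_of(elem, arch), arch_att))
    return value


# Hypothetical architecture description and sample document.
XML_LINK = Architecture(
    form_attr="XML-LINK",
    renamer_attr="XML-LINK-NAMES",
    forms=frozenset({"SIMPLE", "EXTENDED"}),
    defaults={("SIMPLE", "SHOW"): "REPLACE"},  # invented default
)
DOC = ET.fromstring(
    '<doc>'
    '<xref XML-LINK="SIMPLE" XML-LINK-NAMES="HREF target" target="a.xml"/>'
    '<SIMPLE HREF="b.xml"/>'
    '<p/>'
    '</doc>'
)

if __name__ == "__main__":
    for elem in DOC:
        print(elem.tag, arch_form_of(elem, XML_LINK),
              arch_att_value(elem, XML_LINK, "HREF"))
```

In this sketch the xref element is mapped explicitly and renames HREF to target, the SIMPLE element is automapped by its GI, and p is not architectural at all. Content remapping (ArchContentOf) is left out because it needs more knowledge of the meta-DTD than a hard-wired table provides.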
> - is any other information required, or can the processor deduce from the
> values transmitted that this is an AF?

With a few reasonable assumptions, yes, the attributes alone are sufficient. For completely general architectural processing, you need to either have built-in knowledge of the architecture meta-DTD (e.g., you have a hard-wired HyTime-aware processor like Panorama) or you also process the meta-DTD (like the code I posted recently to the Arbortext mailing list for doing generalized architectural processing with ADEPT*Editor).

> - if other information is required, how is it to be included (does SP
> require additional arguments/input, for example)

The attributes alone are sufficient for quick-and-dirty, hardwired processors (such as my hy-lib.pl Perl code [now quite out of date, but still illustrative of DPH architecture processing], which can be found at "www.isogen.com/demos").

To be more generalized, you need the architecture declaration attributes (provided as data attributes of the architecture notation in the AFDR-defined approach). These attributes tell you what the attribute names are for the attributes used in the document, such as the name of the attribute that specifies the form mapping, the renaming attribute, and so on. You need to process these attributes when you want to process documents that don't use the defaults. SP provides this processing automatically as part of its generalized architectural processing. Again, for the simple case, you can just require people to use the defaults and not worry about it. This is what Panorama does.

> - is the working of the processor completely automatic?

I'm not sure what you mean by 'working' in this case.

> - what is the output of the processor? (a grove?). Can it be represented
> by an ESIS stream?
The abstract architectural processing model is one in which there are at least two groves: the one constructed from the parsing of the client document and the one constructed from the architectural instance. The nodes in the "architectural instance grove" have pointers back to the nodes in the client document grove from which they were derived, so that when processing the architectural grove you can get back to the client document grove. In this model, you can process either grove and always get whatever information you need about the other. The GroveMinder product, being developed by TechnoTeacher, provides this grove model, for example.

Because architectural processing happens *after* parsing, ESIS isn't really relevant, because ESIS just tells you about the parse result, from which you either have enough information to do the architectural processing (at a minimum, the values of attributes) or you don't. The only question is one of completeness: does your ESIS output include everything from the original document you need? For example, if you want to do complete architectural processing, you need to get the attributes declared for the architecture notation, but many systems don't provide data attributes unless they're associated with a particular entity, which architecture notations are not. For the simplest, "I'm just looking for attributes" case, normal ESIS is sufficient to enable architecture-aware processing.

Cheers,

Eliot

From ricko at allette.com.au Thu Oct 2 18:34:56 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:33 2004
Subject: AFs and the DPH
Message-ID: <199710021639.CAA20160@jawa.chilli.net.au>

> From: W. Eliot Kimber
> This is why I think architectures are key to
> the success of XML: it lets you eat the cake of DTD-less documents and
> still have it (because the architecture processing gives you all the
> validation and processing you need, but only when you want it and not
> when you don't).

This seems a very good and important point. If the problem is how to represent occasional structures in well-formed documents, then AFs represent an external form, XML-data represents an inline form, and ISO 8879 declarations represent a header form.

But, I think that a document with AFs cannot be regarded as being declaration-less, since either the declarations have to be implicitly built into the application, or be explicit in the form of a DTD outside.

The horrible thing is that, of course, there is no reason why an XML-data schema could not itself be a meta-DTD! I think the issue of direct modelling (SGML templates or XML-data) versus indirect modelling (AFs) should be distinguished from the issue of the goodness of ISO 8879 declaration syntax versus XML-data non-standard syntax. AFs, as a mechanism, are syntax-neutral to a great extent.

Rick Jelliffe

From peter at ursus.demon.co.uk Thu Oct 2 19:48:15 1997
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:33 2004
Subject: AFs and the DPH
In-Reply-To: <343356A7.76CF@isogen.com>
References: <3.0.1.16.19971002094027.325769ee@pop3.demon.co.uk>
Message-ID: <3.0.1.16.19971002183059.3487e60a@pop3.demon.co.uk>

Many thanks to all who have contributed - please keep doing so. I am gradually attaining enlightenment. [I think unless you have actually practised AFs it's quite difficult for some people to pick them from the abstract description.] The benefit of doing it here is that the explanations are archived :-)

At 09:09 02/10/97 +0100, W. Eliot Kimber wrote:
>Peter Murray-Rust wrote:
>
>> May I reduce my ignorance further by asking some simple questions:
>> - must a DTD (or at least an ATTLIST) always be provided with the document
>> instance?
>
>Only if you want to avoid putting the mapping attributes in start tags
>(moral equivalent of qualifying names ala colonization) or want to use
>element types that are different from the architectural names
>(remembering that by default, if an element has the same GI as a form in
>the active architecture, it is mapped to it automatically).

This was also made very clear in David Megginson's proposal - thanks.

> [...]
>
>> - if so, how is this information going to be transmitted to the AF-aware
>> processor. Will Xapi-J do this?

What I meant was - 'If a document + all associated components (DTDs, PIs, etc.) has been processed by an Xapi-J compliant tool, will the information recoverable from that be enough to show that AF-processing is required and how to do it?'
Taking DavidM's syntax (which is XML compatible and where the AF-ness is indicated by PIs), it seems the answer is 'yes' :-)

>
>I'm not sure what you mean by 'this information'. Do you mean the
>mapping itself? If the attributes are declared or specified, they're
>simply part of the properties of the elements and any AF-aware processor
>can examine the attributes to look to see if there are any it
>recognizes.
 ^^^^^^^^^^^

This implies that the AF-processor is either hardcoded to a particular AF (I suppose XML-LINK might fall into this category, but I'd feel unhappy for any more specific hardcoding), or a general AF-process is fed a list of the attributes it needs to look out for.

>Automatic mapping is slightly more work, because you have
>to know what form you are looking for (either because you have
>hard-coded it into your processing (e.g., if (gi == 'some-form') {}) or
>because you are also looking at the meta-DTD). In the simple case, your
>AF-aware processor is expecting certain element forms and attributes and
>simply looks for them, rather than trying to do generalized architecture
>processing. This is functionally equivalent to having a processor tied
>to a particular DTD except that you look first for architectural
>attributes and *then* at GIs, rather than starting with GIs.

How does DavidM's proposal fit into this where - presumably - you indicate what you are looking for via PIs? Is this also the way that SP works? If so, could there be an agreed set of PIs?

>
>Any abstract API (like Xapi-J) can be usefully enhanced to make getting
>architecture-specific properties easier. For example, in the work I've

Is this something worth doing?

[...]
>
>With these functions, it's pretty easy to do architecture-aware
>processing just as
>you do DTD-aware processing, e.g.:
>
>$archform = &ArchFormOf($current_node, 'XML-LINK');
>if ($archform eq 'SIMPLE') {
> print STDERR "Found a simple link element\n";
>} elsif ($archform eq 'EXTENDED') {
> print STDERR "Found an extended link element\n";

I'm not quite clear how this differs from simply processing attribute values (which is what I do at present).

[...]

P.

>

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From smith at Adobe.COM Thu Oct 2 20:51:35 1997
From: smith at Adobe.COM (Bruce T. Smith)
Date: Mon Jun 7 16:58:33 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
In-Reply-To: <343302F8.39E4@hiwaay.net>
References: <199710011735.NAA13283@calum.csclub.uwaterloo.ca>
Message-ID: <3.0.3.32.19971002114849.0099f100@mail-345.corp.adobe.com>

At 09:12 PM 10/1/97 -0500, len bullard wrote:
>Paul Prescod wrote:
>>
>>> Jarle Stabell wrote:
>
>> > I assume most people today won't edit DTDs (either today's version or
>> > XML-Data or similar versions) in the "raw" text format. They will of
>> > course use tools, visualizing the hierarchy (XML-Data's
>> > extends/implements), selecting values from comboboxes etc.
>>
>> If this is the case then the syntax is irrelevant for those people and
>> they are thus not relevant to the discussion of syntax.
>
>I also have to add that in all of the years I have done this
>sort of work, the argument that "they won't edit this by hand"
>is the first one to fall apart as soon as the spec is released.
>Editors follow slowly and even when they do, the ability to
>"hack the ASCII" is a capability you should defend with your
>last breath. This was an argument presented for VRML as
>well (by Gavin Bell, as a matter of fact). Truth is, we
>use the editors for construction of complex objects, yes,
>but we typically debug in ASCII with line numbers.

I think Len's reaction is a bit extreme. I may debug in ASCII with line numbers, but I don't edit with ed or EDLIN. If I wanted to hack lots of HTML or SGML or VRML with a text editor, I'd choose emacs or some other editor that was aware of syntactic structures.

There's a big difference between saying that syntax doesn't have to be convenient to type in a dumb editor and abandoning ASCII altogether. It looked to me like Jarle was just making the former claim.

_ Bruce

From papresco at technologist.com Thu Oct 2 21:00:17 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:33 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
References: <199710011735.NAA13283@calum.csclub.uwaterloo.ca> <3.0.3.32.19971002114849.0099f100@mail-345.corp.adobe.com>
Message-ID: <3433EF58.470C646D@technologist.com>

Bruce T. Smith wrote:

> I think Len's reaction is a bit extreme. I may debug in ASCII with line
> numbers, but I don't edit with ed or EDLIN.
> If I wanted to hack lots of
> HTML or SGML or VRML with a text editor, I'd choose emacs or some other
> editor that was aware of syntactic structures.

Sure, but Emacs does not hide verbosity. The complaint with XML-DATA is primarily verbosity. All of the XML syntax buries the information. The more dense the information you are working with, the smaller you want your delimiters. XML tags are , delimiters.

Paul Prescod

From papresco at technologist.com Thu Oct 2 21:02:50 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:33 2004
Subject: AFs and the DPH
References: <3.0.1.16.19971002094027.325769ee@pop3.demon.co.uk> <3.0.1.16.19971002183059.3487e60a@pop3.demon.co.uk>
Message-ID: <3433EFEE.26029AA9@technologist.com>

Peter Murray-Rust wrote:

> I'm not quite clear how this differs from simply processing attribute
> values (which is what I do at present).

The difference is that you expect a step *between* parsing and application processing. Typically the application doesn't care about what the original forms of the elements are, so it just gets a stream of architectural events instead of a stream of events directly from the XML parser.

Paul Prescod

From ricko at allette.com.au Thu Oct 2 21:44:53 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:33 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <199710021949.FAA22832@jawa.chilli.net.au>

> From: Henry S. Thompson
> In the area of document grammar specification, XML-Data offers no
> functionality which cannot be duplicated by extensive use of parameter
> entities, or at worst tedious by-hand expansion. But it provides that
> functionality in a transparent (in some cases MUCH more transparent)
> way, and in my view that makes it worth bothering.

I call as exhibit #1 FrameMaker's EDD (element description definition???) format. The developers of XML-data should look hard at it, and the lessons to be drawn from it. It seems to have been conceived as a better SGML than SGML (Frame also had an additional requirement to embed structure into their interchange format too). It is more friendly/verbose than SGML's declarations, provides slightly more expressivity, slightly better attribute types, and includes style specification for element-types-in-context. It also has a cool syntax-directed editor. It is certainly far more developed than XML-data (in that it has been developing and in use for a few years now), though it does not use XML elements for declarations nor use inheritance mechanisms.

I have watched EDD with interest, and the first comment to make is that they have had to match SGML's primitive capabilities, even in things that did not seem requirements to them at first. XML-data would have to do the same, I'd expect, unless it offers significant benefits in some new area.
EDD has become a very large product, but I found that using it (in the recent FrameMaker+SGMLs) was a little tedious, in that it did not offer advantages enormous enough to warrant duplicating SGML declarations in its own syntax. For exhibit #2, I call the Pinnacles or DOCBOOK DTDs, expressed in XML-data. Can someone whip it up, and we can get a much better feel for how readable it is as a declaration syntax for a nice juicy DTD? The number of derived element types will probably be much fewer than the number of base element types, surely. Without exhibit #2, I really don't feel comfortable making claims that XML-data is verbose (or reading claims that it is more transparent!) Rick Jelliffe From cbullard at hiwaay.net Fri Oct 3 05:13:07 1997 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 16:58:33 2004 Subject: XML-Data: advantages over DTD syntax? References: <199710020922.TAA06639@jawa.chilli.net.au> Message-ID: <34346285.1C8A@hiwaay.net> Rick Jelliffe wrote: > > 2) Not generalized: Presumably XML-data is being developed > to solve some real problem (they wouldn't be paying someone > to come up with good ideas that just sound good, would they?). > Since we do not have access to the problems that XML-data is > supposed to solve, we have no way of testing whether it does > indeed provide a good generalized approach that is significantly > better or more flexible than SGML. That is the requirements problem. > (Reading between the lines, I think XML-data may be targeted > at retrofitting slack HTML documents with inline generalized > markup. Hmm, could be.
It may also be applicable to metatagging databases for template designs in relational systems that include fragments for tables. Look at the template concepts in the latest developers mag for MS users. Look: this IS the problem with no requirements. We have no real way of knowing why a different schemata of the same functional capability is required. That is why I keep asking: what does it do we can't do with XML and a DTD? I'm not fighting it: I want to know why. > In other words, it probably has the assumption that > we cannot use DTDs, because the target data is so slack that > no DTD is possible. So they need something that can just > markup parts of the data. This is an interesting problem, > and one that ISO 8879 clearly does not address, except by > external HyTime/XLL pointers into data, I guess. Well, that is an interesting problem then. If we want to go out there a bit further, we can talk about it as term vectors, velocity, voxels, etc. We can have a self-adaptive data structure that automatically classifies elements by spatial distribution (see works of Mathew Chalmers). As a matter of fact, the use of DTDs or whatever schema could make his algorithms using force vectors work even better by reducing the stress factors prior to the computational cycles. That actually works. Easy to navigate to in 3D. Do I want really lossy schemata with regards to frequency and occurrence ranges to do that? No, I don't think so. > Can anyone in the XML-data conspiracy confirm this? It is > a wild guess :-) I think oligarchy is the right word, not conspiracy. It is the W3C policy for process. Not a good deal for the community really. Too many of us have to sit on this list with our pens too silent about the discussions, decisions and the reasons. That is neither fair nor optimum.
> 3) Not markup: The approach of not separating out what > can be known about data ahead of time (or for every > instance) from the details of the particular instance > is justified under the slogan "all metadata is data", > which avoids the question "should data that has different > significance be marked-up differently?". A rose is a rose is a rose... kinda meaningless. A rose(1) is a rose(2) is a rose(3) is what she meant. ;-) > 4) Not a human readable-language. Content models are simple, > terse and convenient. Of course, diagrams etc are simpler. > But XML-data's verbosity will make reading any kind of > lengthy DTD more complicated. To me that means parsing in my head as I read the model is harder. However, equivalent expressiveness isn't a sufficient reason to use another notation for the schemata unless other benefits are found that are compelling. I don't think teaching DTDs to XML newbies is going to be that hard. Never was to SGML newbies until I made them read very complex DTDs. Now, here is an interesting issue: when the schema/DTD becomes complex, or an aggregate emerges by automated means (eg, using the force system to classify aggregates), how hard will the schemata of either syntax be to read? IOW, people wanted parameter entities very badly. This may be where they count for something more than string substitution because they are one way to label topical aggregation in an automated classification system such as Chalmers describes. > b) Order of extended element content models is not clear. > If I derive a "cat" element type from "animal" element type > and say it can also contain a "purr-volume" element type, > is there any way of constraining where this element > can go? Ordering information in content models is vital > for many processing applications, and for integrity. > Can such additional element types go anywhere, or only > at the end, or where. That gets to the point of the DTD: frequency and occurrence are explicitly defined.
That is the really powerful part of SGML with regards to a hierarchical instantiation. > c) Is there any way of preventing derived elements from > adding additional tags in particular places? > > I think one weakness in current SGML is the lack of a > #ANY keyword for use inside content models, e.g. > I like that. > XML-data > really does not seem to have thought of content models > as being a tool to manage information (i.e. to frustrate > well-intentioned users from adding inappropriate elements in > mission-critical places), just to describe it. Properties of objects in which the object library already knows the things the DTD would tell it? len From cbullard at hiwaay.net Fri Oct 3 05:22:40 1997 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax? References: <199710010000.RAA09002@mehitabel.eng.sun.com> <3431BCED.3DCA@hiwaay.net> <1029.199710010913@grogan.cogsci.ed.ac.uk> <3432FD25.5680@hiwaay.net> <1468.199710021354@grogan.cogsci.ed.ac.uk> Message-ID: <343464D0.4CA1@hiwaay.net> Henry S. Thompson wrote: > > Len writes: > > Henry S. Thompson wrote: > > > > > Anyway, without responding to the details of our exchange, you > > miss the point: Does XML-Data offer more functionality? If > > so, then good. If not, then why bother? I thought it did, > > but I could be wrong. > > Why bother is because notation actually matters. I repeat my previous > point: why doesn't Java restrict itself to one boolean operator, > namely XOR? All logical functions can be expressed with it. Does > including AND, OR and NOT offer more functionality?
Since not, then > why bother? Answer, because matching the notation to the intended use > improves understanding, maintainability and ease of realising design goals. That use(s) needs to be spelled out better. I repeat: the problem with XML development is still the lack of requirements. If that has to be expressed in terms of "What SGML DTDs Don't Do and We Need", so much the better. Requirements give more people time to plan designs and submit proposals. > In the area of document grammar specification, XML-Data offers > no functionality which cannot be duplicated by extensive use of > parameter entities. But it provides that functionality in a > transparent (in some cases MUCH more transparent) way, and in my view > that makes it worth bothering. Then SGML/XML can do it. I'm not sure transparency isn't an eye of the beholder idea like complexity. Tim notes a big shift in audience SGML literacy. I'm not sure what to make of that remark, but if we are teaching two ways to do the same thing before the first XML spec is even published with regards to schemas, we're screwing up in a big way. I need a better definition of "transparency". That is qualitative, it seems, and all qualitative requirements have to be sold by compelling example because that is often chosen on taste. > Yes, but what about the Trans-Bulgarian Women's Quiltmakers' Club? :-) Hey, my grandmother was a quilter! len From cbullard at hiwaay.net Fri Oct 3 05:57:21 1997 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax?
(and some wishes) References: <199710011735.NAA13283@calum.csclub.uwaterloo.ca> <3.0.3.32.19971002114849.0099f100@mail-345.corp.adobe.com> Message-ID: <34346CEF.1120@hiwaay.net> Bruce T. Smith wrote: > > At 09:12 PM 10/1/97 -0500, len bullard wrote: > >Paul Prescod wrote: > >> > >>> Jarle Stabell wrote: > > > >> > I assume most people today won't edit DTDs (either today's version or > >> > XML-Data or similar versions) in the "raw" text format. They will of > >> > course use tools, visualizing the hierarchy (XML-Data's > >> > extends/implements), selecting values from comboboxes etc. > >> > >> If this is the case then the syntax is irrelevant for those people and > >> they are thus not relevant to the discussion of syntax. > > > >I also have to add that in all of the years I have done this > >sort of work, the argument that "they won't edit this by hand" > >is the first one to fall apart as soon as the spec is released. > >Editors follow slowly and even when they do, the ability to > >"hack the ASCII" is a capability you should defend with your > >last breath. This was an argument presented for VRML as > >well (by Gavin Bell, as a matter of fact). Truth is, we > >use the editors for construction of complex objects, yes, > >but we typically debug in ASCII with line numbers. > > I think Len's reaction is a bit extreme. I may debug in ASCII with line > numbers, but I don't edit with ed or EDLIN. If I wanted to hack lots of > HTML or SGML or VRML with a text editor, I'd choose emacs or some other > editor that was aware of syntactic structures. Beauty of it is, one doesn't have to. Extreme? No. Empirical observation. The claim that "people just won't use text editors" has been made in several XML arguments of late. I think history suggests otherwise. > There's a big difference between saying that syntax doesn't have to be > convenient to type in a dumb editor and abandoning ASCII altogether. It > looked to me like Jarle was just making the former claim. 
It doesn't have to be convenient. It usually is a cut and paste job. It looked to me like "people won't do this" and that claim is usually false. If however, that is precisely what he meant, then I agree. len From cbullard at hiwaay.net Fri Oct 3 06:01:33 1997 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax? (and some wishes) References: <199710011735.NAA13283@calum.csclub.uwaterloo.ca> <3.0.3.32.19971002114849.0099f100@mail-345.corp.adobe.com> <3433EF58.470C646D@technologist.com> Message-ID: <34346DEC.3636@hiwaay.net> Paul Prescod wrote: > > Bruce T. Smith wrote: > > I think Len's reaction is a bit extreme. I may debug in ASCII with line > > numbers, but I don't edit with ed or EDLIN. If I wanted to hack lots of > > HTML or SGML or VRML with a text editor, I'd choose emacs or some other > > editor that was aware of syntactic structures. > > Sure, but Emacs does not hide verbosity. The complaint with XML-DATA is > primarily verbosity. All of the XML syntax buries the information. The > more dense the information you are working with, the smaller you want > your delimiters. XML tags are , delimiters. Right. This is precisely the comment made about SGML DTDs for structured procedural scripting languages. They are "ugly" to look at if one has been a C programmer, and really nasty to type. Witness MID: worked like a charm. Really hard on the typist until reusable fragments emerged. MIL-D-87289 is another example. In fact, many of the modular DTDs that used element name hierarchies to simulate object-qualities had this problem.
Colonized namespaces probably will as well. len From ht at cogsci.ed.ac.uk Fri Oct 3 11:41:38 1997 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax? In-Reply-To: "Rick Jelliffe"'s message of Fri, 3 Oct 1997 05:44:38 +1000 References: <199710021949.FAA22832@jawa.chilli.net.au> Message-ID: <1648.199710030941@grogan.cogsci.ed.ac.uk> Rick writes: > I call as exhibit #1 FrameMaker's EDD (element description definition???) > format. The developers of XML-data should look hard at it, and the > lessons to be drawn from it. It seems to have been conceived as a > better SGML than SGML (Frame also had an additional requirement to > embed structure into their interchange format too). I have no experience with it, but I'll make an effort soon to have a look, thanks. > For exhibit #2, I call the Pinnacles or DOCBOOK DTDs, expressed > in XML-data. Can someone whip it up, and we can get a much better > feel for how readable it is as a declaration syntax for a nice > juicy DTD? The number of derived element types will probably be > much fewer than the number of base element types, surely. Without > exhibit #2, I really don't feel comfortable making claims that > XML-data is verbose (or reading claims that it is more transparent!) Absolutely right. I hope by SGML 97 in Washington to have done exactly this (although I'm likely to use the TEI DTD). ht xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Fri Oct 3 11:50:11 1997 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax? Message-ID: <199710030955.TAA16683@jawa.chilli.net.au> > From: Henry S. Thompson > Absolutely right. I hope by SGML 97 in Washington to have done > exactly this (although I'm likely to use the TEI DTD). Pinnacles might be a better test case, in that (from memory) it has quite a lot of fielded-records element types, and it has its "reflections" system with a big use of IDs. But TEI may be more the target of XML-data, since it is clearly intended to be used as a base for actual DTDs. Anyway, good luck. Rick Jelliffe From papresco at technologist.com Fri Oct 3 19:49:58 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:34 2004 Subject: AFs and the DPH References: <3.0.1.16.19971002094027.325769ee@pop3.demon.co.uk> <343356A7.76CF@isogen.com> Message-ID: <34352F47.D0C63C2A@technologist.com> W. Eliot Kimber wrote: ... > NOTE: If the architectures meta-DTD is identical to what would be the > document's DTD if it had one (for documents without DTDs), then all > mapping is automatic and there's no need for additional attributes in > the instance.
In other words, given a document with an explicit DTD, > you can remove the DTD, make it an architectural meta-DTD, and get the > same processing result. This is why I think architectures are key to > the success of XML: it lets you eat the cake of DTD-less documents and > still have it (because the architecture processing gives you all the > validation and processing you need, but only when you want it and not > when you don't). I don't understand this. How does turning the DTD into an "architectural DTD"* help anything? Just as in ordinary XML you can process the (perhaps architectural) DTD if you want to validate, or skip if you don't want to. > Any abstract API (like Xapi-J) can be usefully enhanced to make getting > architecture-specific properties easier. For example, in the work I've > done with ADEPT*Editor, I created a set of functions to resolve > architectural mappings--these functions could easily be provided by > ADEPT out of the box. Do your functions handle the mapping of attribute nodes to content, content nodes to attributes, minimization and the other neat transformational features of the AFDR? Do you think that the full suite of transformations is appropriate for XML, or merely element to element mappings? In my own thinking I have found it hard to figure out how one would efficiently implement those without having an entire second grove in memory. Let's say you *did* have both groves available. Then you could do a query in the architectural grove for elements that do not really correspond to any particular element in the source grove. How would you do that query *without* having both groves available? Is the only reasonable way to implement archforms to double (or triple, or quadruple, or....) the amount of memory taken up by groves in memory?
Also it seems to me that in architectural forms you can *either* get a single architectural stream, as is the case in the output of Jade, or you can get multiple fully constructed groves (as in GroveMinder), but I wonder if it is possible to do stream-based architectural processing of multiple groves at once? In other words is there a way to build something like an ESIS that provides multiple groves at once and makes links between them? Could you comment on what this stream would look like? I think that it is admirable that archforms work in these two different modes, but I might like to take advantage of the best features of both modes at once -- access to all architectural "views" and stream-based processing. Is this feasible? I think it comes back to my earlier question about emulating multiple groves without building them. This seems feasible to me in the simple case, but my head gets muddled when I start thinking about minimization and content/attribute remapping. Paul Prescod From peter at techno.com Fri Oct 3 22:25:43 1997 From: peter at techno.com (Peter Newcomb) Date: Mon Jun 7 16:58:34 2004 Subject: AFs and the DPH In-Reply-To: <34352F47.D0C63C2A@technologist.com> (message from Paul Prescod on Fri, 03 Oct 1997 13:45:43 -0400) Message-ID: <199710032023.QAA04467@exocomp.techno.com> [Paul Prescod on Fri, 03 Oct 1997 13:45:43 -0400] > W. Eliot Kimber wrote: > ...
>> NOTE: If the architectures meta-DTD is identical to what would be the >> document's DTD if it had one (for documents without DTDs), then all >> mapping is automatic and there's no need for additional attributes in >> the instance. In other words, given a document with an explicit DTD, >> you can remove the DTD, make it an architectural meta-DTD, and get the >> same processing result. This is why I think architectures are key to >> the success of XML: it lets you eat the cake of DTD-less documents and >> still have it (because the architecture processing gives you all the >> validation and processing you need, but only when you want it and not >> when you don't). > I don't understand this. How does turning the DTD into an "architectural > DTD"* help anything? Just as in ordinary XML you can process the > (perhaps architectural) DTD if you want to validate, or skip if you > don't want to. I think that what Eliot is referring to here is the ability to hard-code support for an architecture into a processor and then still be able to use it on documents that might inherit also from other architectures, but that do not have DTDs. Thus you can re-use software that expects a certain DTD on documents that do not actually have DTDs. Eliot? >> Any abstract API (like Xapi-J) can be usefully enhanced to make getting >> architecture-specific properties easier. For example, in the work I've >> done with ADEPT*Editor, I created a set of functions to resolve >> architectural mappings--these functions could easily be provided by >> ADEPT out of the box. > Do your functions handle the mapping of attribute nodes to content, > content nodes to attributes, minimization and the other neat > transformational features of the AFDR? Do you think that the full suite > of transformations is appropriate for XML, or merely element to element > mappings?
In my own thinking I have found it hard to figure out how one > would efficiently implement those without having an entire second grove > in memory. Let's say you *did* have both groves available. Then you > could do a query in the architectural grove for elements that do not > really correspond to any particular element in the source grove. How > would you do that query *without* having both groves available? Is the > only reasonable way to implement archforms to double (or triple, or > quadruple, or....) the amount of memory taken up by groves in memory? As far as I know, it's not possible to have an architectural element without a client (source) element. The attribute and content remapping facilities are probably critical to good architecture and document design. The minimization features are likely to be useful, but are not necessary. What are the other "neat transformational features" you're talking about? As far as architectural groves requiring much more memory than the original grove is concerned, if your grove implementation is literal (i.e., each node in a grove is its own object), then of course it's true. But if you implement groves that way, you're already incurring a huge avoidable overhead even for a single SGML grove. Remember that a grove needn't actually be implemented as a grove; all that matters is that the interface to the data, however it is stored, looks like a grove. Thus, while there is likely to be some overhead required for efficient architectural grove implementations, it should be comparable or even less than the overhead incurred by object oriented systems in order to provide efficient type-querying and method-lookup for objects. 
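The point that a grove need only look like a grove through its interface can be sketched in Python. This is an illustration under stated assumptions, not how SP, GroveMinder, or the HyTime AFDR actually work: the names Element, ArchView, and form_attr are hypothetical, the architectural form is assumed to be named by a single attribute on the client element, and attribute renaming, content remapping, and minimization are all ignored.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """A node of the source (client) document. Hypothetical structure."""
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


class ArchView:
    """Grove-like view of one architecture, built only when asked for."""

    def __init__(self, form_attr: str):
        self.form_attr = form_attr  # attribute naming the architectural form (assumption)
        self._cache = {}            # id(client element) -> architectural element
        self.built = 0              # number of architectural elements built so far

    def arch_element(self, src: Element) -> Optional[Element]:
        """Architectural element for src, or None if src is not architectural."""
        form = src.attrs.get(self.form_attr)
        if form is None:
            return None
        key = id(src)
        if key not in self._cache:
            # At most one architectural element is built per client element.
            self.built += 1
            rest = {k: v for k, v in src.attrs.items() if k != self.form_attr}
            self._cache[key] = Element(form, rest)
        return self._cache[key]

    def arch_children(self, src: Element) -> list:
        """Nearest architectural descendants of src, skipping other elements."""
        out = []
        for child in src.children:
            mapped = self.arch_element(child)
            if mapped is not None:
                out.append(mapped)
            else:
                out.extend(self.arch_children(child))
        return out


doc = Element("manual", {"book": "document"}, [
    Element("intro", {}, [Element("head", {"book": "title"})]),
    Element("sect", {"book": "section"}, [Element("p")]),
])
view = ArchView("book")
print([e.name for e in view.arch_children(doc)])  # ['title', 'section']
print(view.built)                                 # 2
```

Nothing is built until a query arrives, and only the client elements actually touched by the query get an architectural counterpart, so the extra memory tracks what is asked for instead of the size of the source document.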
> Also it seems to me that in architectural forms you can *either* get a > single architectural stream, as is the case in the output of Jade, or > you can get multiple fully constructed groves (as in GroveMinder), but I > wonder if it is possible to do stream-based architectural processing of > multiple groves at once? In other words is there a way to build > something like an ESIS that provides multiple groves at once and makes > links between them? Could you comment on what this stream would look > like? I think that it is admirable that archforms work in these two > different modes, but I might like to take advantage of the best features > of both modes at once -- access to all architectural "views" and > stream-based processing. Is this feasible? Sure... I think what this amounts to is an event interface where an element object passed as an element event, for example, includes the entire architectural element tree built from that element. This probably could be done with some not-too-extensive modifications to SP. > I think it comes back to my earlier question about emulating multiple > groves without building them. This seems feasible to me in the simple > case, but my head gets muddled when I start thinking about minimization > and content/attribute remapping. You have to build them, but you only have to build them when they're asked for. And then you only have to build the specific parts that are asked for (i.e., from specific client elements). (In SGML, this can get a little tricky, however, when you take into account #CURRENT attributes and suchlike. The XML case should be much simpler.) -peter -- Peter Newcomb TechnoTeacher, Inc. peter@petes-house.rochester.ny.us peter@techno.com http://www.petes-house.rochester.ny.us http://www.techno.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From mtbryan at sgml.u-net.com Fri Oct 3 22:52:02 1997 From: mtbryan at sgml.u-net.com (Martin Bryan) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax? (and some wishes) Message-ID: At 19:02 01/10/97 +0200, Jarle Stabell wrote: >I think the DTDs-as-instances also benefits new users, why should they have to learn two syntaxes instead of one? Because DTDs have special rules associated with them, such as those relating to case sensitivity, keywords, names, etc. One advantage of having a different syntax is that it is easier to remember that these rules apply when you are using that syntax, but do not apply when entering XML coded data. ---- Martin Bryan, The SGML Centre, Churchdown, Glos. GL3 2PU, UK Phone/Fax: +44 1452 714029 E-mail: mtbryan@sgml.u-net.com For more information about The SGML Centre contact http://www.u-net.com/~sgml/ From liamquin at interlog.com Fri Oct 3 23:38:30 1997 From: liamquin at interlog.com (Liam Quin) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax?
(and some wishes) In-Reply-To: Message-ID: On Fri, 3 Oct 1997, Martin Bryan wrote: > At 19:02 01/10/97 +0200, Jarle Stabell wrote: >> I think the DTDs-as-instances also benefits new users, why should they have >> to learn two syntaxes instead of one? > Because DTDs have special rules associated with them, such as those relating > to case sensitivity, keywords, names, etc. One advantage of having a > different syntax is that it is easier to remember that these rules apply > when you are using that syntax, but do not apply when entering XML coded data. This doesn't follow at all. First, instances are just as case sensitive as DTDs. Second, there are keywords and everywhere... In fact, the distinction is entirely bogus. The question should be, which syntax is more effective. I don't see either as ideal. Charles said that different sorts of information should have different syntaxes, but in the document a chapter title has the same syntax as an author's name, a date, an image... and these things may all have different consumers. If neither syntax is ideal, having only one of them is a definite advantage. Lee -- Liam Quin -- the barefoot typographer -- Toronto lq-text: freely available Unix text retrieval email address: l i a m q u i n, at host: i n t e r l o g dot c o m From peter at techno.com Sat Oct 4 00:20:40 1997 From: peter at techno.com (Peter Newcomb) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax?
(and some wishes) In-Reply-To: (message from Liam Quin on Fri, 3 Oct 1997 17:38:17 -0400 (EDT)) Message-ID: <199710032215.SAA04585@exocomp.techno.com> [Liam Quin on Fri, 3 Oct 1997 17:38:17 -0400 (EDT)] > On Fri, 3 Oct 1997, Martin Bryan wrote: >> At 19:02 01/10/97 +0200, Jarle Stabell wrote: >>> I think the DTDs-as-instances also benefits new users, why should they have >>> to learn two syntaxes instead of one? >> Because DTDs have special rules associated with them, such as those relating >> to case sensitivity, keywords, names, etc. One advantage of having a >> different syntax is that it is easier to remember that these rules apply >> when you are using that syntax, but do not apply when entering XML coded data. > Charles said that different sorts of information should have different > syntaxes, but in the document a chapter title has the same syntax as an > author's name, a date, an image... and these things may all have different > consumers. Don't forget that full SGML also includes ways of customizing syntax for particular applications: SHORTREF. Instance syntax is good because you always _can_ use it; that doesn't necessarily mean that it's the best syntax to use for any given purpose. Short references at least attempt to fill the void between one-syntax-fits-all and different-syntaxes-for-different-things. -peter -- Peter Newcomb TechnoTeacher, Inc. peter@petes-house.rochester.ny.us peter@techno.com http://www.petes-house.rochester.ny.us http://www.techno.com xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sat Oct 4 01:42:56 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:34 2004 Subject: XML-Data: advantages over DTD syntax? (and some wishes) References: Message-ID: <3434DA98.D0AD53A4@technologist.com> Liam Quin wrote: > Charles said that different sorts of information should have different > syntaxes, but in the document a chapter title has the same syntax as an > author's name, a date, an image... and these things may all have different > consumers. First, SGML has features that allow a document to define a different syntax. Second, it allows a subsection of a document to use a different syntax. So the SGML standard recognizes that variant syntaxes are important and useful and tries to help. Unfortunately, these features are not powerful enough to represent the full DTD syntax. That does not mean that the principle of choosing the best syntax for the information should be abandoned. Nobody seems to advocate that XSL should require JavaScript code to be encoded as SGML elements or that DSSSL should do the same with expression language code. > If neither syntax is ideal, having only one of them is a definite advantage. We live in an imperfect world. I have never used a language with an ideal syntax. Perhaps binary.... The relevant question is *which syntax is better*? I think that from a reader's and writer's point of view, the current DTD syntax is better. It is more readable and compact. From a *programmer's* point of view, the new syntax is better because it allows you to reduce the number of parsers in your system. Paul Prescod xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jjc at jclark.com Sat Oct 4 06:24:37 1997 From: jjc at jclark.com (James Clark) Date: Mon Jun 7 16:58:34 2004 Subject: AFs and the DPH References: <3.0.1.16.19971002094027.325769ee@pop3.demon.co.uk> <343356A7.76CF@isogen.com> <34352F47.D0C63C2A@technologist.com> Message-ID: <3435BD28.95424A8C@jclark.com> Paul Prescod wrote: > Also it seems to me that in architectural forms you can *either* get a > single architectural stream, as is the case in the output of Jade, or > you can get multiple fully constructed groves (as in GroveMinder), but I > wonder if it is possible to do stream-based architectural processing of > multiple groves at once? In other words is there a way to build > something like an ESIS that provides multiple groves at once and makes > links between them? Could you comment on what this stream would look > like? I think that it is admirable that archforms work in these two > different modes, but I might like to take advantage of the best features > of both modes at once -- access to all architectural "views" and > stream-based processing. Is this feasible? Yes, it's feasible, and the API to SP's architecture engine (ArcEngine.h) already supports it. You provide an ArcDirector that returns an event handler for each "view" you are interested in (including the original document). 
For each potentially architectural event (e.g. start element, data, end element), the architecture engine will make a call on the event handler for each architecture for which that event is architectural, followed by a call on the event handler for the original document, thus potentially allowing the event handler for the original document access to all architectural views. This approach depends on the fact that an architecture engine never creates more than one element in the architectural document for each element in the original document.

James

From ricko at allette.com.au Sat Oct 4 10:02:25 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:34 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
Message-ID: <199710040807.SAA15276@jawa.chilli.net.au>

> From: Paul Prescod
> The relevant question is *which syntax is better*? I think that from a
> reader's and writer's point of view, the current DTD syntax is better.
> It is more readable and compact. From a *programmer's* point of view, the
> new syntax is better because it allows you to reduce the number of
> parsers in your system.

But the proposed syntax in XML-data has a different and extended functionality compared to XML's declaration syntax. So rather than having two (notional) simple parsers, you have a single parser that is more complicated.

Btw, in XML-data can the element type-declarations be subtyped just like other element types, or are they treated as special?

Rick Jelliffe

From digitome at iol.ie Sat Oct 4 10:50:32 1997
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:58:35 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
Message-ID: <199710040850.JAA16624@GPO.iol.ie>

[Paul Prescod]
>
>The relevant question is *which syntax is better*? I think that from a
>reader's and writer's point of view, the current DTD syntax is better.
>It is more readable and compact. From a *programmer's* point of view, the
>new syntax is better because it allows you to reduce the number of
>parsers in your system.
>

Yes it reduces the number of parsers "a good thing (tm)" but isn't the big win of the XML-Data approach the extra miles per gallon that accrue to all the XML application software?

All of a sudden it becomes possible to typeset documentation of schemata by processing them with straight XML typesetting tools.

It becomes possible to load schemata into XML databases as first class citizens. XML greppers can grep 'em. XML web harvesters can harvest 'em.

It becomes possible to contemplate a schema derivation mechanism based on using XLL to "cherry-pick" from a collection of existing schemata.

It becomes possible to contemplate using an XML to XML transformation system to transform schemata and then auto-generate the instance transformation.

[Insert thousands of other possibilities here].

For me the big win is the simplification it could bring to base XML application development and the sheer intellectual appeal of it. It is a very computer science-ish, Lisp-ish, Dame Ada Lovelace-type, KISS way of looking at things. A grand unifying theory of a sort.
XML-Data is very much in tune with some of XML's design goals (ease of implementation, relative unimportance of minimisation etc.). However it sits uneasily with the all-important "XML is SGML" criterion. For this very good reason DTD syntax must be kept.

However, if developers implemented DTDs via a transformation to XML-Data.....

The XML world would then have the freedom to create new and better syntaxes for schemata safe in the knowledge that today's tools that can process XML will process these new syntactic sugars via transformation to base XML - the mother of all syntaxes.

Sean Mc Grath
sean@digitome.com
Digitome Electronic Publishing http://www.digitome.com

From digitome at iol.ie Sat Oct 4 10:53:06 1997
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:58:35 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
Message-ID: <199710040852.JAA17031@GPO.iol.ie>

[Rick Jelliffe]
>
>But the proposed syntax in XML-data has a different and extended
>functionality compared to XML's declaration syntax. So rather than having
>two (notional) simple parsers, you have a single parser that is more
>complicated.

Sorry, I don't understand this. Why is the parser more complicated?

Sean Mc Grath
sean@digitome.com
Digitome Electronic Publishing http://www.digitome.com

From ricko at allette.com.au Sat Oct 4 11:28:33 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:35 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
Message-ID: <199710040933.TAA17338@jawa.chilli.net.au>

> From: Sean Mc Grath
> [Rick Jelliffe]
> >
> >But the proposed syntax in XML-data has a different and extended
> >functionality compared to XML's declaration syntax. So rather than having
> >two (notional) simple parsers, you have a single parser that is more
> >complicated.
>
> Sorry, I don't understand this. Why is the parser more complicated?

I am assuming that the XML-data parser makes use of the things declared in some way. So it is more complicated than just a simple XML instance parser that only looks at well-formedness.

Rick Jelliffe

From papresco at technologist.com Sat Oct 4 16:08:02 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:35 2004
Subject: XML-Data: advantages over DTD syntax?
(and some wishes)
References: <199710040850.JAA16624@GPO.iol.ie>
Message-ID: <3435A55E.5E345F7@technologist.com>

Sean Mc Grath wrote:
> Yes it reduces the number of parsers "a good thing (tm)" but isn't the big
> win of the XML-Data approach the extra miles per gallon that
> accrue to all the XML application software?

To all of the XML application software that cannot parse DTDs. It still comes back to saving software developers from processing two formats.

> All of a sudden it becomes possible to typeset documentation
> of schemata by processing them with straight XML typesetting tools.

You can do this with SGML. Check out Earl Hood's software. Check out Near&Far. Yes, this is *easier* with XML-Data, because you only need one parser instead of two. Nobody has ever disputed that.

> It becomes possible to load schemata into XML databases as first class
> citizens. XML greppers can grep 'em. XML web harvesters can harvest
> 'em.

You can do this with SGML documents. SGML DTDs are represented in a database by a specialized form of grove created by the parser and grove builder. XML-DATA merely adds a level of processing. The XML-Parser must build an XML grove which is then transformed into the specialized grove.

> It becomes possible to contemplate a schema derivation mechanism
> based on using XLL to "cherry-pick" from a collection of existing schemata.

There are proposals for SGML schema derivations. The question is only whether they use XLL or not. This is fundamentally the same question as whether XML should use instance syntax.

> It becomes possible to contemplate using an XML to XML transformation
> system to transform schemata and then auto-generate the instance
> transformation.

I don't know what you mean here, but I confidently predict it can be done with SGML DTDs too. If you use a decent parser like SP, the grove that is built provides access to all of the information in the DTD, in a grove that is optimized for DTD navigation (and thus transformation).
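
As a rough illustration of the instance-syntax idea under discussion: a schema held as ordinary XML can be read with a generic XML parser and turned back into DTD declarations. This is only a sketch under stated assumptions. The `<schema>`, `<element>` and `<attribute>` vocabulary and the `to_dtd` helper below are invented for illustration; they are not the XML-Data syntax and not anyone's actual tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance-syntax schema (made-up vocabulary, NOT XML-Data).
SCHEMA = """<schema>
  <element name="pet" content="(kitten | puppy)*">
    <attribute name="owner" type="CDATA" default="#IMPLIED"/>
  </element>
  <element name="kitten" content="(#PCDATA)"/>
  <element name="puppy" content="(#PCDATA)"/>
</schema>"""


def to_dtd(schema_text):
    """Turn the hypothetical schema document into DTD declaration text."""
    root = ET.fromstring(schema_text)
    lines = []
    for el in root.findall("element"):
        name = el.get("name")
        # One <!ELEMENT ...> declaration per <element> in the schema.
        lines.append("<!ELEMENT %s %s>" % (name, el.get("content")))
        # One <!ATTLIST ...> declaration per nested <attribute>.
        for att in el.findall("attribute"):
            lines.append(
                "<!ATTLIST %s %s %s %s>"
                % (name, att.get("name"), att.get("type"), att.get("default"))
            )
    return "\n".join(lines)


dtd = to_dtd(SCHEMA)
print(dtd)
```

The same parsed tree could equally be searched or typeset with generic XML tools, which is the benefit being claimed for instance syntax; the converter only shows that going back to declaration syntax for compatibility is mechanical.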
> For me the big win is the simplification it could bring to base XML
> application development and the sheer intellectual appeal of it.
> It is a very computer science-ish, Lisp-ish, Dame Ada Lovelace-type,
> KISS way of looking at things. A grand unifying theory of a sort.

I share this interest in unifying concepts. I've done instance syntaxes for DTDs myself in the past. I have no problem with the concept; it's an obvious one... I just don't think it needs to be, nor should be, the *standard* way for creating DTDs. Users should have the option of the optimized syntax.

> The XML world would then have the freedom to create
> new and better syntaxes for schemata safe in the knowledge that
> today's tools that can process XML will process these new syntactic
> sugars via transformation to base XML - the mother of all syntaxes.

Sean, we have always had this freedom. Dozens of different projects have taken advantage of it in the past. You are re-describing the current situation in SGML. We can experiment with instance syntax and convert to DTD syntax for compatibility. Making such a converter is about a day's work.

Paul Prescod

From peter at techno.com Sat Oct 4 16:10:49 1997
From: peter at techno.com (Peter Newcomb)
Date: Mon Jun 7 16:58:35 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
In-Reply-To: <199710040850.JAA16624@GPO.iol.ie> (digitome@iol.ie)
Message-ID: <199710041356.JAA05561@exocomp.techno.com>

[Sean Mc Grath on Sat, 04 Oct 1997 09:22:25 +0100]
> However, if developers implemented DTDs via a transformation to XML-Data.....
> The XML world would then have the freedom to create
> new and better syntaxes for schemata safe in the knowledge that
> today's tools that can process XML will process these new syntactic
> sugars via transformation to base XML - the mother of all syntaxes.

This idea has always been attractive to me; it seems to give everyone the flexibility they want, without imposing unnecessary restrictions.

However, unless the transformations are extremely well-designed, XML DTDs generated in this way are not likely to be particularly human-readable (one of the arguments for keeping XML DTDs to begin with). I don't see this as a reason not to use this scheme, I just think that the transformations should be well thought out and well documented, and should have the readability of the generated DTDs as a requirement from the outset.

-peter
--
Peter Newcomb TechnoTeacher, Inc.
peter@petes-house.rochester.ny.us peter@techno.com
http://www.petes-house.rochester.ny.us http://www.techno.com

From ricko at allette.com.au Sat Oct 4 17:56:54 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:35 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
Message-ID: <199710041602.CAA28245@jawa.chilli.net.au>

> From: Paul Prescod
> Sean Mc Grath wrote:
> > For me the big win is the simplification it could bring to base XML
> > application development and the sheer intellectual appeal of it.
> > It is a very computer science-ish, Lisp-ish, Dame Ada Lovelace-type,
> > KISS way of looking at things. A grand unifying theory of a sort.
One of the reasons LISP has failed is that it does only have one syntax, rather than a declaration syntax intertwined with a function syntax. (Of course, in LISP declarations, such as they are, are functions or special forms that return values, so this is not to criticise, and LISPs usually are weakly typed or untyped.)

C (and to a lesser extent C++), on the other hand, has different syntaxes for declarations and functions-- int a; and x = fopen(fp); are different syntaxes. There is no a=integer(); syntax. Even casting is a separate syntax. This difference in syntax is perhaps one of the reasons for the success of C over LISP.

Of course, C++ allows member functions for a lot of these operations and brings them into the same syntax as other functions. But at the same time C++ also introduces different syntaxes for IO operations (i.e. >>) which I think seek to emphasize the structure of the text being IOed--in other words even in C++ the designers think that KISS is a goal rather than a principle.

I do not like claims that some things are "computer science-ish" because computer science, on its own rationale, is an attempt to gain scientific understanding in the domain of computing. While these attempts will have cultural forms, it is a spurious claim to CS's authority for anyone to say a syntax is more "computer science-ish". Which is the most computer science-ish: Boolean logic, PROLOG, assembler, C, Eiffel or OmniMark?

Computer scientists in universities tend to produce small elegant languages because that is all their modest budgets and limited problem domains allow.

Rick Jelliffe

From papresco at technologist.com Sat Oct 4 18:34:48 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:35 2004
Subject: XML-Data: advantages over DTD syntax? (and some wishes)
References: <199710041602.CAA28245@jawa.chilli.net.au>
Message-ID: <3435C7C7.ABAC342D@technologist.com>

Rick Jelliffe wrote:
> Computer scientists in universities tend to produce small elegant languages
> because that is all their modest budgets and limited problem domains allow.

Now we're getting off topic, but that is completely incorrect. Computer scientists produce small languages because small languages present opportunities for reasoning, proof and analysis that are thwarted by large languages. SGML is the perfect example of a language that can be very difficult to prove things about because of (e.g.) minimization.

You are correct that being clean from a computer science point of view is only one aspect of a language. SGML threw some of that away with minimization, but made major gains in usability. SGML's grove model is useful in that it unifies DTDs and documents at the "application level" without requiring an identical syntax. That is why it is unconvincing to claim that XML instance syntax for DTDs allows new forms of processing. At the application level, DTDs and instances have been "the same" for quite a while. All that is to be saved is a parser. It all comes back to reducing programmer effort -- which isn't unimportant but is also not the be-all and end-all.

Paul Prescod

From papresco at technologist.com Sat Oct 4 21:21:16 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:35 2004
Subject: AFs and the DPH
References: <199710032023.QAA04467@exocomp.techno.com>
Message-ID: <3435EAAA.9E3719DD@technologist.com>

Peter Newcomb wrote:
> As far as I know, it's not possible to have an architectural element
> without a client (source) element.

Okay, I interpreted something Eliot said to mean that given an architectural DTD like this:

The existence of OUTER and INNER architectural elements could force the existence of an INTRODUCED1 architectural element as if those elements were directly parsed. The idea seemed bizarre at the time so I should be happy it isn't there. It seemed as if it would be useful for "wrapping" one element in another. Architectural forms make more sense as an application-level inheritance mechanism without it.

Paul Prescod

From peter at ursus.demon.co.uk Sat Oct 4 21:47:32 1997
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:35 2004
Subject: Inheritance/defaulting of attributes
In-Reply-To: <3435BD28.95424A8C@jclark.com>
References: <3.0.1.16.19971002094027.325769ee@pop3.demon.co.uk> <343356A7.76CF@isogen.com> <34352F47.D0C63C2A@technologist.com>
Message-ID: <3.0.1.16.19971004203549.2ef756d6@pop3.demon.co.uk>

The XLL spec (3.4) reads:

>>>Note that many of the attributes may be provided for both the parent linking element and the child locator element. If any such attribute is provided in the linking element but not in a locator element, the value provided in the linking element is to be used in processing the locator element. In other words, the attributes provided in the linking element may serve as defaults for the (possibly many) locator elements.

This requires the implementation to provide for LOCATORs (children) to 'inherit' attributes from EXTENDED (parent). In my own DTD I would like to be able to use this philosophy, as in the following simple example of a list of numbers, all of type FLOAT

 1.2
 2.3

can be abbreviated to

 1.2
 2.3

Is this seen as a sufficiently general mechanism in XML that it is worth creating a DTD-independent engine for this? If so, is there a general mechanism for indicating (in the document instance or DTD) that this operation is to be carried out? At present it is only prose in the XLL spec? It might be preferable to have a syntactic statement of this requirement - do others feel the same? Can AFs be used?
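
A minimal sketch of the defaulting rule quoted above, for anyone wanting to see what a DTD-independent engine might amount to. It is not part of XLL or JUMBO. The element names LIST and VAR are borrowed from the replies in this thread; the TYPE attribute name and the `inherit_attributes` helper are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET


def inherit_attributes(parent, names):
    """Copy each named attribute from parent to descendants that lack it.

    A value written on a child always wins over the parent's value,
    which is the override behaviour entities cannot provide.
    """
    for child in parent:
        for name in names:
            if name in parent.attrib and name not in child.attrib:
                child.set(name, parent.attrib[name])
        # Recurse so that deeper descendants default from their nearest ancestor.
        inherit_attributes(child, names)


# Hypothetical instance: the TYPE attribute name is an assumption.
doc = ET.fromstring(
    '<LIST TYPE="FLOAT"><VAR>1.2</VAR><VAR TYPE="INTEGER">2</VAR></LIST>'
)
inherit_attributes(doc, ["TYPE"])
types = [var.get("TYPE") for var in doc.findall("VAR")]
print(types)
```

The list of attribute names is passed in explicitly because, as the question above notes, nothing in the instance or DTD currently says which attributes are meant to be inherited.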
[At present I have hardcoded the XLL prose into JUMBO but I don't feel happy about it, especially if the spec changes in the future. In some circumstances I imagine that entities could achieve some simplification, but they don't allow a child attribute value to override a parent value.]

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From elm at arbortext.com Sun Oct 5 00:16:59 1997
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:58:35 2004
Subject: Inheritance/defaulting of attributes
Message-ID: <3.0.32.19971004181649.00a9c290@village.doctools.com>

Inheritance/defaulting of attributes is a quite common thing in SGML (and so presumably it will be common in XML). In fact, it's common enough that a lot of people wish that SGML offered an attribute default value of #INHERITED, as a more specific form of #IMPLIED. In some DTDs, I've defined a parameter entity %inherited; that resolves to #IMPLIED, as a kind of self-documentation of intent.

I think it might get a little tricky, though, to nail down precisely what the defaulting behavior should be if it's formalized in XML. When you inherit a value from an ancestor, the attribute name is assumed the same across both element types. But should they have the same declared and default types? Is it an error if they don't? In fact, what this comes down to is needing some sort of agreement over what attributes are "the same," even though they're on different element types.
(Hmm, a subject that has been broached in the namespace discussion on the SIG list recently...)

Eve

At 04:35 PM 10/4/97 -0400, Peter Murray-Rust wrote:
>The XLL spec (3.4) reads:
>
>>>>Note that many of the attributes may be provided for both the parent
>linking element and the child locator element. If any such attribute is
>provided in the linking element but not in a locator element, the value
>provided in the linking element is to be used in processing the locator
>element. In other words, the attributes provided in the linking element may
>serve as defaults for the (possibly many) locator elements.
>
>This requires the implementation to provide for LOCATORs (children) to
>'inherit' attributes from EXTENDED (parent). In my own DTD I would like to
>be able to use this philosophy, as in the following simple example of a
>list of numbers, all of type FLOAT
>
>
> 1.2
> 2.3
>
>
>can be abbreviated to
>
>
> 1.2
> 2.3
>
>
>
>Is this seen as a sufficiently general mechanism in XML that it is worth
>creating a DTD-independent engine for this? If so, is there a general
>mechanism for indicating (in the document instance or DTD) that this
>operation is to be carried out? At present it is only prose in the XLL
>spec? It might be preferable to have a syntactic statement of this
>requirement - do others feel the same? Can AFs be used?
>
>[At present I have hardcoded the XLL prose into JUMBO but I don't feel
>happy about it, especially if the spec changes in the future. In some
>circumstances I imagine that entities could achieve some simplification,
>but they don't allow a child attribute value to override a parent value.]

From ricko at allette.com.au Sun Oct 5 08:18:42 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:35 2004
Subject: Inheritance/defaulting of attributes
Message-ID: <199710050623.QAA12001@jawa.chilli.net.au>

> From: Peter Murray-Rust
> This requires the implementation to provide for LOCATORs (children) to
> 'inherit' attributes from EXTENDED (parent). In my own DTD I would like to
> be able to use this philosophy, as in the following simple example of a
> list of numbers, all of type FLOAT

> Is this seen as a sufficiently general mechanism in XML that it is worth
> creating a DTD-independent engine for this? If so, is there a general
> mechanism for indicating (in the document instance or DTD) that this
> operation is to be carried out? At present it is only prose in the XLL
> spec? It might be preferable to have a syntactic statement of this
> requirement - do others feel the same? Can AFs be used?

In your particular example, it can already be done using XLL, with only a small change in the invocation syntax.

The VAR elements have a link to the first containing parent element with a role "datatype". The LIST element has an XLL href (remapped to the name "type") that links to the appropriate type element. The TYPES container has empty elements for all the different types you care to make. An example of these is the float type, which in turn has a pointer to some defining documentation.

Perhaps what you need can best be dealt with by defining some standard values for the XLL role attribute! (See the values for role= below for candidates.) I think the standard definition of some role types (i.e.
moving "role" into more certain "architecture") would be highly useful--if no common values are defined, role will go the way of PIs and languish in unusability due to meaninglessness.

]> 1.2 2.3

Rick Jelliffe

From peter at ursus.demon.co.uk Sun Oct 5 12:37:59 1997
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:35 2004
Subject: Inheritance/defaulting of attributes
In-Reply-To: <3.0.32.19971004181649.00a9c290@village.doctools.com>
Message-ID: <3.0.1.16.19971005112008.2ef76b34@pop3.demon.co.uk>

Thanks Eve,

At 18:16 04/10/97 -0400, Eve L. Maler wrote:
>Inheritance/defaulting of attributes is a quite common thing in SGML (and
>so presumably it will be common in XML). In fact, it's common enough that
>a lot of people wish that SGML offered an attribute default value of
>#INHERITED, as a more specific form of #IMPLIED. In some DTDs, I've
>defined a parameter entity %inherited; that resolves to #IMPLIED, as a kind
>of self-documentation of intent.
>
>I think it might get a little tricky, though, to nail down precisely what
>the defaulting behavior should be if it's formalized in XML. When you
>inherit a value from an ancestor, the attribute name is assumed the same
>across both element types. But should they have the same declared and

Yes - I agree. In XLL EXTENDED has (essentially) the same attributes as its child LOCATOR. I assume this is so that LOCATOR can inherit them from EXTENDED, rather than that EXTENDED actually needs them itself?
[Please disabuse me if not - but I couldn't see an *obvious* separate use for, say, ACTUATE on EXTENDED other than to pass it to its children.]

My own case is worse in that I have defined since its role is simply to act as a container. Therefore *with a DTD* it would have to anticipate all the attributes of its children and those that it might wish to pass on to them. Without a DTD [more accurately, without wishing to validate the whole document against a single DTD] this is less of a restriction, but there still has to be a non-prose mechanism to pass the attributes on to the children. (Just as we have XML-ATTRIBUTE for mapping, we could ask the ERB for some inheritance mechanism :-)

>default types? Is it an error if they don't? In fact, what this comes
>down to is needing some sort of agreement over what attributes are "the
>same," even though they're on different element types. (Hmm, a subject
>that has been broached in the namespace discussion on the SIG list
>recently...)

An alternative - which I have only just read - could be the mechanism in the RDF spec (I assume this is public now and can be discussed on this list?). Since one can define a relationship between nodes, it would be possible for this list to suggest (or define) such relationships. I am not sure how generic this is, or whether it has to be done for each instance...

P.

Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From peter at ursus.demon.co.uk Sun Oct 5 20:54:11 1997
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:35 2004
Subject: Inheritance/defaulting of attributes
Message-ID: <3.0.1.16.19971005190922.383f289c@pop3.demon.co.uk>

[... original posting from PeterMR remailed...]
>
>Rick,
> That's very clever! It's an attractive, generic way to tackle the problem.
> Its downside is exactly what you point out.
>
>At 16:19 05/10/97 +1000, you wrote:
>[...]
>
>>Perhaps what you need can best be dealt with by defining some
>>standard values for the XLL role attribute! (See the values for
>>role= below for candidates.) I think the standard definition
>>of some role types (i.e. moving "role" into more certain
>>"architecture") would be highly useful--if no common values are
>>defined, role will go the way of PIs and languish in unusability
>>due to meaninglessness.
>>
>[... clever example snipped ...]
>
>I agree strongly with these sentiments. XML and XLL are both defined in a very abstract manner. No problem with this in principle, but it must be complemented with:
> - tutorials
> - examples
> - software
>and as we both agree
> - common methods of tackling frequent problems.
>
>I agree with you that unless PIs and roles are addressed very soon, they will become useless. Their use is totally unclear to anyone from outside the SGML community and so newcomers won't use them - just as they don't use REL in HTML.
>
>As a webhacker I don't have the insight into the tools and tricks of the current SGML community :-) but appreciate what can be done with examples like yours and AFs.
However unless there is a consistent approach to promoting *how XML can be best used* we are likely to suffer from fragmentation and inconsistency. > >I have consistently asked for the developers to show *how* X*L may be used and *what software is required to make it work that way*. Your example shows two levels of indirection using SIMPLE links with default attribute values. The software has to be able to pick up that, although the attributes are [SHOW="EMBED" ACTUATE="USER"], the nodes shouldn't have clickable buttons. This is presumably governed by the values of the roles. IOW with each of your roles, quite a lot of software has to be written. > >The problem with your solution is that it is completely impossible to 'sell' to my community [at present]! (I'm not singling you out - it's true of AFs and all the clever stuff that can be done with remapping, indirection, etc.} I'm probably in a minority, but I'm still clinging to the idea that 'XML is simple'. > >So let me make a plea for those on this list who understand this to suggest some concrete aspects for X*L. Unlike the XML-SIG and ERB (who have to produce a spec and who are clear that this must be as general as possible), we have the opportunity to make concrete suggestions about the best ways that XML might be used. This doesn't stop anyone doing other things differently, but it might stop us ending in a mess. We *have* been able to propose an API for XML (no one is *forced* to use it) and this would seem to have the same importance. > >If nothing is done, here are some scenarios... > - the browser manufacturers produce tools that ipso facto define how X*L is to be used. Different offerings will almost certainly be incompatible, terminology will clash and the power of the language will be lost. > - SGML vendors will produce complex tools and architectures and tell the users that they are easy to learn and use. My experience is that they won't be. 
> - a few applications (RDF will probably be the first) will capture the world's imagination. People will start hacking without understanding XML, just as with HTML. [BTW I approve of the RDF spec - I assume it's now OK to discuss it? I assume that my current example could also have been done in RDF.] > - lots of people will produce XML applications in limited domains which will be largely incompatible in their philosophy. > >X*L is vastly more powerful than HTML and significantly more difficult. I suggest the following as an analogy. When I learnt FORTRAN there were an infinite number of ways of writing programs, many very bad. With C, there was a bible, but still a fragmentation of style. Slowly, supporting libraries became available which helped to support some of the common ways of doing things. With C++ there was a greater emphasis on common style but (big mistake) no libraries. It was difficult in the early days to write portable applications. With Java, there is a strong consensus of style - not forced, but voluntarily accepted - so it's fairly straightforward to read someone else's code. The delight - for me - is the enormous library of classes that comes with it. I no longer have to implement things like dates, reading files, hacking images, etc. but I can also be sure that what I do interfaces directly with other people. > >We need the equivalent philosophy for X*L. At present it's abstract. It can be implemented in many different ways. The right time to tackle this problem is before there are concrete applications out there - it seems an appropriate role for XML-DEV to suggest best ways of doing things. > > P. > > Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ht at cogsci.ed.ac.uk Mon Oct 6 10:45:46 1997 From: ht at cogsci.ed.ac.uk (Henry S. Thompson) Date: Mon Jun 7 16:58:35 2004 Subject: XML-Data: advantages over DTD syntax? (and some wishes) In-Reply-To: Peter Newcomb's message of Sat, 4 Oct 1997 09:56:46 -0400 References: <199710041356.JAA05561@exocomp.techno.com> Message-ID: <2079.199710060845@grogan.cogsci.ed.ac.uk> Peter writes: > [Sean Mc Grath on Sat, 04 Oct 1997 09:22:25 +0100] > > However, if developers implemented DTDs via a transformation to XML-Data..... > > > The XML world would then have the freedom to create > > new and better syntaxes for schemata safe in the knowledge that > > todays tools that can process XML will process these new syntactic > > sugars via transformation to base XML - the mother of all syntaxes. > > This idea has always been attractive to me; it seems to give everyone > the flexibility they want, without imposing unnecessary restrictions. > > However, unless the transformations are extremely well-designed, XML > DTDs generated in this way are not likely to be particularly > human-readable (one of the arguments for keeping XML DTDs to begin > with). > > I don't see this as a reason not to use this scheme, I just think that > the transformations should be well thought out and well documented, > and should have the readability of the generated DTDs as a requirement > from the outset. Speaking only for myself, I strongly endorse this view, am committed to it myself, and believe it is the best recipe for sanity available during what I see as the experimental phase of schema development we are now embarking on. ht xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From drodrigues at synervisiondpi.com Mon Oct 6 16:01:43 1997 From: drodrigues at synervisiondpi.com (Declan Rodrigues) Date: Mon Jun 7 16:58:36 2004 Subject: Unsubscribe Message-ID: <3438EF37.BC523659@synervisiondpi.com> unsubscribe -------------- next part -------------- A non-text attachment was scrubbed... Name: vcard.vcf Type: text/x-vcard Size: 421 bytes Desc: Card for Declan Rodrigues Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19971006/4457ec63/vcard.vcf From sylee at hanul.edi.co.kr Mon Oct 6 17:05:26 1997 From: sylee at hanul.edi.co.kr (sang yeol Lee) Date: Mon Jun 7 16:58:36 2004 Subject: Unsubscribe Message-ID: <3439365B.C4E4D406@hanul.edi.co.kr> unsubscribe xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Oct 6 19:06:52 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:36 2004 Subject: Inheritance/defaulting of attributes References: <3.0.1.16.19971002094027.325769ee@pop3.demon.co.uk> <343356A7.76CF@isogen.com> <34352F47.D0C63C2A@technologist.com> <3.0.1.16.19971004203549.2ef756d6@pop3.demon.co.uk> Message-ID: <3435F9DF.16FE49@technologist.com> Peter Murray-Rust wrote: > Is this seen as a sufficiently general mechanism in XML that it is worth > creating a DTD-independent engine for this? 
This semantic can be expressed in the stylesheet language:

    (element p
      (make paragraph
        font-color: (inherited-attribute-string "COLOR")))

I think that this is valid DSSSL code. I don't see much benefit in moving this semantic into the markup language itself. There are others I would add before this.

 Paul Prescod

From papresco at technologist.com Mon Oct 6 19:07:27 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:36 2004
Subject: XML Support in IE 4.0
Message-ID: <3436FB7D.D3967C33@technologist.com>

"Microsoft, ArborText and The Wall Street Journal Interactive Edition teamed up today to show the XML support available in Internet Explorer 4.0. In a keynote demonstration, actual data from The Wall Street Journal Interactive Edition, delivered using ArborText's ADEPT Editor software, was shown on Microsoft Internet Explorer 4.0."

Could somebody please describe what was shown. Does IE 4.0 allow the display of XML documents?

 Paul Prescod

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jwrobie at mindspring.com Mon Oct 6 19:15:51 1997 From: jwrobie at mindspring.com (Jonathan Robie) Date: Mon Jun 7 16:58:36 2004 Subject: XML Support in IE 4.0 Message-ID: <1.5.4.32.19971006171523.00a164e8@pop.mindspring.com> At 10:29 PM 10/4/97 -0400, Paul Prescod wrote: >"Microsoft, ArborText and The Wall Street Journal Interactive Edition >teamed up today to show the XML support available in Internet Explorer >4.0. In a keynote demonstration, actual data from The Wall Street >Journal Interactive Edition, delivered using ArborText's ADEPT Editor >software, was shown on Microsoft Internet Explorer 4.0." > >Could somebody please describe what was shown. Does IE 4.0 allow the >display of XML documents? It supports CDF, which is implemented using XML, and which is what it uses to deliver things like The Wall Street Journal. I do not believe that it has any support for vanilla XML. Jonathan *************************************************************************** Jonathan Robie jwrobie@mindspring.com http://www.mindspring.com/~jwrobie POET Software, 3207 Gibson Road, Durham, N.C., 27703 http://www.poet.com *************************************************************************** xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tbray at textuality.com Mon Oct 6 19:17:41 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:58:36 2004
Subject: XML Support in IE 4.0
Message-ID: <3.0.32.19971006101416.008493d0@pop.intergate.bc.ca>

At 10:29 PM 04/10/97 -0400, Paul Prescod wrote:
>Could somebody please describe what was shown. Does IE 4.0 allow the
>display of XML documents?

My understanding is that IE 4 has

1. CDF processing (it's XML)
2. a couple of XML parsers, one in C, one in Java - MSXML, for which they'll provide source
3. an XML object model API to parsed XML docs
4. an XML Data Source Object. DSOs are things in MS's "Dynamic HTML" implementation whereby you can drive an HTML display with a data source on the client. So the idea would be that you could send some XML to the client and energize a portion of the display with it.

It should be noted that the Netscape and Microsoft implementations of "Dynamic HTML" are violently incompatible, which is why there is a W3C DOM activity that is about to start emitting specs giving a portable way to do the same things.

Anyhow, IE4 doesn't have direct/native XML display yet. However, given the fact that MS co-authored the XSL proposal, I think we can assume that the commercial browser marketplace is hip to the idea that displaying XML would be A Good Thing.

 -Tim

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From michaele at MICROSOFT.com Mon Oct 6 19:29:20 1997 From: michaele at MICROSOFT.com (Michael Edwards) Date: Mon Jun 7 16:58:36 2004 Subject: XML Support in IE 4.0 Message-ID: I believe the demo utilized the XML Data Source Object (DSO) that ships with IE4. A DSO is an object that provides data which can be bound to HTML elements on a Web page through scripting. The XML DSO is a Java applet in the com.ms.xml.dso.XMLDSO.class (shipped with IE4, or as part of the final Microsoft 2.0 Java SDK which will release imminently). You can read more about this in the Internet Client SDK on MSDN Library Online: http://www.microsoft.com/msdn/sdk/inetsdk/help/dhtml/databind.htm#ch_dat abind_xml_intro > ---------- > From: Paul Prescod[SMTP:papresco@technologist.com] > Reply To: Paul Prescod > Sent: Saturday, October 04, 1997 7:29 PM > To: xml-dev > Subject: XML Support in IE 4.0 > > "Microsoft, ArborText and The Wall Street Journal Interactive Edition > teamed up today to show the XML support available in Internet Explorer > 4.0. In a keynote demonstration, actual data from The Wall Street > Journal Interactive Edition, delivered using ArborText's ADEPT Editor > software, was shown on Microsoft Internet Explorer 4.0." > > Could somebody please describe what was shown. Does IE 4.0 allow the > display of XML documents? > > Paul Prescod > > > xml-dev: A list for W3C XML Developers. 
To post,
> mailto:xml-dev@ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following
> message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
>

From andrewl at microsoft.com Mon Oct 6 19:41:36 1997
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 16:58:36 2004
Subject: XML Support in IE 4.0
Message-ID: <7BB61B44F197D011892800805FD4F792018006C9@RED-03-MSG.dns.microsoft.com>

Internet Explorer 4.0 ships with a high-performance XML parser DLL written in C++ and a parser written in Java. Both parse any well-formed XML. IE4 also ships with an XML data source control (Java) which reads XML and binds it into an HTML page. I believe that both Java items include full source code.

The Wall Street Journal demonstration was not based on CDF but on the Java XML parser and data source control.

--Andrew Layman
  AndrewL@microsoft.com

> -----Original Message-----
> From: Jonathan Robie [SMTP:jwrobie@mindspring.com]
> Sent: Monday, October 06, 1997 10:15 AM
> To: Paul Prescod
> Cc: xml-dev
> Subject: Re: XML Support in IE 4.0
>
> At 10:29 PM 10/4/97 -0400, Paul Prescod wrote:
> >"Microsoft, ArborText and The Wall Street Journal Interactive Edition
> >teamed up today to show the XML support available in Internet
> Explorer
> >4.0.
In a keynote demonstration, actual data from The Wall Street > >Journal Interactive Edition, delivered using ArborText's ADEPT Editor > >software, was shown on Microsoft Internet Explorer 4.0." > > > >Could somebody please describe what was shown. Does IE 4.0 allow the > >display of XML documents? > > It supports CDF, which is implemented using XML, and which is what it > uses > to deliver things like The Wall Street Journal. I do not believe that > it has > any support for vanilla XML. > > Jonathan > > ********************************************************************** > ***** > Jonathan Robie jwrobie@mindspring.com > http://www.mindspring.com/~jwrobie > POET Software, 3207 Gibson Road, Durham, N.C., 27703 > http://www.poet.com > ********************************************************************** > ***** > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jwrobie at mindspring.com Mon Oct 6 19:50:03 1997 From: jwrobie at mindspring.com (Jonathan Robie) Date: Mon Jun 7 16:58:36 2004 Subject: XML Support in IE 4.0 Message-ID: <1.5.4.32.19971006174926.00a05eb0@pop.mindspring.com> At 10:40 AM 10/6/97 -0700, Andrew Layman wrote: >Internet Explorer 4.0 ships with a high-performance XML parser DLL >written in C++ and a parser written in Java. 
Both parse any well-fromed >XML. IE4 also ships with an XML data source control (Java) which reads >XML and binds it into an HTML page. I believe that both Java items >include full source code. Is there any way to display generic XML using Internet Explorer 4.0? If not, what are the missing links? Jonathan *************************************************************************** Jonathan Robie jwrobie@mindspring.com http://www.mindspring.com/~jwrobie POET Software, 3207 Gibson Road, Durham, N.C., 27703 http://www.poet.com *************************************************************************** xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Mon Oct 6 19:59:14 1997 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 16:58:36 2004 Subject: XML Support in IE 4.0 Message-ID: <7BB61B44F197D011892800805FD4F792018006CB@RED-03-MSG.dns.microsoft.com> One of the important advantages of XML is that it separates data from display. Consequently, other than displaying the raw XML text, there may be no single display format appropriate for a particular document. However, one way to display XML data is to use the parser or data source control shipping with IE4 (see http://www.microsoft.com/msdn/sdk/inetsdk/help/dhtml/databind.htm#ch_dat , though you'll need IE4 to look at the samples, get it at http://www.microsoft.com/ie/download/). Another future possibility might be the XMS Style Sheet idea that the W3C is working on. This is an experimental technology that appears to have promise. 
--Andrew Layman AndrewL@microsoft.com > -----Original Message----- > From: Jonathan Robie [SMTP:jwrobie@mindspring.com] > Sent: Monday, October 06, 1997 10:49 AM > To: Andrew Layman > Cc: xml-dev > Subject: RE: XML Support in IE 4.0 > > At 10:40 AM 10/6/97 -0700, Andrew Layman wrote: > >Internet Explorer 4.0 ships with a high-performance XML parser DLL > >written in C++ and a parser written in Java. Both parse any > well-fromed > >XML. IE4 also ships with an XML data source control (Java) which > reads > >XML and binds it into an HTML page. I believe that both Java items > >include full source code. > > Is there any way to display generic XML using Internet Explorer 4.0? > If not, > what are the missing links? > > Jonathan > > ********************************************************************** > ***** > Jonathan Robie jwrobie@mindspring.com > http://www.mindspring.com/~jwrobie > POET Software, 3207 Gibson Road, Durham, N.C., 27703 > http://www.poet.com > ********************************************************************** > ***** xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Tue Oct 7 00:16:16 1997 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 16:58:36 2004 Subject: XML Support in IE 4.0 Message-ID: <7BB61B44F197D011892800805FD4F792018006D7@RED-03-MSG.dns.microsoft.com> OOOps. Spelling error on my part. Yes: The XSL style sheets. 
--Andrew Layman AndrewL@microsoft.com > -----Original Message----- > From: Sharon Adler [SMTP:sca@eps.inso.com] > Sent: Monday, October 06, 1997 2:07 PM > To: Andrew Layman > Subject: RE: XML Support in IE 4.0 > > At 10:57 AM 10/6/97 -0700, you wrote: > >One of the important advantages of XML is that it separates data from > >display. Consequently, other than displaying the raw XML text, there > may > >be no single display format appropriate for a particular document. > >However, one way to display XML data is to use the parser or data > source > >control shipping with IE4 (see > >http://www.microsoft.com/msdn/sdk/inetsdk/help/dhtml/databind.htm#ch_ > dat > >, though you'll need IE4 to look at the samples, get it at > >http://www.microsoft.com/ie/download/). > > > >Another future possibility might be the XMS Style Sheet idea that the > >W3C is working on. This is an experimental technology that appears to > >have promise. > Andrew, > > Do you really mean XMS Style Sheet idea? If so what is it? How does > it > relate to XSL that we are all working on? > > Sharon > > > > > >--Andrew Layman > > AndrewL@microsoft.com > > > >> -----Original Message----- > >> From: Jonathan Robie [SMTP:jwrobie@mindspring.com] > >> Sent: Monday, October 06, 1997 10:49 AM > >> To: Andrew Layman > >> Cc: xml-dev > >> Subject: RE: XML Support in IE 4.0 > >> > >> At 10:40 AM 10/6/97 -0700, Andrew Layman wrote: > >> >Internet Explorer 4.0 ships with a high-performance XML parser DLL > >> >written in C++ and a parser written in Java. Both parse any > >> well-fromed > >> >XML. IE4 also ships with an XML data source control (Java) which > >> reads > >> >XML and binds it into an HTML page. I believe that both Java items > >> >include full source code. > >> > >> Is there any way to display generic XML using Internet Explorer > 4.0? > >> If not, > >> what are the missing links? 
> >> > >> Jonathan > >> > >> > ********************************************************************** > >> ***** > >> Jonathan Robie jwrobie@mindspring.com > >> http://www.mindspring.com/~jwrobie > >> POET Software, 3207 Gibson Road, Durham, N.C., 27703 > >> http://www.poet.com > >> > ********************************************************************** > >> ***** > > > >xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > >Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > >To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > >(un)subscribe xml-dev > >To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > >subscribe xml-dev-digest > >List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) > > > > xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Tue Oct 7 04:55:55 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:36 2004 Subject: XML Support in IE 4.0 References: <7BB61B44F197D011892800805FD4F792018006C9@RED-03-MSG.dns.microsoft.com> Message-ID: <3439A522.C4F85B5F@technologist.com> Andrew Layman wrote: > > Internet Explorer 4.0 ships with a high-performance XML parser DLL > written in C++ and a parser written in Java. Both parse any well-fromed > XML. IE4 also ships with an XML data source control (Java) which reads > XML and binds it into an HTML page. I believe that both Java items > include full source code. That seems like neat stuff. Can authors get access to the parser error messages to build pages that parse XML documents and validate them? 
I know that your parser's a little out of date (considering how recently the spec. has changed!) but once it is updated will this be possible? It would be nice to have a validating parser deployed as part of a Big Two browser.

 Paul Prescod

From andrewl at microsoft.com Tue Oct 7 22:36:03 1997
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun 7 16:58:36 2004
Subject: XML Support in IE 4.0
Message-ID: <7BB61B44F197D011892800805FD4F79201800702@RED-03-MSG.dns.microsoft.com>

The two parsers had very different goals. The C++ parser was built with performance as its main target. It does not include validation. The Java parser is a fully validating parser and ships with source code (and we hope to ship a new one shortly that reflects recent changes in the specifications).

--Andrew Layman
  AndrewL@microsoft.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From clovett at microsoft.com Wed Oct 8 01:54:52 1997
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun 7 16:58:36 2004
Subject: XML Support in IE 4.0
Message-ID: <41135C785691CF11B73B00805FD4D2D703C00A97@RED-17-MSG.dns.microsoft.com>

The Java SDK 2.0 final release is now available from http://www.microsoft.com/java and this includes documentation of the latest version of the Java parser that is included in IE 4.0. In particular you will want to look at the com.ms.xml.om and com.ms.xml.parser packages.

The Java parser throws an exception when it finds a problem and the exception includes information that you could display to the user. We even have a standard way of describing the error, for example, the following is a typical error you might get:

    Close tag WOOPS does not match start tag TREE
    Location: file:/d:/java/msxml/foo.xml(7,25)
    Context:

Location shows the file that contains the error, then the line number then the character position on that line, so in this example it's line 7 character position 25. The Context will list all the tags from the scope of the error out to the root of the document. In this case the error is inside the element which is inside the root element.

So yes, you could write an authoring tool using this approach.

> -----Original Message-----
> From: Andrew Layman
> Sent: Tuesday, October 07, 1997 1:36 PM
> To: xml-dev
> Subject: RE: XML Support in IE 4.0
>
> The two parsers had very different goals. The C++ parser was built
> with
> performance as its main target. It does not include validation.
The > Java parser is a fully validating parser and ships with source code > (and > we hope to ship a new one shortly that reflects recent changes in the > specifications). > > --Andrew Layman > AndrewL@microsoft.com > > > xml-dev: A list for W3C XML Developers. To post, > mailto:xml-dev@ic.ac.uk > Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ > To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; > (un)subscribe xml-dev > To subscribe to the digests, mailto:majordomo@ic.ac.uk the following > message; > subscribe xml-dev-digest > List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From andrewl at microsoft.com Wed Oct 8 02:29:53 1997 From: andrewl at microsoft.com (Andrew Layman) Date: Mon Jun 7 16:58:36 2004 Subject: Inheritance/defaulting of attributes Message-ID: <7BB61B44F197D011892800805FD4F79201800710@RED-03-MSG.dns.microsoft.com> I decided to learn about inheritance by talking to people in a number of different groups at Microsoft, ranging from products to research, from programming languages to databases to knowledge representation. I wish I had a good answer for you. I don't. Instead, I found that for every single behavior of inheritance (a) some people can give a very reasonable justification for it and (b) other people can give an equally reasonable justification for why a different behavior is needed. We are seeing that on this list also. Clearly inheritance is going to be difficult to work out. --Andrew Layman AndrewL@microsoft.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From tbray at textuality.com Wed Oct 8 02:51:48 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:58:36 2004
Subject: Inheritance/defaulting of attributes
Message-ID: <3.0.32.19971007174626.0099f8f0@pop.intergate.bc.ca>

At 05:25 PM 07/10/97 -0700, Andrew Layman wrote:
>We are seeing that on this list also. Clearly inheritance is going to
>be difficult to work out.

I'm really glad Andrew posted that. I have heard too many people, not just on this list, saying things like

                 |-parameter entities   |
                 |-architectural forms  |
"Well, of course-<--DTDs                >--support O-O inheritance."
                 |-HyTime               |
                 |-Hypertext in general |
                 |-Document schemas     |

After years of struggle, we've got a pretty good handle on how to do some useful kinds of inheritance with reasonable safety, in the software module world.

Documents are different. To start with, it's far from clear to me that the term "inheritance," standing by itself, has a useful semantic in the context of documents.

This has a practical consequence. We should, in future, struggle for greater precision; for example, it is clear that however LOCATOR attributes depend on those in the parent EXTENDED linking element, it ain't inheritance in the O-O style.

 -Tim

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From cbullard at hiwaay.net Wed Oct 8 05:36:03 1997 From: cbullard at hiwaay.net (len bullard) Date: Mon Jun 7 16:58:36 2004 Subject: Inheritance/defaulting of attributes References: <7BB61B44F197D011892800805FD4F79201800710@RED-03-MSG.dns.microsoft.com> Message-ID: <343AFF6E.802@hiwaay.net> Andrew Layman wrote: > > I decided to learn about inheritance by talking to people in a number of > different groups at Microsoft, ranging from products to research, from > programming languages to databases to knowledge representation. > > I wish I had a good answer for you. I don't. > > Instead, I found that for every single behavior of inheritance (a) some > people can give a very reasonable justification for it and (b) other > people can give an equally reasonable justification for why a different > behavior is needed. > > We are seeing that on this list also. Clearly inheritance is going to > be difficult to work out. > > --Andrew Layman > AndrewL@microsoft.com If two groups express different requirements for inheritance, then the system should support different requirements for inheritance. Should inheritance be a facility of XML, or are inheritance requirements met else where in the design. Early in this list, object-oriented ideas were discussed and rejected(? - assumption) for XML. By analogy, SGML tools do not explicitly support or express inheritance. But object-oriented databases have been created that use SGML. Where in those designs is inheritance explicitly supported? Len Bullard Intergraph Public Safety xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Oct 8 05:57:14 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:36 2004 Subject: Inheritance/defaulting of attributes References: <3.0.32.19971007174626.0099f8f0@pop.intergate.bc.ca> Message-ID: <343B03D2.F1918B05@technologist.com> Tim Bray wrote: > Documents are different. To start with, it's far from clear to me > that the term "inheritance," standing by itself, has a useful > semantic in the context of documents. I agree. Inheritance in OOP only had meaning after it was defined. Still, if we are talking about "element type inheritance" I think the obvious meaning would be that element sub-types would "inherit" "properties" (attributes and content model) from super-types. > This has a practical consequence. We should, in future, struggle > for greater precision; for example, however it is clear that > however LOCATOR attributes depend on those in the parent EXTENDED > linking element, it ain't inheritance in the O-O style. -Tim Actually, in OOP that is called "acquisition". It is evolving a language and understanding of its own. See: Environmental Acquisition - A New Inheritance Mechanism http://www.cs.technion.ac.il/~david/Papers/Tech_Reports/lpcr9507.PS.gz in: http://www.cs.technion.ac.il/~david/publications.html So we are not in completely uncharted waters although there will certainly be some differences. Paul Prescod xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)

From papresco at technologist.com Wed Oct 8 05:58:01 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:37 2004
Subject: Subclassing and Inheritance in generic documents
Message-ID: <343B04ED.60A3416F@technologist.com>

This is something I wrote in response to clarify my own ideas about inheritance, subclassing, architectural forms and SGML documents. I hope it is useful to others and will make it into a web page if it seems useful to do so. Please ignore typos. Conversation on the document should be in XML-DEV (or in private email) since inheritance is not on the XML WG's immediate agenda (AFAIK).

Thanks to Steven Newcomb and Peter Newcomb for their individual comments on some of the ideas. This does *not* represent an expression of consensus however. :)

----

Ideas about Subclassing and Inheritance in generic documents
==============================================================

The concept of subclassing is an important one in modern object oriented software design, but the idea of "subclasses" and "superclasses" is fundamental to how we think about our world. These concepts are reinvented in DTD after DTD through parameter entities and deserve a direct expression that could be used throughout an SGML authoring and processing system.

There is a lesser notion, of inheritance, which has practical utility but is not as crucial. Inheritance allows one set of objects to get the properties of another set of objects "for free". In SGML this would probably mean that one element would "inherit" the attribute declarations and content model of another element.
SGML DTDs also use parameter entities to model inheritance. Inheritance is basically just a code saving device which allows different parts of the DTD to evolve independently.

The concept of inheritance is not as old or well established as that of subclassing and I do not think that it is as important in SGML. Inheritance is only a code saving device. Subclassing has implications that can allow us to build more interesting editors, processors and other tools. Code saving is important, but not as important as being allowed to express new kinds of type relationships naturally.

Programming language research has found that there are times when you want to subclass without inheriting and even inherit without subclassing. Early object oriented programming languages did not make a distinction. Modern programming languages give you some options in this regard. C++ allows "private" inheritance which is inheritance ("inheriting code") without subclassing (claiming to always be substitutable for instances of the parent class). Java's "interfaces" allow subclassing without inheritance. I believe that the problems that have led to this need for a separation are all related to multiple inheritance and are also applicable to markup languages. More on this later.

This essay will outline the ideas without making concrete proposals for extensions for SGML. The goal is to get people thinking and to expose the flaws in our current mechanisms, parameter entities and architectural forms.

A Straw Person Syntax
=====================

For our purposes, an element type is a subclass of another element type if it is substitutable wherever the other element type is allowed, and is declared to be a subclass. So if I have an animal element type, cat would be an appropriate subclass because any content model that needs an animal could be fulfilled with a cat. Here's an example of a syntax that would allow this:

Now there are a few important things buried in this syntax.
Subclassing
===========

The first is that the subclass must implement the base class's interface faithfully. Since animal has ANY content model, cat can have ANY content model. But cat may also have a more restricted content model. In this case it does. Cats may only have text content. Animal allows food to be any string of SGML characters, but cat restricts the set to only the name characters.

One important thing to note is that before a single document has been created, this DTD can be checked to guarantee complete conformance of element type subclasses to element type superclasses so that any element of type "cat" that is created is absolutely guaranteed to be also valid as an "animal". This means that cats are guaranteed to be substitutable for animals. So this is now a valid document:

%animal; ]>

It would be illegal for "cat" to extend "HABITAT" to type CDATA or NUMBER because there exist CDATA strings and NUMBERs that are not valid NAMEs. This is a little bit of an inversion from OOP, because in OOP a subclass must accept any "input" that a parent class can. We think of content models as "accepting input".

The reason for this inversion is that the content of an element is actually "output", not input. It is input by the human, but output by the parser. Think of each element as an object that implicitly has methods to retrieve each attribute and its content. For instance in Python I might make a rule for animals that absolutely depended on HABITAT being a name, and not random CDATA:

    def ANIMAL(FOOD, HABITAT):
        # Cdata2String and Name2String are conversion helpers assumed to exist.
        print("I eat", Cdata2String(FOOD))
        print("I live in", Name2String(HABITAT))

Now you can see why it is so important that cat not be more "flexible" about the properties it has in common with animal. This would be even more vital if the property were declared to be a number and I were going to use the language's built-in coercion tools to convert the string representation of the number to an integer.
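The DTD-level conformance check described above could look roughly like the following. This is only a sketch under stated assumptions: the `WIDER_TYPES` table is a simplified, hypothetical ordering of SGML declared value types (not taken from the standard or from the essay), the attribute names mirror the animal/cat discussion, and the function names are invented for illustration.

```python
# Hypothetical, simplified "is more permissive than" table for SGML declared
# value types. It is an assumption made for illustration only.
WIDER_TYPES = {
    "NAME": {"NMTOKEN", "CDATA"},
    "NUMBER": {"NMTOKEN", "CDATA"},
    "NMTOKEN": {"CDATA"},
    "CDATA": set(),
}


def is_restriction(sub_type, super_type):
    """True if every value of sub_type is also a valid value of super_type."""
    return sub_type == super_type or super_type in WIDER_TYPES[sub_type]


def conformance_errors(super_attrs, sub_attrs):
    """Return the shared attribute names whose subclass type is too permissive."""
    return sorted(
        name
        for name, super_type in super_attrs.items()
        if name in sub_attrs and not is_restriction(sub_attrs[name], super_type)
    )


# Attribute declarations modeled as plain dicts (hypothetical representation).
animal = {"FOOD": "CDATA", "HABITAT": "NAME"}
cat = {"FOOD": "NAME", "HABITAT": "NAME"}
loose_cat = {"FOOD": "NAME", "HABITAT": "CDATA"}

print(conformance_errors(animal, cat))        # []
print(conformance_errors(animal, loose_cat))  # ['HABITAT']
```

The point of the sketch is that the check runs on declarations alone, before any document exists: narrowing FOOD from CDATA to NAME is accepted, while widening HABITAT from NAME to CDATA is reported.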
More generally, an important reason that SGML has DTDs is so that software can depend on documents conforming to them. We cannot weaken those guarantees in adding inheritance and subclassing to SGML.

On the other hand, cat could create new attributes that would be totally ignored when it was being treated as an animal. Generally speaking, attributes seem more intrinsically amenable to concepts of subclassing than content, because they are "random access" in some sense, as are methods in OOP. Perhaps in adding OO features to SGML we will also choose to make attributes more powerful (for instance by allowing them to have content models and explicit substructure like elements).

Substitutability also works in the DTD. That means that a subclass of ZOO could have a content model that was restricted to subclasses of ANIMAL:

%zoo; ]>

The flea-circus element type is a special subclass of zoo that can only accept fleas. We say that each element type listed in a content model has a "role". When you create an SGML document, each element in the document "matches up" against some element in the DTD unambiguously. I say that it fulfills that "role". Content models in subclasses are also "matched" against content models in their superclasses. Matching element types must be subclasses of the element type that they match in the superclass, just as "FLEA" in "FLEA-CIRCUS" is a subclass of "ANIMAL" in "ZOO".

Inheritance
===========

Note also that the subclass in my proposal does not inherit anything. It re-declares a content model and an attribute list. I do not have any particular problem with a system in which things which are not redeclared in subclasses are presumed to be inherited from superclasses unchanged. But it is important to me that we understand that inheritance and subclassing are two different things and one can occur without the other.

From here on in this essay, I *will* pretend that an inheritance rule has been designed.
If an element declaration does not have a content model, it means that the content model is inherited. If it does not repeat a particular one of its superclasses' attributes, it means that the attribute has been inherited. This is just a convenience measure and does not change the basic concept of subclassing. Still convenience is important. Without inheritance, changes to the base classes must be manually propagated down to subclasses. In practice, this makes maintenance more difficult.

Multiple Inheritance/Subclassing
==================================

There is an interesting inversion in the document OO model vs. OOP. In OOP, multiple subclassing is widely understood to be useful, but multiple inheritance is highly controversial because of confusion that can be caused by repeated inheritance and the repeated instance variables that repeated inheritance causes.

In documents, the problem seems to be the opposite. Multiple subclassing is dangerous, as multiple inheritance is in OOP, because it may be ambiguous what role a subclassed element plays in a superclass's content model:

The "archform" solution to this problem is to require an attribute to specify what base class each element should be converted to, and thus what role it plays. I think that this requires authors to have too much knowledge of base classes. Another solution would be merely to outlaw multiple subclassing which introduces ambiguity. This may be too restrictive in some cases. A third solution might describe some form of "subclassing in context." This last solution is inspired by architectural forms and their use with SGML's Link feature to describe a particular SGML element as a subclass of one element when it appears in one context and of another in another context.

These options must be thought through some more and tested with real DTDs. Multiple subclassing may turn out to be a point of weakness or confusion in our system just as multiple inheritance is in OOP.

Motivation
==========

You might be wondering: "is all of this just angels on the head of a pin?" Am I just trying to align SGML with OOP just for the sake of doing so? No, not at all. Real world SGML documents emulate subclassing and inheritance with parameter entities all of the time. Let me use as evidence the world's most popular DTD, HTML. The version I will use for my demonstration is HTML 3.2.

Subclassing
===========

HTML has a heading class defined like this:

As you can see, headings have absolutely the same content model and attributes no matter where they occur. So their properties are completely inheritable. It turns out that one heading is always usable anywhere another is. They are *always* completely interchangeable in content models.

This would be modeled in a true subclassing/inheritance system like this:

Each of the individual types of headings would inherit the attributes and content model from the parent. Headings would also be completely interchangeable as they are now.
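As a rough analogy in Python terms, the heading hierarchy can be pictured as an abstract base class with one concrete subclass per heading level. This is a sketch, not the proposed SGML syntax: the class names, the `level` attribute, the `align` attribute and the `render` function are all assumptions introduced here for illustration.

```python
from abc import ABC, abstractmethod


class Heading(ABC):
    """Stand-in for an abstract 'heading' class: it declares the shared
    properties once and cannot be instantiated directly."""

    def __init__(self, text, align="left"):
        self.text = text      # shared "content model": plain text here
        self.align = align    # shared attribute, declared once (assumed name)

    @property
    @abstractmethod
    def level(self):
        """Each concrete heading type supplies its own level."""


class H1(Heading):
    level = 1


class H2(Heading):
    level = 2


def render(heading):
    """Code written against Heading works for any subclass, present or future."""
    return "[level %d, %s] %s" % (heading.level, heading.align, heading.text)


print(render(H1("Motivation")))
print(render(H2("Subclassing", align="center")))
```

Because `render` depends only on the base class, a later `H7(Heading)` would be handled without touching it, which is the interchangeability the paragraph above describes.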
The declaration for heading uses a "CLASS" keyword instead of an "ELEMENT" declaration because headings cannot be inserted in documents themselves. You must choose some subclass of heading to insert (H1 through H6).

There are a few major benefits here over parameter entities.

* For DTD designers, it allows easier construction of DTDs where there are places where elements are interchangeable.

* It now becomes possible to extend HTML by incorporating the HTML DTD into another, let's say, HTML-EXTENDED DTD. You can create a new heading type (let's say H7) merely by inheriting from heading. Your new heading will be treated as a heading by software that understands headings and is ignored by software that is specific to H1 through H6. This is exactly as you would expect. This easy extensibility of DTDs is the most interesting, powerful feature of subclassing.

* For document authors, it presents the interesting possibility of class-organized element pick-lists in SGML authoring tools. There are points in HTML where there are many valid elements, as many as 20. Using classes, DTD designers could organize these into "headings", "lists", "multimedia" etc. The HTML DTD already emulates these classes but they are not available to SGML authoring tools because parameter entities have no real semantic beyond textual replacement.

* Subclassing allows programmers to organize their programs to take advantage of the class hierarchy. We've already seen how this might work in Python. Let's look at DSSSL for this example:

    (element heading
      (make paragraph
        font-size: (- 20 (HEADING-LEVEL (gi (current-node))))))

This code would be triggered on any occurrence of a heading element, including subclasses of heading. It would check the numeric suffix of the heading and choose an appropriate font size. The code continues to work with heading extensions as long as they continue the naming pattern (e.g. H7 and H8).
If you didn't want to depend on this convention, you could treat other subclasses as H6. The equivalent DSSSL code for SGML as we know it would require a construction rule for each heading and there would be no easy way to handle new subclasses.

Inheritance
===========

Now here's a case from the same DTD where you want inheritance, but not subclassing. There are many objects in HTML that have href attributes that point to resources. Examples include A, AREA, LINK, BASE and so forth. We might want to define this attribute in one place and reuse it. OO programmers call this a "mixin". We also might not want to. DTD authors should be careful. It is easy to abuse inheritance and inheriting one attribute at a time from various places can make DTDs hard to read, just as overuse of parameter entities can. Nevertheless, it may be useful to do this to centralize the definition of HREF.

I will also demonstrate inheritance from a "shape" base class. Shape is a mixin that introduces "width" and "height" attributes.

Now area inherits both sets of attributes. This means that a change to one of those base classes propagates down to the child classes. Of course if you make a change to the base that is incompatible with your documents, they will break, just as with parameter entities or classes in an OO language.

On the other hand, the benefit of inheritance without subclassing is that you can always unhook area from one or both of its base classes without anything breaking. Inheritance relationships aren't reflected in content models of other elements or in the document itself. They are just a maintenance facility for DTD designers.

Still, the direct encoding of inheritance hierarchies in DTDs would present interesting opportunities for graphical SGML DTD editing tools. Right now they emulate inheritance hierarchies with parameter entities. Inheritance hierarchy information is thus lost in the transmission from one such program to another.
This would not be the case if SGML had a first-class inheritance feature.

A Word About Architectural Forms
================================

Architectural forms were invented in the 10 years between SGML's creation and the time it was allowed to be revised. In the author's opinion, they reflect those limitations. Archforms allow subclassing at the application level, but do not convey any subclass information to the parser itself. Now SGML is up for revision, not just correction, but revision. In XML, we should take advantage of that fact and do the job right. We have a great opportunity here that was not available when archforms were invented.

Subclassing
============

Architectural forms do not directly express the notion that element type A ISA subclass of element type B at the SGML parser level. Since archforms do not have parser level subclassing, the parser cannot check subclasses for conformance as OO language parsers do. Nor can DTDs be checked for conformance to meta-DTDs.

More subtly, I don't think the true subclassing relationship finds any direct expression in the archform concept at all. Rather archforms allow you to express that individual elements of type A ARE-ALSO-OF type B. If every single element of type A conforms to this relationship, then you could *conclude* that element type A ISA subclass of element type B, but there is no direct relationship.

This has some interesting implications: first there is no sense in which an "architectural DTD" is a "meta-DTD". It was never a DTD for DTDs (which is, I think, the most obvious meaning for meta-DTD), but it is not even a "DTD superclass" or "DTD base class" or anything similar. There need not be any relationship between element types in the DTDs at all and if there is, the relationship is by convention, not language features.

To put this in terms of the model above, every element type in a derived DTD "by default" subclasses from *every* element type in the base DTD.
You must restrict the subclassing of a particular element by setting an attribute. You may restrict the subclassing of all elements of an element type by using a fixed attribute.

Another problem is that there is absolutely no way to express that an element within the same DTD conforms to the interface (content model and attributes) defined by another element in the same DTD and is a valid replacement for that element. There are tricks that you can use with parameter entities to subclass from another DTD in the same physical file. You can even share many elements. But if we define a DTD to be a set of declarations, then you *cannot* directly inherit from an element type in the same logical DTD.

Inheritance
===========

Architectural forms do not allow any inheritance of element type "properties" at all. You cannot inherit attribute values or content models at the parser level.

Even at the application level, the accuracy of calling what goes on in Architectural Forms "inheritance" is dubious. Rather it seems more like delegation, as in Microsoft's COM model. A particular element can support many "interfaces" and using attributes you can "delegate" between them so that a method (attribute) that is called one thing in one interface (architectural form) is called another thing in another.

Safety
======

Architectural form "subclasses" (element types with fixed attributes) are not constrained by the architectural form mechanism to be strict element type subclasses. This means that in some cases an inappropriate or merely insufficiently strict mapping will allow an author to make a document which conforms to its DTD, but not to some architectural DTD.

Disturbingly, there is almost nothing that, for instance, an SGML editor can do to meaningfully report this error. When I say "meaningfully report", I mean report in the terms of the DTD that the author is familiar with. After all, the error is not caught in this DTD and thus is not expressible in those terms.
Instead you will get an error message like "XML-LINK not allowed at this point in content of RDF-AUTHOR-INFO element in RDF-SPEC meta-DTD" when the author's actual mistake was putting an element named "WebLink" in an element named "BiblioInfo" before an element named "AuthorInfo" instead of after it in a DTD called "THESIS-MARKUP-LANGUAGE".

Proponents of architectural forms will argue that allowing non-strict subclassing allows flexibility that DTD designers need for mapping one DTD to another. This is a good point. One of architectural forms' great strengths is that they provide a (relatively) simple syntax for expressing mappings from documents conforming to one DTD to documents conforming to another. This is absolutely vital to maintain the balance between generic markup's extensibility and the need to standardize on particular DTDs expected by software. This role cannot be completely filled by constrained inheritance. Element types must be *designed* to be subclasses of other element types. But architectural mapping can be done even for documents that just "happen" to always conform.

My argument in response is that XML already has a transformation language. XSL can be used to transform from generic XML to HTML. It would be a tiny extension to allow it to be used for generic XML -> XML transformations. If we then take a declarative, no-lookahead subset, we will have something that can be implemented right in the parser to build a grove conforming to a different DTD.

It would also be more powerful than architectural forms. Though archform mappings can theoretically be set up after both DTDs have been designed, my experience is that this rarely works in practice unless the superclass DTD is very flexible and non-prescriptive, because archforms are not a full transformation language and have very little power to move things around.
So in a practical sense, I think that a standardized, simple transformation language is still needed and that subclassing will suffice for most of the cases where archforms work now.

Syntax
======

I feel that attributes are a confusing place to put type information. People (rightly, IMO) expect element type information to be explicit in the DTD and invisible in the instance. I also think that people rightly expect the language to support specific keywords that provide inheritance, and not have inheritance "implied" through special "magical" attributes. Attributes made sense for archforms because SGML could not be changed. But times have changed. I believe that we need keywords and a parser-level understanding of subclassing semantics.

Paul Prescod

From ricko at allette.com.au Wed Oct 8 07:44:35 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:37 2004
Subject: Inheritance/defaulting of attributes
Message-ID: <199710080548.PAA04628@jawa.chilli.net.au>

> From: len bullard
> Andrew Layman wrote:
> > We are seeing that on this list also. Clearly inheritance is going to
> > be difficult to work out.

> Early in this list, object-oriented ideas were discussed
> and rejected(? - assumption) for XML. By analogy, SGML
> tools do not explicitly support or express inheritance.
> But object-oriented databases have been created that
> use SGML. Where in those designs is inheritance
> explicitly supported?

I think we should avoid "object-oriented". "object-oriented" is often not a useful term, since people seem to switch usage between OO as a historical technological movement and OO as some particular techniques: e.g. 1) class/instance, 2) methods, 3) messages, 4) inheritance.

1) SGML/XML clearly implements the class/instance style of OO: declarations and instances. Some people call this "object-based" rather than OO.

2) Steve Newcomb gave a very interesting talk I heard on why SGML is distinct from OO. Taking OO as the wish to bundle methods with data, he mentioned that SGML was based on a desire to *unbundle* methods (presentation) from data (logical structure) for documents. Of course, usually we only unbundle so that we can recombine logical structure and presentation in some other selection/sort/format, or to make sure that only the most optimal set of data is sent. However, SGML/XML can have PIs and embedded or external scripts, so again it is possible to bundle or nominate methods to go with the data (e.g. XSL). Tim Bray has mentioned that he found it easy to make an application by making a Java class for each element declaration. So SGML/XML is OO in the bundling sense.

3) An XML document can be a message. (From 1) & 2), the message has a class & can have methods attached.)

4) XML/SGML does not support (directly) the inheritance style of OO. (Indirectly it does, just with the discipline of a template, as in the examples I gave.)

XML/SGML is not OOP, because it is not a programming language. However XML/SGML clearly makes a very nice couple with OOP. I think the most fruitful direction for XML-data is to figure out what information inheritance-based OOP systems need, and then to work out how to represent this in XML/SGML -- this is one area where I am sure XML/SGML may need enhancement.
SGML was historically developed to allow certain types of processing: if we have new types of processing models that require new categories of information to be marked up, then XML/SGML should be extended accordingly. But let's see some evidence of the need before the solution!

Rick Jelliffe

From mtbryan at sgml.u-net.com Wed Oct 8 09:35:12 1997
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun 7 16:58:37 2004
Subject: Inheritance/defaulting of attributes
Message-ID:

Paul

I really like your inheritance paper, but would like to suggest a minor change.

Firstly let me pick up on the name question. Because of the preconceived ideas that programmers have about what inheritance and subclassing involve I would love to get away from these terms. I really liked the term

>Environmental Acquisition

defined in

> http://www.cs.technion.ac.il/~david/Papers/Tech_Reports/lpcr9507.PS.gz
>in:
> http://www.cs.technion.ac.il/~david/publications.html
>

This fits in beautifully with SGML as it incorporates the concept of being able to use context to determine the meaning of elements, which is one of the great plusses of SGML. This fits in nicely with what has been done in HyTime and DSSSL to allow querying of context within groves.

The one area of concern I have with your proposal is over the position of INHERITS. Unlike the other keywords you have proposed this defines the attributes to be associated with the following named parent rather than a mapping to be applied to the following element name. I would like, therefore, to distinguish this fact by placing it after the model group rather than before it.
e.g.

This would have the advantage that you could then combine TYPEOF and inherits:

>Architectural forms do not directly express the notion that element type A ISA subclass of element type B at the SGML parser level.

Not at parser level, but at architectural processor level, where the architectural processor could be a spawned SGML parser working from the meta-DTD. That's deliberate.

>Since archforms do not have parser level subclassing, the parser cannot check subclasses for conformance as OO language parsers do. Nor can DTDs be checked for conformance to meta-DTDs.

Again why presume it has to be done by a single parser? There may be good reasons for keeping these separate.

>More subtly, I don't think the true subclassing relationship finds any direct expression in the archform concept at all. Rather archforms allow you to express that individual elements of type A ARE-ALSO-OF type B.

This is the key - architectural forms are not subclasses, they are ways of identifying that processing relevant for a known class of objects is also relevant for this new class of object. They are about reusing existing knowledge/coding rather than about inheriting properties per se. This is the problem with using terms such as subclassing and inheritance to describe them. What architectural forms are about is Environmental Acquisition.

-----------------------------------------------------------------
Martin Bryan, 29 Oldbury Orchard, Churchdown, Glos GL3 2PU, UK
Phone/Fax: +44 1452 714029 E-mail: mtbryan@sgml.u-net.com

For more information about The SGML Centre contact http://www.sgml.u-net.com
For more information about the European Commission's Open Information Interchange (OII) initiative contact http://www.echo.lu/oii/en/oiistand.html

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peat at erols.com Wed Oct 8 12:51:26 1997 From: peat at erols.com (peat) Date: Mon Jun 7 16:58:37 2004 Subject: Subclassing and Inheritance in generic documents Message-ID: <01bcd3d8$849fe7c0$03efaccf@peat.erols.com> As you requested, your following note has been posted to: http://www.geocities.com/WallStreet/Floor/5815/ooxmledi.htm -----Original Message----- From: Paul Prescod To: w3c-xml-sig@w3.org ; xml-dev Date: Wednesday, October 08, 1997 12:07 AM Subject: Subclassing and Inheritance in generic documents >This is something I wrote in response to clarify my own ideas about >inheritance, subclassing, architectural forms and SGML documents. I hope >it is useful to others and will make it into a web page if it seems >useful to do so. Please ignore typos. Conversation on the document >should be in XML-DEV (or in private email) since inheritance is not on >the XML WG's immediate agenda (AFAIK). > >Thanks to Steven Newcomb and Peter Newcomb for their individual >comments on some of the ideas. This does *not* represent an >expression of concensus however. :) > >---- >Ideas about Subclassing and Inheritance in generic documents {snip} xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From jarle.stabell at dokpro.uio.no Wed Oct 8 12:53:50 1997 From: jarle.stabell at dokpro.uio.no (Jarle Stabell) Date: Mon Jun 7 16:58:37 2004 Subject: Subclassing and Inheritance in generic documents Message-ID: <01BCD3E9.34472610@xyplex34.uio.no> Paul Prescod wrote: I believe that we need keywords and a parser-level understanding of subclassing semantics. [JS] Agree. Drop the parameter entities and replace them with more "semantic" constructs, and then it will be much easier to write good DTD tools. This would result in much better (at least many more) DTD tools, and many of them is likely to be free or at least very inexpensive. Jarle Stabell xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Oct 8 15:25:11 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:37 2004 Subject: Subclassing and Inheritance in generic documents Message-ID: <343B8A29.CABEF593@technologist.com> Jarle Stabell wrote: > [JS] Agree. Drop the parameter entities and replace them with more > "semantic" constructs, and then it will be much easier to write > good DTD tools. I don't think I said quite that. Element type subclassing and property inheritance are only two of many things that parameter entities do. We would need to replace everything PEs do before we could remove them. 
This would take many more essays like mine on topics like DTD fragment reuse, DTD parameterization, conditional processing etc. Or else we would need to be convinced that those things can all be done properly with inheritance. Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Oct 8 15:45:50 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:37 2004 Subject: Inheritance/defaulting of attributes References: <199710080548.PAA04628@jawa.chilli.net.au> Message-ID: <343B8F05.8EDD3790@technologist.com> Rick Jelliffe wrote: > I think we should avoid "object-oriented". We have to avoid avoiding everything. :) I know that the concept of subclassing has expression in fields much broader than object oriented programming. It is a fundamental mathematical concept that goes back to -- I dunno, Aristotle? It is an accident of history that we now associate it with OOP. Inheritance does come from OOP. They found that it was useful to reuse properties of superclasses (methods and instance variables). I'm sure that it only took an hour for them to get tired of recopying code from superclasses to subclasses and to figure out that they need to share code. I think that we will come to the same conclusion about element types sharing attributes and content models. > "object-oriented" is > often not a useful term, since people seem to switch usage between > OO as a historical technological movement and OO as some particular > techniques: e.g. 1) class/instance, 2) methods, 3) messages, 4) inheritance. I haven't noticed this. 
I think of "OO" as the set of concepts shared by Simula, SmallTalk, C++, CLOS and Java. These are typically defined as "inheritance" (code reuse from parent to child), "encapsulation" (constraints on access to data) and "polymorphism" (one element "standing in for" another) I think that polymorphism could also be called "subclassing". My paper addressed inheritance and subclassing, but not encapsulation. I don't see encapsulation as being very relevant to generic documents. > 1) SGML/XML clearly implements the class/instance style of OO: declarations > and instances. Some people call this "object-based" rather than OO. Actually, any language with an extensible type system has classes and instances. Even Pascal and C. > SGML was historically developed to allow certain types of processing: > if we have new types of processing models that require new categories > of information to be marked up, then XML/SGML should be extended > accordingly. But lets see some evidence of the need before the > solution! I hope I provided that in my essay. An analysis of current DTDs indicates a real need to me. Most major DTDs reinvent subclassing and inheritance using the unstructured parameter entity mechanisms. Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Oct 8 16:06:59 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:37 2004 Subject: Inheritance/defaulting of attributes References: Message-ID: <343B93F3.FA645D21@technologist.com> Martin Bryan wrote: > > Firstly let me pick up on the name question. 
Because of the > preconceived ideas that programmers have about what inheritance > and subclassing involves I would love to get away from these terms. There are costs to forging a new terminological way. I would like to think that the mechanism I have described will seem familiar to OO programmers as the logical generic markup versions of the concepts of subclassing and inheritance. This will help them learn about it. But then it is *my* proposal so I would say that. > I really liked the term > >Environmental Acquisition > defined in > > http://www.cs.technion.ac.il/~david/Papers/Tech_Reports/lpcr9507.PS.gz > >in: > > http://www.cs.technion.ac.il/~david/publications.html > > > This fits in beautifully with SGML as it incorporates the concept of being > able to use context to determine the meaning of elements, which is one of > the great plusses of SGML. This fits in nicely with what has been done in > HyTime and DSSSL to allow querying of context within groves. Environmental acquisition is *only* about context-based inheritance, and not at all about subclassing or class-based inheritance. It is about "runtime" and not "compile time". So I think that it is a separate concern. > The one area of concern I have with your proposal is over the position of > INHERITS. Unlike the other keywords you have proposed this defines the > attributes to be associated with the following named parent rather than a > mapping to be applied to the following element name. I would like, > therefore, to distinguish this fact by placing it after the model group > rather than before it. e.g. > > This would have the advantage that you could then combine TYPEOF and > inherits: > This seems like an excellent proposal. I will incorporate it in my next version. > >Since archforms do not have parser level subclassing, the parser cannot > check subclasses for conformance as OO language parsers do. Nor can > DTDs be checked for conformance to meta-DTDs. 
> > Again why presume it has to be done by a single parser? There may be good > reasons for keeping these separate. I guess I don't care if it is one processor or six, but I want the semantics of subclassing and inheritance to be checked in the DTD, and not only in instances. Since the SGML parser is often the only processor that has access to the DTD, it is the logical place to check it. > This is the key - architectural forms are not subclasses, they are ways of > identifying that processing relevant for a known class of objects is also > relevant for this new class of object. They are about reusing existing > knowledge/coding rather than about inheriting properties per se. Careful there! I think that you are using subclassing and inheritance interchangeably. Subclassing is always about reusing existing knowledge/coding. What I am proposing is merely that you should be able to reuse knowledge/coding at every level -- including the SGML validation level. Archforms allow this reuse only at the application level (which makes sense considering that they did not have the option of changing SGML). > This is the > problem with using terms such as subclassing and inheritance to describe > them. What architectural forms are about is Environmental Acquisition. Not quite. Although they do preserve the EA concept of "context" (when they are used with LINK), they actually allow an element's (architectural) class to be determined based on context. This is not something I have ever encountered in any programming language. Paul Prescod xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From digitome at iol.ie Wed Oct 8 17:53:44 1997 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:58:37 2004 Subject: Inheritance/defaulting of attributes Message-ID: <199710081553.QAA10302@mail.iol.ie> A small amount of a modern OO language can go a long way towards providing a mechanism for wielding inheritance with SGML element types and spitting out 8879 DTDs. Just for fun, I doodled this in Python. N.B. this idea can be taken a *whole* lot further. Also, I'm sure Perl5, Java etc. can be wielded similarly.

    # Declare a class "Animal" derived from ElementType
    class Animal(ElementType):
        def __init__(self, gi):
            ElementType.__init__(self, gi)
            # All Animals have legs
            self.attrs["LEGS"] = ("NUMBER", "#REQUIRED")

    # Declare a class "Dog" derived from "Animal"
    class Dog(Animal):
        def __init__(self, gi):
            Animal.__init__(self, gi)
            # Some dogs have Rabies
            self.attrs["RABIES"] = ("(YES,NO)", "#REQUIRED")

    # Create Animal and Dog element types, printing out the
    # attribute list declaration
    print(Animal("MyAnimal"))
    print(Dog("MyDog"))

This script prints :-

The base class "ElementType" is just this:-

    class ElementType:
        def __init__(self, gi):
            self.attrs = {}   # attribute name -> (declared value, default)
            self.gi = gi      # generic identifier (element type name)

        # Method to print self
        def __repr__(self):
            res = ""
            return res

Sean Mc Grath sean@digitome.com www.digitome.com xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Wed Oct 8 23:04:30 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:37 2004 Subject: Inheritance/defaulting of attributes References: Message-ID: <343BF5D2.FAFE42B1@technologist.com> Martin Bryan wrote: > Firstly let me pick up on the name question. Because of the preconceived > ideas that programmers have about what inheritance and subclassing involves > I would love to get away from these terms. I really liked the term > >Environmental Acquisition > defined in > > http://www.cs.technion.ac.il/~david/Papers/Tech_Reports/lpcr9507.PS.gz > >in: > > http://www.cs.technion.ac.il/~david/publications.html I've figured out why you like this term so much. You're in conflict of interest. Your book is referenced in the paper. :) Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ricko at allette.com.au Thu Oct 9 12:52:18 1997 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 16:58:37 2004 Subject: Inheritance/defaulting of attributes Message-ID: <199710091058.UAA27004@jawa.chilli.net.au> > From: Paul Prescod > I know that the concept of subclassing has expression in fields much > broader than object oriented programming. It is a fundamental > mathematical concept that goes back to -- I dunno, Aristotle? 
It is an > accident of history that we now associate it with OOP. The original targets of SGML DTD design were supposed to be editorial people, I think, not programmers. Editorial people have the idea of text replacement well and truly in their minds: it is what they do, and what they think they are doing when they do it. So the PE mechanism, being akin to macros or text replacement, is very appropriate. The ideas of inheritance and classing are not appropriate for editorial people for the same reason. Now XML is being much more targeted at database kind of systems, where the DTD designer is clearly more of a programmer. It may be that such programmers cannot think in terms of text replacement or symbolic computing, but in terms of whatever the fashionable OOP and framework are at the time. So it may be that they want a less text-based style of declarations. A push for OO constructs for element declarations actually creates a new distinction between declarations and the instance: the declarations would act by inheritance and magic and the document acts by text replacement: or is there some meaningful extension of inheritance to include instance data? Has anyone come up with a good solution for what order element types can be in if you have inheritance? is all very well, but what is the resulting content model? It sounds like people expect it to be (using the SGML "&" connector, which is not in XML): which is not very satisfactory IMHO. Rick Jelliffe xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Oct 9 16:57:53 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:37 2004 Subject: Inheritance/defaulting of attributes References: <199710091058.UAA27004@jawa.chilli.net.au> Message-ID: <343CF162.D54A5FC9@technologist.com> Rick Jelliffe wrote: > Editorial people have the idea of text replacement well and truly > in their minds: it is what they do, and what they think they are > doing when they do it. > > So the PE mechanism, being akin to macros or text replacement, > is very appropriate. The ideas of inheritance and classing > are not appropriate for editorial people for the same reason. That isn't true. Anyone who can understand the concept of types and instances can understand the concept of sub-types. > Now XML is being much more targeted at database kind of systems, > where the DTD designer is clearly more of a programmer. No, this push for subclassing and inheritance has nothing to do with a retargeting of SGML. Most major SGML DTDs (e.g. HTML, DocBook and TEI) have nothing in particular to do with databases and yet they all include a concept of subclassing and many use inheritance also. Anywhere you have object types people will ask for object sub-types. SGML has element types and thus people ask for element sub-types (or sub-classes). I do not believe that "editorial types" find the parameter entity hackery used in these DTDs to be "intuitive" because they are comfortable with text substitution. A straightforward encoding of the subtype relationships inherent in the information would be more intuitive. 
> It may > be that such programmers cannot think in terms of text replacement > or symbolic computing, but in terms of whatever the fashionable > OOP and framework are at the time. So it may be that they want > a less text-based style of declarations. As I said before, the concept of subtyping (or subclassing) predates OOP by at least *2300 years*. It comes from Aristotle, not Stroustrup. The programming languages community was bound to invent it for the same reason that we are bound to do so -- wherever there are types, some things will be better modelled with sub-types. Markup people would have invented it first if we had the budgets of programming language research groups. > A push for OO constructs for element declarations actually creates > a new distinction between declarations and the instance: the > declarations would act by inheritance and magic and the document > acts by text replacement: or is there some meaningful extension of > inheritance to include instance data? My proposal from yesterday does this. > Has anyone come up with a good solution for what order element > types can be in if you have inheritance? > > > > > is all very well, but what is the resulting content model? Well, this is inheritance, which is different from subtyping. We can and should be clear that there is a distinction and you don't need one to have the other. According to my model from yesterday, the content model above is (cry). Just as in OOP, if you specify a property (in this case the content model) it should override that property in the parent class. Since you have not claimed (according to my straw person syntax) that kitty is a subtype of cat, the above is legal, but nonsensical. Why would you inherit a content model and then immediately override it? > It sounds like people expect it to be (using the SGML "&" > connector, which is not in XML): > > > > which is not very satisfactory IMHO. I don't see this interpretation as being very useful. 
If we want to allow inheritance of portions of content models (which is dubious in my mind) then this can be accomplished easily with new syntax. I don't know if the extra feature is worth the new syntax. Still, if this is determined to be a requirement, it can be easily solved. Paul Prescod From ricko at allette.com.au Thu Oct 9 17:37:10 1997 From: ricko at allette.com.au (Rick Jelliffe) Date: Mon Jun 7 16:58:37 2004 Subject: Inheritance/defaulting of attributes Message-ID: <199710091543.BAA06780@jawa.chilli.net.au> This discussion probably should be on comp.text.sgml, so I won't continue it apart from this. > From: Mary Holstege > I don't think this is such a bad solution. Your only alternatives are to > have to define the content model all over again (which defeats the whole > purpose) or somehow be able to identify which part of the parent content model > the new stuff needs to insert itself after, which strikes me as too brittle > to be workable. I have proposed to ISO WG8 a keyword #ANY that can be allowed in a content model, meaning that one of any element type can be put there. This allows base element types to maintain their integrity. > It is a mistake for ordering information to be part of the abstract syntax > declaration in the first place, IMHO. Presentation order is, well, a > presentation issue and has no business being declared anywhere other than in a > presentation rule (style sheet, if you will). I couldn't disagree more. 
Without a fixed ordering, you have to either use in-memory processing of data or you have to have an extra pass (or step) in your stream processing to reorder the data. Stream-based text processing is a very useful technique. Maybe memory is cheap enough now that, with virtual memory, we can forget about stream processing, but I don't think so yet. But the requirement that things should be declared before they are used (where possible) that underpins stream processing is greatly aided by fixed-order processing. If we don't have fixed orderings possible, we have to key everything. You may as well use a database! Rick Jelliffe From dgd at cs.bu.edu Thu Oct 9 18:24:01 1997 From: dgd at cs.bu.edu (David G. Durand) Date: Mon Jun 7 16:58:38 2004 Subject: Inheritance/defaulting of attributes Message-ID: Tim's post reinforces the very correct point that "document inheritance" is not very well defined. In my analysis of "things we do in DTDs that are inheritance-like" I came up with at least the following kinds of inheritance (I hope to work this into an article at some point): Content-model inheritance: sharing or extension of a content model of one element with another element. Content-model Context inheritance: sharing of a place of potential occurrence in the content model(s) of (an)other element(s). Attribute Inheritance: elements that inherit a particular attribute definition. Attribute List inheritance: elements that share packages of attributes that go together. XML's separate attlist declarations will make implementing this kind of inheritance with Parameter Entities a little easier. 
Inter-Attribute List inheritance: multiple attlists that extend each other, and apply to different elements. Ontological inheritance: for things like P, P.BLURB, P.EXPLANATORY. We sometimes have an organization of the elements into a conceptual structure of types and sub types -- any of the other kinds of things listed here might (or might not) be inherited on the basis of such relations. Of course, these are really kinds of sharing that people implement through PEs or hand expansion. They are not always arranged in a hierarchical fashion, sometimes the elements in a DTD are just partitioned by sharing of certain attributes. For instance an architectural form like XML-LINK partitions elements into "link elements" and "other elements", without any explicit notion of hierarchy of sub-typing. Of course, a DTD author's "ontological hierarchy" may affect how such sharing is actually structured in the DTD. Within an instance itself, we could also have inheritance -- that would be really inheritance and not sharing, since people have proposed that elements would inherit from their containing element. Things that have been proposed here are: Attribute Inheritance: Some attributes might acquire (default) values based on what they are contained in (or its attributes). Declaration Inheritance: Content models (or other DTD-declared properties) might change depending on what an element is contained in. This would enable elements like "name" to have different properties depending on whether they are contained in a "bibliography" or a "product-description". This is an area where there are a lot of bright ideas, but little experience in practice. This could be the direction of the future for XML/SGML. My point in this is that we have (at least 5 distinct kinds of property to share) as well as an ontological hierarchy (which might be a rough equivalent to the real-world phenomena that underlie O-O design processes). 
In fact, architectural forms offer the possibility of multiple ontological hierarchies in the same DTD. In the document instance, we have all the same kinds of inheritance (which I bundled under declaration inheritance) plus "attribute value" inheritance. The advantage of document instance inheritance is that the parent relationships are constrained by the structure of the instance, whereas DTD relationships are restrained by the cleverness of the DTD author. None of these have the properties of the objects that OO design deals with, where we generally inherit field values, and functions from one domain to another domain. This means that we don't have to worry about covariant versus contravariant subtyping of methods, but it also means that OO is not _especially_ more relevant than any other knowledge description language. The properties we are inheriting are very different, and the strategies to control them will probably be different too. -- David From matt at wdi.disney.com Thu Oct 9 18:35:01 1997 From: matt at wdi.disney.com (Matthew Fuchs) Date: Mon Jun 7 16:58:38 2004 Subject: Inheritance/defaulting of attributes In-Reply-To: Paul Prescod "Re: Inheritance/defaulting of attributes" (Oct 9, 10:59am) References: <199710091058.UAA27004@jawa.chilli.net.au> <343CF162.D54A5FC9@technologist.com> Message-ID: <9710090937.ZM4138@scrumpox.rd.wdi.disney.com> On Oct 9, 10:59am, Paul Prescod wrote: > Subject: Re: Inheritance/defaulting of attributes > Rick Jelliffe wrote: ... 
> > A push for OO constructs for element declarations actually creates > > a new distinction between declarations and the instance: the > > declarations would act by inheritance and magic and the document > > acts by text replacement: or is there some meaningful extension of > > inheritance to include instance data? > > My proposal from yesterday does this. > > > > > > > > > > Note that you have come up with new syntax to do something which in SGML would be handled through minimization, i.e., something like: The parser would then interpolate tags around each or tag, allowing the application to understand a relationship not obvious in the document text. (OK, so I also used parameter entities. So sue me.) Matthew -- ----------------------------------------------------- Matthew Fuchs matt@wdi.disney.com http://cs.nyu.edu/phd_students/fuchs ----------------------------------------------------- xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Thu Oct 9 18:43:53 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:38 2004 Subject: Inheritance/defaulting of attributes References: <199710091058.UAA27004@jawa.chilli.net.au> <343CF162.D54A5FC9@technologist.com> <9710090937.ZM4138@scrumpox.rd.wdi.disney.com> Message-ID: <343D0A3E.1E28CEBB@technologist.com> Matthew Fuchs wrote: > The parser would then interpolate tags around each > or tag, allowing the application to understand a relationship > not obvious in the document text. But a containment relationship does not imply an ISA relationship in most ontologies. I don't want to make a new subtree with lion and animal nodes. 
A lion in an animal would seem to me to mean "WAS EATEN BY" :). Rather, I want to have a node which can be interpreted as either a lion or an animal. Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From Kenneth.J.Meltsner at jci.com Thu Oct 9 18:47:35 1997 From: Kenneth.J.Meltsner at jci.com (Meltsner, Kenneth J) Date: Mon Jun 7 16:58:38 2004 Subject: Prototype OO was Re: Inheritance/defaulting of attributes Message-ID: <8625652B.005C34FE.00@Corpnotes.JCI.Com> [Apologies for jumping in from outside the document/XML community -- I was reading the archive and found this thread interesting. An earlier attempt at sending this out failed, so this may be a duplicate as well.] The XML community may want to look at prototype-based object-oriented programming as an alternative to traditional class-instance objects. Prototypes allow the designer/user to create an object, change its properties, and then create new objects that inherit from the original prototype. There's a lot more details, including a description of inheritance by delegation of methods/properties to parent prototypes, but it tends to be a more useful approach for objects that model real world objects -- my dissertation used a variant of prototypes to implement an object-oriented approach to simulation for chemical thermodynamics. Another useful approach might be to look into constraint-based OO. In this approach, the relationships between objects can be described and manipulated, usually in both directions (i.e. if x.width + y.width = total.width, setting the total.width and x.width to new values will force a change in y.width as well). 
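The two-way behaviour described above (set total.width and x.width, and y.width follows) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `WidthConstraint`, its `set` method and the "recompute whichever value was set least recently" policy are invented here and are not taken from Amulet, Self or any of the systems referenced in this message.

```python
class WidthConstraint:
    """Hypothetical sketch: keeps x + y == total.

    Policy (an assumption for this example): the two most recently
    set values are treated as fixed, and the third is recomputed.
    """

    def __init__(self, x, y):
        self.values = {"x": x, "y": y, "total": x + y}
        self.recent = ["x", "y"]  # names of the two most recently set values

    def set(self, name, value):
        self.values[name] = value
        if name in self.recent:
            self.recent.remove(name)
        self.recent.append(name)
        self.recent = self.recent[-2:]
        self._solve()

    def _solve(self):
        # Exactly one name is not among the two most recently set values.
        (free,) = set(self.values) - set(self.recent)
        v = self.values
        if free == "total":
            v["total"] = v["x"] + v["y"]
        elif free == "x":
            v["x"] = v["total"] - v["y"]
        else:
            v["y"] = v["total"] - v["x"]


c = WidthConstraint(30, 70)
c.set("total", 120)   # x is recomputed from total and y
c.set("x", 40)        # y is now forced to change, as in the example above
print(c.values)
```

Real constraint systems use more general solvers (and constraint strengths, as in the hierarchy-of-constraints work listed in the references below); the point here is only that the relation is stated once and can be driven from either side.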
Finally, I was impressed with, but lost the references to, some work on "middle-out" modeling. Basically, traditional classes are used to go from general classes (animals, mammals) to specific classes (dogs, beagles), and then the system permits class properties to be replaced to allow for Ralph, a beagle with three legs, etc. A similar middle-out approach might be useful -- define a hierarchy of DTDs, I suppose, and then permit specific exceptions to override default properties. Here are the references for folks with extra time, if I'm not too off-topic. The classic prototype-based OO language developed at Sun: http://self.sunlabs.com/ A prototype-based, C++, constraint-based user interface system: http://www.cs.cmu.edu/Groups/amulet/amulet-home.html A bunch of variants on constraint-based languages and systems, including Web layout: (The hierarchy of constraints is especially useful for systems with defaults and different constraint strengths) http://www.cs.washington.edu/research/constraints/ Ken Meltsner xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From matt at wdi.disney.com Thu Oct 9 19:38:38 1997 From: matt at wdi.disney.com (Matthew Fuchs) Date: Mon Jun 7 16:58:38 2004 Subject: Inheritance/defaulting of attributes In-Reply-To: Paul Prescod "Re: Inheritance/defaulting of attributes" (Oct 9, 12:45pm) References: <199710091058.UAA27004@jawa.chilli.net.au> <343CF162.D54A5FC9@technologist.com> <9710090937.ZM4138@scrumpox.rd.wdi.disney.com> <343D0A3E.1E28CEBB@technologist.com> Message-ID: <9710091029.ZM4236@scrumpox.rd.wdi.disney.com> Yeah, but ontologies are semantics, and semantics isn't our bizness. We do syntax here. 
Seriously though, despite having written my share of OO (and other) code,
I'd be very leery of anything that injects semantic notions into XML.
Whether something is an ISA or HASA relationship may ultimately depend on
the application's point of view. I.e., lion ISA animal, but lion also HASA
superclass, depending on whether you are in the semantics of the zoo
domain or the semantics of domain structures - and these can hopefully
share the same semantics.

But all I really wanted to point out was that in eliminating a lot of SGML
cruft, we'd also eliminated some of the tricks people have used in the
past to finesse OO.

Matthew

On Oct 9, 12:45pm, Paul Prescod wrote:
> Subject: Re: Inheritance/defaulting of attributes
> Matthew Fuchs wrote:
> > The parser would then interpolate tags around each
> > or tag, allowing the application to understand a relationship
> > not obvious in the document text.
>
> But a containment relationship does not imply an ISA relationship in
> most ontologies. I don't want to make a new subtree with lion and animal
> nodes. A lion in an animal would seem to me to mean "WAS EATEN BY" :).
>
> Rather, I want to have a node which can be interpreted as either a lion
> or an animal.

From papresco at technologist.com Thu Oct 9 20:06:09 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:38 2004
Subject: Inheritance/defaulting of attributes
References: <199710091058.UAA27004@jawa.chilli.net.au>
 <343CF162.D54A5FC9@technologist.com>
 <9710090937.ZM4138@scrumpox.rd.wdi.disney.com>
 <343D0A3E.1E28CEBB@technologist.com>
 <9710091029.ZM4236@scrumpox.rd.wdi.disney.com>
Message-ID: <343D1D8C.2EECB086@technologist.com>

Matthew Fuchs wrote:
> Seriously though, despite having written my share of OO (and other)
> code, I'd be very leery of anything that injects semantic notions into
> XML.

IDREF isn't a semantic notion? How about the very concept of element
"types". Isn't that semantic?

> Whether something is an ISA or HASA relationship may ultimately depend
> on the applications point of view. I.e., lion ISA animal, but lion also
> HASA superclass, depending on whether you are in the semantics of the
> zoo domain or the semantics of domain structures - and these can
> hopefully share the same semantics.

True, but we can benefit from subclassing at the SGML/XML level, without
even considering the needs of application designers. A survey of popular
DTDs would demonstrate that most re-invent the concept of subtyping in a
proprietary way. This makes reading, maintaining and extending the DTDs
hard and processing them in (e.g.) a GUI editor even harder (basically
the subtype relationships just disappear).
One very common question on comp.text.sgml is "how do I extend DTDs" and
our only answer is: "by using this hack, if the DTD designer has allowed
it, or this other hack, if they took a different approach ..." Archforms
are not a mechanism for extending existing DTDs, but for allowing
documents to conform to multiple DTDs (including existing DTDs). I find it
downright embarrassing that we have no half-way decent answer to this
question. Someone posed it last month on c.t.s as: "This is so basic I
must have missed something obvious in the XML spec." I wonder if he ever
got an answer...

> But all I really wanted to point out was that in eliminating a lot of
> SGML cruft, we'd also eliminated some of the tricks people have used in
> the past to finesse OO.

I reject the idea that subtyping is intrinsically related to OO, but I
take your point. Minimization can be used to finesse many things and XML
is poorer for not having it. Still, I find it uncomfortable to use
minimization to make up for restrictions in the language.

 Paul Prescod

From akirkpatrick at ims-global.com Fri Oct 10 11:35:00 1997
From: akirkpatrick at ims-global.com (akirkpatrick@ims-global.com)
Date: Mon Jun 7 16:58:38 2004
Subject: Inheritance/defaulting of attributes
Message-ID:

Firstly, I thought Paul's little essay on subclassing and inheritance in
DTDs was fascinating.

Re the question below from Rick: as I understood Paul, if an element
inherited a content model, it could define a more restrictive model for
itself but still had to be valid according to the parent model (see
below).
The question I have is about restricting the use of generic elements in
the instance. For example, suppose we have: ... .. (look familiar?) How do
we stop the authors actually using and which have little meaning as they
are? Do we need a "pure virtual" syntax which indicates that an element
type cannot be instantiated in the instance?

Either way, I like this idea a lot. Does the instance syntax proposal for
XML take any of this into account? If so, I might be converted, on the
assumption that this type of thing won't make it into the SGML revision.

Alfie.

----------
From: ricko@allette.com.au
Sent: Thursday, October 09, 1997 4:10 PM
To: xml-dev@ic.ac.uk; Kirkpatrick, Alfie
Subject: Re: Inheritance/defaulting of attributes
----------------------------------------------------------------------------

Has anyone come up with a good solution for what order element types can
be in if you have inheritance? is all very well, but what is the resulting
content model? It sounds like people expect it to be (using the SGML "&"
connector, which is not in XML): which is not very satisfactory IMHO.

Rick Jelliffe

From papresco at technologist.com Fri Oct 10 15:50:11 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:38 2004
Subject: Inheritance/defaulting of attributes
References:
Message-ID: <343E3287.4621A56E@technologist.com>

akirkpatrick@ims-global.com wrote:
> How do we stop the authors actually using and which
> have little meaning as they are? Do we need a "pure virtual" syntax
> which indicates that an element type cannot be instantiated in the
> instance?
Yes, in my essay I introduce a declaration for abstract supertypes.

> Either way, I like this idea a lot. Does the instance syntax proposal
> for XML take any of this into account? If so, I might be converted, on
> the assumption that this type of thing won't make it into the SGML
> revision.

Well I think that something inheritance-like is going into the revision.
We just have to push for it to be straightforward and "first-class" rather
than indirected as archforms are.

 Paul Prescod

From papresco at technologist.com Fri Oct 10 15:56:33 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:38 2004
Subject: Inheritance/defaulting of attributes
References:
Message-ID: <343E347C.4FFE7919@technologist.com>

akirkpatrick@ims-global.com wrote:
> Re the question below from Rick: as I understood Paul, if an element
> inherited a content model, it could define a more restrictive model for
> itself but still had to be valid according to the parent model (see
> below).

That's close, but not quite correct. According to my terminology, what
you've described is subtyping. If you *inherit* a content model then you
get that content model -- unchanged. We could invent straightforward
mechanisms to allow you to inherit content model parts but I'm not sure
yet if it is worth the extra syntax. Especially if we were to move to
structured attributes. Then today's complex content models could be broken
up into small components and the benefits of inheriting content model
parts would be reduced.

 Paul Prescod

From JohnGo at asymetrix.com Fri Oct 10 19:33:03 1997
From: JohnGo at asymetrix.com (John Gossman)
Date: Mon Jun 7 16:58:38 2004
Subject: Prototype OO was Re: Inheritance/defaulting of attributes
Message-ID:

This is an excellent idea. For declarative languages especially, prototype
is a very good model for inheritance because it provides default values
for data fields and layouts of schemas, which is primarily what you need
(there is no behavior to inherit). I have used this as the basis for my
XML based OXF (Open Exchange Format). It seems to be easier to understand
for those who are not fluent in OO technique.

John Gossman
Asymetrix Learning Systems

>----------
>From: Meltsner, Kenneth J[SMTP:Kenneth.J.Meltsner@jci.com]
>Sent: Thursday, October 09, 1997 9:44 AM
>To: xml-dev@ic.ac.uk
>Subject: Prototype OO was Re: Inheritance/defaulting of attributes
>
> [Apologies for jumping in from outside the document/XML community -- I
> was reading the archive and found this thread interesting. An earlier
> attempt at sending this out failed, so this may be a duplicate as
> well.]
>
> The XML community may want to look at prototype-based object-oriented
> programming as an alternative to traditional class-instance objects.
> Prototypes allow the designer/user to create an object, change its
> properties, and then create new objects that inherit from the original
> prototype.
> There's a lot more details, including a description of
> inheritance by delegation of methods/properties to parent prototypes,
> but it tends to be a more useful approach for objects that model real
> world objects -- my dissertation used a variant of prototypes to
> implement an object-oriented approach to simulation for chemical
> thermodynamics.

From JohnGo at asymetrix.com Fri Oct 10 20:11:00 1997
From: JohnGo at asymetrix.com (John Gossman)
Date: Mon Jun 7 16:58:38 2004
Subject: Prototype OO was Re: Inheritance/defaulting of attributes
Message-ID:

I guess an example is in order: later we can define an instance of button:
The button instance picks up default values and types from the typedef
(the prototype), and overrides the others. Furthermore, you can derive a
new class: and a new-instance:

-JG

>----------
>From: John Gossman
>Sent: Friday, October 10, 1997 10:38 AM
>To: 'xml-dev@ic.ac.uk'; 'Meltsner, Kenneth J'
>Subject: RE: Prototype OO was Re: Inheritance/defaulting of attributes
>
> This is an excellent idea. For declarative languages especially,
>prototype is a very good model for inheritance because it provides
>default values for data fields and layouts of schemas, which is
>primarily what you need (there is no behavior to inherit). I have used
>this as the basis for my XML based OXF (Open Exchange Format). It seems
>to be easier to understand for those who are not fluent in OO technique.

From jjc at jclark.com Mon Oct 13 13:23:50 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:58:38 2004
Subject: SP test release with improved XML support
Message-ID: <344202D2.9C718162@jclark.com>

The current Jade test release at ftp://ftp.jclark.com/pub/test/jade.zip
includes an experimental version of SP with more XML support. Win32
binaries are also available at ftp://ftp.jclark.com/pub/test/jadew.zip.
A number of key features from the WebSGML SGML TC are supported (with some
differences from the balloted text):

- Unbundling of SHORTTAG
- HCRO delimiter (for hex numeric character references)
- Feature to allow elements declared EMPTY to have end-tags
- NESTC (net-enabling start tag close) delimiter (allows XML syntax to be
  handled as a combination of a net-enabling start-tag "")
- Duplicate enumerated attribute tokens are allowed
- Relaxation of rules on use of parameter entity references inside groups
- Support for multiple ATTLIST declarations for a single element type
- Support for ATTLIST declarations which don't declare any attributes
- Support for predefined single character entities in the SGML
  declaration (lt, amp etc)
- Support for feature that turns off SGML's traditional record end rules
  (WSCON KEEPALL)

You need to use the included SGML declaration for XML (pubtext/xml.dcl) to
take advantage of these features. Note that this declaration implements
the recent decision to make XML case-sensitive.

There is also support for the XML encoding declaration and for XML's rules
on default selection of the encoding. This is enabled by specifying an
encoding of "xml". You can use

  set SP_ENCODING=xml
  set SP_CHARSET_FIXED=yes

to make this the default. This will produce UTF-8 output by default; you
can override this with the -b option.

As in previous releases, use -wno-valid to turn off (some) validation, and
use -wxml to get warnings about violation of XML restrictions.

There are still some areas where SP does not conform to the current state
of XML, including:

- There is no support for draconian error handling (although it's easy to
  build a layer on top of SP that enforces this)
- Line ends are normalized to \r\n rather than to \n
- No support for UTF-16 surrogates. This means you can't have numeric
  character references outside the basic multilingual plane.
- XML's rules about < and & used as data always being entered with &lt;
  and &amp; are not enforced by -wxml
- The -wno-valid option allows use of undefined elements and attributes
  but still produces errors if you supply a definition but do not conform
  to it

If you find others, please let me know.

This is a test release. For production use, I recommend using SP 1.2.1.

James

From lauren at sqwest.bc.ca Mon Oct 13 20:42:50 1997
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun 7 16:58:38 2004
Subject: DOM level 1 core
Message-ID:

The first public draft of the W3C Document Object Model Core Level 1 was
released at the end of last week, at http://www.w3.org/TR/WD-DOM. The DOM
is an interface to allow manipulation of HTML and XML documents in a
platform- and language-neutral way, so obviously we want to know what XML
developers think of what we've been working on.

DOM Level 1 contains document navigation and manipulation, and the basic
framework on which we'll be building the other parts of the DOM, like the
event and style models. We are still working on the HTML and XML parts of
the specification, which will contain functions that HTML script authors
will probably use, and ways of getting at the DTD and other XML-specific
things. We'll be releasing these for comments and feedback soon.

The DOM WG and IG would appreciate feedback on whether the interface is
going in the right direction, whether it will be useful, whether it's
missing something important, whether it's too complicated and could be
made simpler, etc.
If you have any comments, please send them to www-dom@w3.org (to
subscribe, email www-dom-request@w3.org with the subject "subscribe").
Lots of us are also on this mailing list, of course.

cheers,

Lauren
--
Lauren Wood, SoftQuad, Inc.
Chair, W3C DOM Activity

From Kenneth.J.Meltsner at jci.com Wed Oct 15 16:17:30 1997
From: Kenneth.J.Meltsner at jci.com (Ken Meltsner)
Date: Mon Jun 7 16:58:38 2004
Subject: Prototype OO was Re: Inheritance/defaulting of attributes
Message-ID: <3444D05A.CB9C2F56@jci.com>

I was thinking about my own comment (a dangerous idea), and came to the
conclusion that prototype OO would probably be best for a WYSIWYG-ish
document type definition system. In the same way AMULET (Brad Myers's
package at CMU) allows you to design a user interface and change the
properties/presentation of the visible objects, a user could define new
elements that would provide the desired presentation, etc. When saving the
file, the elements would then be queried to produce the XML element
definitions needed for the DTD.

My point is that it's possible to use an object-oriented approach to
manipulate document definitions interactively, and then generate
definitions in a form suitable for use by other systems.

Example: User would insert a "user-name" element, but would change it to
another font. Selecting the element, the user could define a new element
"internal-user-name" with the appropriate presentation and attributes
inherited from "user-name" but with the font changed appropriately.
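
The "user-name" / "internal-user-name" example above can be sketched as
prototype-style delegation in Java. Everything here is an assumption made
for illustration: the class name PrototypeSketch, its methods, and the
property names and values ("font", "weight", "Times", "Courier") are
hypothetical and do not come from AMULET or any XML tool.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of prototype-style inheritance by delegation:
// an object answers from its own properties first, then asks its parent.
final class PrototypeSketch {
    private final String name;
    private final PrototypeSketch parent;   // null for a root prototype
    private final Map<String, String> own = new HashMap<>();

    PrototypeSketch(String name, PrototypeSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    // Create a new object that inherits from this one.
    PrototypeSketch derive(String childName) {
        return new PrototypeSketch(childName, this);
    }

    // Set (or override) a property on this object only.
    PrototypeSketch set(String property, String value) {
        own.put(property, value);
        return this;
    }

    // Look locally first, then delegate up the prototype chain.
    String get(String property) {
        for (PrototypeSketch p = this; p != null; p = p.parent) {
            String value = p.own.get(property);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    String name() { return name; }

    public static void main(String[] args) {
        PrototypeSketch userName = new PrototypeSketch("user-name", null)
                .set("font", "Times")
                .set("weight", "bold");
        PrototypeSketch internal = userName.derive("internal-user-name")
                .set("font", "Courier");

        System.out.println(internal.get("font"));    // prints Courier
        System.out.println(internal.get("weight"));  // prints bold
    }
}
```

Because lookups delegate at read time rather than copying at creation
time, a later change to the prototype's "weight" would also be seen by the
derived element unless it has overridden that property itself.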
I'm not sure whether prototype OO would make it easier to deal with
manually-created DTDs, but it would be a heck of a boost for
visual/WYSIWYG tool implementors.

Apologies if the above is clueless in any fashion,

Ken "Coming to XML development from a different world" Meltsner

From ricko at allette.com.au Wed Oct 15 18:47:02 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:38 2004
Subject: Prototype OO was Re: Inheritance/defaulting of attributes
Message-ID: <199710151654.CAA29256@jawa.chilli.net.au>

> From: Ken Meltsner

> My point is that it's possible to use an object-oriented approach to
> manipulate document definitions interactively, and then generate
> definitions in a form suitable for use by other systems.

That would be an attractive system.

Usually SGML is used with a version of the "model/view" paradigm, where
SGML is used to represent the model of the information, nominally
independent of the systems used to view the information. So any such OO
system would need to be careful to make sure that accurate models can be
generated that have view information abstracted out as much as possible.

Rick Jelliffe

From liamquin at interlog.com Wed Oct 15 20:30:16 1997
From: liamquin at interlog.com (Liam Quin)
Date: Mon Jun 7 16:58:38 2004
Subject: Prototype OO was Re: Inheritance/defaulting of attributes
In-Reply-To: <199710151654.CAA29256@jawa.chilli.net.au>
Message-ID:

On Thu, 16 Oct 1997, Rick Jelliffe wrote:
> > From: Ken Meltsner
>
> > My point is that it's possible to use an object-oriented approach to
> > manipulate document definitions interactively, and then generate
> > definitions in a form suitable for use by other systems.
>
> That would be an attractive system.

It reminds me of a presentation I saw at SGML '89 in Atlanta... on an
"ideal SGML system"... by Pam Genusa or Paula Angerstein I think.

> Usually SGML is used with a version of the "model/view" paradigm,
> where SGML is used to represent the model of the information, nominally
> independent of the systems used to view the information. So any such
> OO system would be careful to make sure that accurate models can
> be generated that have view information abstracted out as much as
> possible.

Well, the MVC (Model-View-Controller) paradigm is common not only with
SGML, but on the Macintosh and in OO systems. It was first described (as
far as I know) in 1980 or 1981 in the Smalltalk books -- there may have
been earlier papers on the subject, though.

SoftQuad's Sculptor SGML product used (uses, if they still sell it) an MVC
paradigm, together with an object-oriented dialect of Scheme. Today one
would probably use Java rather than Scheme because more programmers would
like it, although Scheme goes well with DSSSL.
But once you'd got over the rather steep learning curve, Sculptor is/was
pretty powerful. I'd say also, take a look at Balise.

SGML lends itself to an object view of the world in a lot of ways,
although you have to be careful not to get too seduced: some aspects, such
as marked sections, entities and comments are from the 1960s sort of macro
text processing, and you end up having to impose all sorts of restrictions
on the SGML you can accept if you're not careful. One example is that
marked sections, entities and CONCUR can all be used to make a document
that isn't a tree. RANK and CURRENT and other minimisation features can be
used to make a document where an element's parse depends on previous
elements.. which makes copy and paste exciting too :-)

SoftQuad Sculptor doesn't give much access to the DTD.

Ken, you may want to look at OCLC's FRED research project. I think this
will be particularly interesting when used with DTD-less XML documents.

Document definitions are much easier to manipulate when represented as
SGML/XML instances (I claim), because then you can use general purpose
tools (like Sculptor, Balise, Omnimark, SGMLS.pm etc). Then generate
"old-style" DTDs when you need them, along with documentation, and no more
doing things like this:

This sort of significant-comment stuff is thrown away by most SGML tools
-- e.g. it is not represented at all in the ESIS -- and you need to write
specialised software to deal with it (rather like Javadoc or WEB).
Obviously it belongs along with the element declaration, but it's equally
clear that this isn't the best way to do it.

For me, this is the strongest argument for using an SGML document to store
the DTD, even if all you do is

    The B element is used to contain the title for a book; use J for
    journal titles and A for Article titles. Within the B element you
    can use the general running text elements such as superscript,
    italic, goldleaf and so forth.
    Book title within a citation
    (%lowLevelText;)*

This is much more amenable to OO tools and methodologies. If you use an
idref instead of the parameter entity:

    ()*

you can write a DSSSL script to expand it, but you could also imagine
doing automated reasoning based on what "lowLevelText" contains to produce
uses-a/has-a diagrams, for example.

A system written to do this sort of thing could, as you suggest, also
export information in other formats, such as a database schema for an OO
database, an RTF template (if you add style information, or with DSSSL)
and so forth.

Lee

--
Liam Quin -- the barefoot typographer -- Toronto
lq-text: freely available Unix text retrieval
email address: l i a m q u i n, at host: i n t e r l o g dot c o m

From Kenneth.J.Meltsner at jci.com Wed Oct 15 21:11:31 1997
From: Kenneth.J.Meltsner at jci.com (Meltsner, Kenneth J)
Date: Mon Jun 7 16:58:38 2004
Subject: Prototype OO was Re: Inheritance/defaulting of attrib
Message-ID: <86256531.00696165.00@Corpnotes.JCI.Com>

Myers indicated (* -- I can't remember whether it was in a paper or a
private email) that a separate MVC system wasn't really needed in a
constraint-based, prototype OO system like Garnet -- the constraint solver
plays the role of the controller and the constraints define the user's
view of the underlying data or model.

Amulet has special support for aggregation (the composition of individual
objects into composites with a single interface -- lists of items and
such) as well as the usual inheritance.
The prototype system can generate children from composite objects which
inherit the behavior (but not the actual instances) of the aggregated
components. This would directly support the copying of a section of text,
a heading, etc. to another spot or document, and the subsequent alteration
of that section without changing the original.

Ken Meltsner

From papresco at technologist.com Thu Oct 16 00:26:33 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:39 2004
Subject: Prototype OO was Re: Inheritance/defaulting of attrib
References: <86256531.00696165.00@Corpnotes.JCI.Com>
Message-ID: <34454398.21DE70A1@technologist.com>

There are many ways that markup can touch OO. I want to point out that we
are talking about at least three different things:

 * inheritance and subclassing among XML element types -- I don't think
   that this really has much to do with OO at all. I do think that the
   XML standard should explicitly support these features.

 * application-specific conventions for marking up OO patterns in XML
   documents. This is what the prototype example seemed to do. I do not
   think we need to change the XML standard to support these types of
   concepts, nor should we, but they are interesting examples of XML
   modelling of structures from other disciplines.

 * patterns for representing XML documents *in memory* *in applications*
   (i.e. MVC, constraint-based, aggregation).

Perhaps I am the only one who got confused by the way the thread is
sliding between these without differentiation, but if not, hopefully I've
helped clarify it for someone.

 Paul Prescod

From mrc at allette.com.au Thu Oct 16 01:44:08 1997
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun 7 16:58:39 2004
Subject: Prototype OO was Re: Inheritance/defaulting of attributes
References:
Message-ID: <34455522.CE173CE0@allette.com.au>

Liam Quin wrote:

> Document definitions are much easier to manipulate when represented as
> SGML/XML instances (I claim), because then you can use general purpose
> tools (like Sculptor, Balise, Omnimark, SGMLS.pm etc). Then generate
> "old-style" DTDs when you need them, along with documentation, and no
> more doing things like this:

If you have a licence for OmniMark you will find a number of sample
programs that allow you to convert a fully featured DTD (including
shortrefs etc) to an SGML document. Why not convert in that direction,
when all you're looking for is a different view of the DTD? (If you don't
have OmniMark, download the free version, OmniMark LE
http://www.omnimark.com/develop/download/omle/31/ and send me mail.)

> This is much more amenable to OO tools and methodologies. If you use an
> idref instead of the parameter entity:
>
> ()*
>
> you can write a DSSSL script to expand it, but you could also imagine
> doing automated reasoning based on what "lowLevelText" contains to
> produce uses-a/has-a diagrams, for example.
>
> A system written to do this sort of thing could, as you suggest, also
> export information in other formats, such as a database schema for an OO
> database, an RTF template (if you add style information, or with DSSSL)
> and so forth.
We do this sort of thing at the moment - we use a modified version of the
sample OmniMark program to convert to SGML and then produce HTML for our
programmers and markup teams. If you want some other rendition of the
structure, then you should generate it - I still haven't seen a compelling
reason for a full syntax overhaul.

--
Regards

Marcus Carr                      email:  mrc@allette.com.au
_______________________________________________________________
Allette Systems (Australia)      email:  info@allette.com.au
Level 10, 91 York Street         www:    http://www.allette.com.au
Sydney 2000 NSW Australia        phone:  +61 2 9262 4777
                                 fax:    +61 2 9262 4774
_______________________________________________________________

From peter at ursus.demon.co.uk Fri Oct 17 00:59:04 1997
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:39 2004
Subject: Weak DTDs
In-Reply-To: <343D1D8C.2EECB086@technologist.com>
References: <199710091058.UAA27004@jawa.chilli.net.au>
 <343CF162.D54A5FC9@technologist.com>
 <9710090937.ZM4138@scrumpox.rd.wdi.disney.com>
 <343D0A3E.1E28CEBB@technologist.com>
 <9710091029.ZM4236@scrumpox.rd.wdi.disney.com>
Message-ID: <3.0.1.16.19971013182803.474f92fa@pop3.demon.co.uk>

I am in the throes of revising CML (Chemical Markup Language - an
XML-based application) and trying to work out what the value of
conventional DTDs is. The previous version has a traditional SGML-like DTD
- lots of parameter entities and other clever stuff. I am finding this too
restrictive for several reasons, mainly because:

(a) XML-* is moving so rapidly (e.g. LINK, STYLE, etc.) This is a Good
    Thing, but CML has to react to it.
(b) RDF, DC, MathML etc will be involved in CML and I can't say exactly how at present. (c) My ideas on CML itself keep changing as I gain experience of new problems. I'd like *constructive* views on the value of DTDs in XML. [I know that the community has strongly held ones, so please avoid too much passion :-). There was a very interesting discussion a few weeks back on the aesthetics of DTDs - a good DTD is a thing of beauty.] I can see the following reasons for DTDs. (a) the author has to conform to a pre-defined spectrum of ideas (e.g. a tax-return). [This is not required for CML, and any conformance is outside what a DTD can deliver - e.g. value verification.] (b) the document may get corrupted in transmission or elsewhere. I suspect this is not a very important reason these days. (c) it *may* make it easier to develop authoring tools (d) it *may* give guidance to implementers of applications. (e) it should (but doesn't always) act as an incentive to develop human-readable documentation of the semantics. (f) it shows that the author has defined the language at some point in time. I'd be grateful for other reasons for CML I expect that (c-e) have some limited value. (f) may impress some people and horrify others. In creating CML documents I find myself: (a) wanting to introduce foreign names (e.g. , or ) These could reasonably come at many places in the document (b) forgetting my own 'rules', e.g. order of elements within a content model. So I can't expect others to follow them :-) (c) adding new components to content models - for good reasons. There is no reason why an cannot contain a

, but I didn't think of that earlier. I don't want to have to think of all combinations and ask 'is that reasonable?'.

However, the power of structured documents means that I can often use very fuzzily constructed documents. Thus:

'if a MOLECULE contains ATOMS and BONDS, the software can draw a picture'
'if any parent contains a FIGURE, allow that to be displayed by the reader'
'if a VARiable has attribute BUILTIN=FOO, inform the software that it could process this with special FOO-specific code'

and so on. These are powerful conditions, but if we try to express them in DTDs, validation will fail. What I'd like to have is a wildcard #ANY (this has already been suggested) which can be used for content models something like the (currently illegal) XML:

This says that MOL can contain anything, but that ATOMS and BONDS have a special role. The authoring tool might present a menu with the items ATOMS, BONDS, Other. The software for MOL.java could contain routines to identify children:

    // count the children that have a special role
    int natom = 0;
    int nbond = 0;
    for (int i = 0; i < this.getChildCount(); i++) {
        Node n = getNode(i);
        if (n instanceof ATOMS) {
            /* atom-specific stuff */
            natom++;
        } else if (n instanceof BONDS) {
            /* bond-specific stuff */
            nbond++;
        }
    }
    // draw only when both kinds of child are present
    if (natom > 0 && nbond > 0) {
        displayMol();
    }

Obviously this can't be written automatically, but the 'DTD' helps the author. In some cases there will be stricter rules such as:

which clearly help both authoring tool authors and applications authors.

At present I would like to keep a simple DTD but most of the content models will be 'ANY' and most of the attribute values will be CDATA. It would be nice to have attribute values which could take a list of values *and* CDATA :-) - like:

which would inform the software that it should cater for three specific values, but that the user can add FOO if they really want.

Any sympathisers out there :-)?

P.
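The "list of values *and* CDATA" idea can be sketched in Java as a small helper. This is only an illustration under assumptions: the class name `OpenEnumAttribute`, the `Kind` enum, and the three sample values are invented here and are not part of CML. The point is that software can treat the known values specially while still accepting any other string, as CDATA would.

```java
import java.util.Set;

/** Hypothetical sketch of an "open" enumerated attribute: known values plus free text. */
final class OpenEnumAttribute {
    enum Kind { KNOWN, USER_DEFINED }

    private final Set<String> knownValues;

    OpenEnumAttribute(Set<String> knownValues) {
        // defensive, immutable copy of the values the software caters for
        this.knownValues = Set.copyOf(knownValues);
    }

    /** Classifies a value. Nothing is rejected, because CDATA allows any string. */
    Kind classify(String value) {
        return knownValues.contains(value) ? Kind.KNOWN : Kind.USER_DEFINED;
    }

    public static void main(String[] args) {
        // The three "known" values are invented for illustration only.
        OpenEnumAttribute builtin =
                new OpenEnumAttribute(Set.of("MASS", "CHARGE", "LABEL"));
        System.out.println(builtin.classify("MASS")); // one of the listed values
        System.out.println(builtin.classify("FOO"));  // a value the user added
    }
}
```

An authoring tool could offer the known values in a menu and fall back to a text field for everything else; a processor could dispatch on `Kind.KNOWN` and pass user-defined values through untouched.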
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg

From ricko at allette.com.au Fri Oct 17 02:12:18 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:39 2004
Subject: Weak DTDs
Message-ID: <199710170019.KAA21495@jawa.chilli.net.au>

> From: Peter Murray-Rust

> (a) the author has to conform to a pre-defined spectrum of ideas (e.g. a
> tax-return). [This is not required for CML, and any conformance is outside
> what a DTD can deliver - e.g. value verification.]

An SGML DTD can deliver value verification by using lexical typing. The online version of HyTime '97 has details. For example, you can specify POSIX regular expressions for values of attributes or simple elements. You can use the lexical type mechanism with any lexical typing system of your own invention, not just POSIX. This extensibility is already there.

> What I'd like to have is a wildcard #ANY (this has
> already been suggested) which can be used for content models something like
> the (currently illegal) XML:
>
>

Why not have

That only costs 1 extra level of tag, and fits into existing SGML & XML.

The function of a DTD is to abstract out invariant information from a class of documents. If your information is particularly variable and unforeseeable, then you need to use a couple more tags to represent what you intend.

Rick Jelliffe

From akirkpatrick at ims-global.com Fri Oct 17 14:20:11 1997
From: akirkpatrick at ims-global.com (akirkpatrick@ims-global.com)
Date: Mon Jun 7 16:58:39 2004
Subject: Weak DTDs
Message-ID:

The strength of the DTD is in giving a limited set of possibilities for a processing engine to work with. There are obviously other ways to do this (see below) but for a lot of applications, the DTD provides sufficient constraints for authors of the information.

A common example is a title element. Often a title is required to provide feedback in a UI, to act as link text in a hypertext link, etc. If your DTD says:

then you know for a fact that you can pick out the title, given a valid document. Also, the parser will tell you if the document is valid or not and you can then decide whether to attempt processing it. In our application, the RTF processing engine will still attempt to process a document but says "hey, you might not get what you expect". In other situations, an application just says "go away and come back with something valid".

It sounds like in your situation, you aren't worried about the vast majority of elements but just want to pick up on key things like , , etc. The "Eliot" way to do this would be with an architecture DTD which defines attributes to identify important elements. Your derived DTD can then use any content model (or even element names) you want. For example:

Your derived DTD might then go something like:

(I'm still new to AFs, but this is the basic idea)

Now your processing engine can identify items by their fixed attributes and process accordingly, ignoring all other elements.
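The idea of picking out elements by an identifying attribute and ignoring everything else can be sketched with the standard Java DOM API. This is a sketch under stated assumptions, not the declarations from the message: the attribute name `cml`, the form names `TITLE` and `MOL`, the element names, and the class `FormFinder` are all hypothetical, and the attribute is written explicitly on each element, as it would have to be in XML without a DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch: find elements that declare an architectural form. */
final class FormFinder {

    /** Returns "elementName->form" for every element carrying the given attribute. */
    static List<String> findForms(String xml, String archAttr) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        // "*" matches every element, returned in document order
        NodeList all = doc.getElementsByTagName("*");
        List<String> found = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (e.hasAttribute(archAttr)) {
                found.add(e.getTagName() + "->" + e.getAttribute(archAttr));
            }
            // elements without the attribute are simply skipped
        }
        return found;
    }

    public static void main(String[] args) {
        // Element and attribute names below are invented for illustration.
        String xml = "<report><heading cml=\"TITLE\">Benzene</heading>"
                + "<note>not part of the architecture</note>"
                + "<structure cml=\"MOL\"/></report>";
        System.out.println(findForms(xml, "cml"));
    }
}
```

The processor never needs to know the element names `heading` or `structure`; it only looks at the attribute, which is what lets a derived document type rename elements freely.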
Other people can happily derive from your architecture DTD to add their application specific elements.

If you are using XML without a DTD, things are exactly the same except that you need to explicitly set the attribute on the relevant elements (as I understand it). It should be trivial to write a normaliser which would generate XML from an SGML instance (SGMLNORM would probably do it).

I think one of the major problems with the Web today is the plethora of badly formed HTML pages which have been allowed to grow and flourish by browsers which don't check for validity in any way at all. There is a danger that lack of DTDs in XML documents will lead to even greater "tag soup".

----------
From: peter@ursus.demon.co.uk
Sent: 17 October 1997 08:21
To: xml-dev@ic.ac.uk
Subject: Weak DTDs

----------------------------------------------------------------------------
I am in the throes of revising CML (Chemical Markup Language - an XML-based application) and trying to work out what the value of conventional DTDs are. The previous version has a traditional SGML-like DTD - lots of parameter entities and other clever stuff. I am finding this too restrictive for several reasons, mainly because:

(a) XML-* is moving so rapidly (e.g. LINK, STYLE, etc.) This is a Good Thing, but CML has to react to it.
(b) RDF, DC, MathML etc will be involved in CML and I can't say exactly how at present.
(c) My ideas on CML itself keep changing as I gain experience of new problems.

I'd like *constructive* views on the value of DTDs in XML. [I know that the community has strongly held ones, so please avoid too much passion :-). There was a very interesting discussion a few weeks back on the aesthetics of DTDs - a good DTD is a thing of beauty.]

I can see the following reasons for DTDs.

(a) the author has to conform to a pre-defined spectrum of ideas (e.g. a tax-return). [This is not required for CML, and any conformance is outside what a DTD can deliver - e.g. value verification.]
(b) the document may get corrupted in transmission or elsewhere. I suspect this is not a very important reason these days. (c) it *may* make it easier to develop authoring tools (d) it *may* give guidance to implementers of applications. (e) it should (but doesn't always) act as an incentive to develop human-readable documentation of the semantics. (f) it shows that the author has defined the language at some point in time. I'd be grateful for other reasons for CML I expect that (c-e) have some limited value. (f) may impress some people and horrify others. In creating CML documents I find myself: (a) wanting to introduce foreign names (e.g. , or ) These could reasonably come at many places in the document (b) forgetting my own 'rules', e.g. order of elements within a content model. So I can't expect others to follow them :-) (c) adding new components to content models - for good reasons. There is no reason why an cannot contain a

, but I didn't think
>of that earlier. I don't want to have to think of all combinations and ask
>'is that reasonable?'.

Peter has run head-on into one of the fundamental problems with DTDs as currently defined by SGML (and XML): we want them to describe *classes* of documents when they actually describe *individual* documents (and are incapable of defining classes of documents except in very weak ways).

It was clearly the intent of the SGML designers that DTDs describe *classes* of documents (thus the term 'document type'). Unfortunately, by making the DTD declarations a property of individual documents, they are prevented from being used in that way except in the most draconian fashion: all documents of a type must have *exactly* the same rules (because they all share exactly the same declaration set as part of their syntactic content). Valiant attempts at making configurable declaration sets, typified by the TEI and Docbook, simply emphasize the problem: there is no useful way with DTDs alone to define flexible document classes that can be easily specialized at the document level.

Draconian rules are fine when your use scenario requires draconian policies, such as when creating military documents or documents that drive well-defined and specific processes. However, not all uses of SGML require draconian policies (e.g., the TEI). XML, in particular, is expressly designed for situations that *probably don't* require draconian policies (as evidenced by the potential lack of any DTD declarations). In other words, there is a continuum of possible constraint policies, from no variation allowed to anything is allowed. Unfortunately, SGML only really supports the 'no variation' end of that spectrum and XML only really supports the 'anything is allowed' or the 'no variation' ends, with no obvious support for the middle ground, where you want some constraints but not necessarily full constraint.
Thus the frustration that Peter describes is unavoidable with DTDs alone: he has clearly defined a general document type, the CML, that needs to allow a range of specialization options. However, if the CML is defined as a set of declarations to be used directly in documents as their DTD declarations, it cannot do that, as the declarations define the *complete set* of constraints on those documents. The CML must either impose arbitrary constraints that are not necessarily appropriate for all CML documents or it must be so loose as to define no constraints beyond type names.

In short: DTDs don't define document classes. The use of parameter entities to create configurable declaration sets is a very weak way of expressing the allowed range of specialization, one that depends entirely on syntax tricks and conventions and one that cannot be reliably machine processed (it is impossible to impute meaning to the names and/or positions of parameter entities in the general case). And one that cannot be used at the document level with any of the commercial SGML editors I'm familiar with (because none allow element or attribute declarations in the internal subset).

This is why something like architectures is required for the productive and large-scale use of SGML and XML: you must have a way to define true document classes with clear, machine-processible and validatable specialization constraints that don't, at the same time, impose unnecessary constraints on individual documents.

SGML architectures, as defined by the AFDR (http://www.ornl.gov/sgml/wg8/docs/n1920/html/clause-A.3.html), provide such a mechanism. An architecture is defined by the *combination* of a set of DTD declarations and accompanying documentation that together define the rules for a class of documents (the documentation is vitally important because there will always be rules and constraints that cannot be expressed through syntax, regardless of what syntax you are using to formally express constraints).
As part of these rules, the range of allowed variation among documents that conform to the class can be defined, both formally in the syntax and completely in the documentation. The DTD declarations form a "meta-DTD", that is, a DTD that defines the syntactic rules for the class, not for instances. Instances will have their individual DTDs (explicit or implicit) that define their individual syntax rules.

Architectures can themselves be derived from other architectures, allowing you to form a hierarchy of document classes. By the same token, any architecture can be used as the base for a more specialized architecture. In addition, a single document or architecture can be derived from many different architectures (for example, the CML might be derived in part from some RDF architecture in order to standardize the way the CML structures metadata).

Because architectures are defined using normal DTD syntax, any existing DTD declaration set can be used as an architecture without modification (although most existing DTDs can benefit from some redesign in order to make them better architectures).

Thus, the CML, in the abstract, is clearly an architecture in the general sense: it defines the rules for a class of documents. It does (or needs to) define specialization constraints. The current definition of the CML includes a declaration set... ...Thus, the CML is an SGML architecture because the CML DTD can be used as an architectural meta-DTD (with the possible addition of a few small changes to better express its specialization constraints).

To use this architecture with documents, you need to define a mapping between the elements, attributes, and data of the document and the elements and attributes in the architectural meta-DTD.
The AFDR mechanism does this with attributes and provides a natural automatic mapping mechanism so that documents that are very similar to their meta-DTDs need provide mappings only for those things that differ from the meta-DTDs (that is, those things that are specialized beyond what the architectures define).

[...]

>These are powerful conditions, but if we try to express them in DTDs,
>validation will fail. What I'd like to have is a wildcard #ANY (this has
>already been suggested) which can be used for content models something like
>the (currently illegal) XML:

The idea of a "wildcard" for content models is expressed in the AFDR by the notion of "bridging" element forms, "bridging" in the sense that they bridge between the architecture and non-architectural stuff. In the meta-DTD, a bridging form simply says "anything can go here". Thus, rather than saying the following in the document's declarations:

You would say this in the meta-DTD:

This is essentially the same as what Rick suggested, except that we're doing it in the meta-DTD, rather than the document's DTD (the document may not have a DTD).

To define the mapping from a document to a governing architecture, you declare the architecture and then define the mapping. In the AFDR as written the architecture is declared using a NOTATION declaration [several people, including myself and Peter Newcomb, have suggested alternative PI-based mechanisms for doing these declarations as XML doesn't yet provide data attributes, which the AFDR mechanism relies on--what's important is making the connection, not the precise syntax by which it is made.]. A document that is derived from the CML and takes advantage of the above might look like this:

]> ...[normal CML stuff] ... ... ...

By the normal rules of automatic architectural mapping, I only had to explicitly map the 'MyElement' element to something in the CML meta-DTD because everything else used the same names as in the meta-DTD.
This means that I didn't need any other DTD declarations in the document in order to be able to interpret it as a CML document (that is, as a document that conforms to the general rules defined by the CML architecture). To process it as a CML document, I can simply derive the "architectural instance" using an architecture processor like SGMLNORM:

    sgmlnorm -A CML mydoc.sgml > cmlai.sgm

The code samples Peter shows in his note could easily be used for architecture-aware processing simply by looking at the result of the architectural mapping rather than directly at element types. Resolving architectural mapping in an ad-hoc way requires about 20 lines of code if you make some reasonable assumptions about the use of the architecture (assuming you aren't prepared to do fully-general architectural processing involving actually loading the meta-DTD, which you don't usually need to do for most purposes).

To define the sort of attribute constraints Peter wants, you must still either rely on documentation that states the rules, which must then be enforced by an architecture-aware processor, or use something like the lextype facility in ISO/IEC 10744 Annex A.2. However, if you're building a processor for a specific architecture (i.e., a CML-aware processor), building in rules for specific attributes isn't a big deal and is no different than the sorts of things people do in specialized SGML processors every day. The architecture does give you a central place to put the documentation of the constraint and lets you make your implementation as generalized as you want (or have time for).

Thus we can use architectural meta-DTDs to really and truly define the syntactic rules for classes of documents and then create documents that are specialized from those classes. The specialization rules are (mostly) machine processible and enforceable (there will always be semantic rules that can't be enforced by syntax alone).
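As a rough illustration of the ad-hoc architectural mapping mentioned above, here is a minimal Java sketch. It is not the full AFDR algorithm and does not claim to mirror what SGMLNORM does internally. It assumes that the mapping attribute has the same name as the architecture, that the set of element forms in the meta-DTD is known in advance rather than loaded from the meta-DTD, and that names are compared case-sensitively. The class `ArchMapper` and the sample form names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of ad-hoc architectural mapping for a single architecture. */
final class ArchMapper {
    private final String archName;   // assumed to double as the mapping attribute's name
    private final Set<String> forms; // element forms the meta-DTD is assumed to declare

    ArchMapper(String archName, Set<String> forms) {
        this.archName = archName;
        this.forms = Set.copyOf(forms);
    }

    /** Resolves an element to its architectural form, or empty if it is non-architectural. */
    Optional<String> formOf(String elementName, Map<String, String> attributes) {
        String explicit = attributes.get(archName);
        if (explicit != null) {
            return Optional.of(explicit);    // explicit mapping in the document wins
        }
        if (forms.contains(elementName)) {
            return Optional.of(elementName); // automatic mapping: same name as the form
        }
        return Optional.empty();             // not part of the architecture
    }

    public static void main(String[] args) {
        // Form names and the MyElement mapping target are invented for illustration.
        ArchMapper cml = new ArchMapper("CML", Set.of("MOL", "ATOMS", "BONDS"));
        System.out.println(cml.formOf("MyElement", Map.of("CML", "MOL")));
        System.out.println(cml.formOf("ATOMS", Map.of()));
        System.out.println(cml.formOf("para", Map.of()));
    }
}
```

Code such as the child-counting loop in Peter's note could then test the resolved form instead of the element type, so a renamed or specialized element is still handled as its architectural form.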
Because of automatic mapping, documents derived from architectures need have no explicit declarations of their own except as needed to express specific specializations (as shown above). Note that if you have an existing SGML document with an explicit DTD, you can make that SGML document into a DTD-less XML document simply by using the existing DTD as an architectural meta-DTD. This removes the necessity of parsing the declarations with the document any time you want to parse it without removing the connection between the document and its syntactic and semantic constraints (thus allowing validation on demand). This is particularly useful when the DTD you use is large (e.g., Docbook, full TEI, etc.). This then continues to beg the question: why have DTDs for documents at all? In fact, most documents need never have a full set of explicit declarations if they are derived from an architecture if they are also well formed. The only time you'd need explicit declarations would be to define specializations or to drive non-architecture-aware authoring or validation. But wouldn't it be cool if XML editors *were* architecture aware such that you could say "I want to create documents that conform to architecture X" and the editor would determine and enforce the specialization rules, letting you define new element types (or modify existing ones) and either warn you when you were doing something outside the architecture or prevent you from doing something outside the architecture (depending on what your local specialization policies are)? I think so. In fact, I think this is the only way you can have a useful XML editor at all [I find it interesting that the ADEPT*Editor product has had for many years a non-SGML-conforming mechanism for creating specialized element types while editing, although ADEPT does it through the use of PIs and creates documents that are really only processible in that form by ADEPT tools. 
But clearly they recognized a strong requirement to allow specialization of documents by authors--unfortunately, no architectural mechanism, certainly not a standardized one, existed at the time they built that facility. I wonder how difficult it would be to make ADEPT into an architecture-aware editor that provided the same specialization facilities it does now, but expressed using the AFDR syntax rather than the proprietary ADEPT syntax? Certainly the work that Paul Grosso has done to demonstrate XML editing and on-the-fly element declaration suggests it might be possible, even if it requires something of a hack in the short term.]

If you don't have an editor like this, then you are requiring the author to know the architecture's rules, which, as Peter points out, can be difficult, even when you are the creator of the rules to begin with. In other words, "DTD-less authoring" is not attractive for most people because most people create documents that need to have at least some minimal consistency with other documents.

My personal feeling is that without architectures [in the general sense, not necessarily using the AFDR mechanism, although I think the AFDR is a very good mechanism] neither SGML nor XML is really very useful--meaning that architectures are required to use SGML and XML at large scales and across wide domains. Almost all the problems people have with using SGML at large scales come not from technological limitations but from limitations in the ability of document types alone to define document classes and the inability of SGML processors to operate at the class level, rather than the document level.

Having said that, let me stress that SGML and XML are still the best thing going for creating structured documents. Obviously, we need to add the architectural mechanism to SGML and XML, not discard them in favor of something else.
I think publication of ISO/IEC 10744:1997 demonstrates the desire to do this addition and, in fact, accomplished it (at least within the constraints of 8879:1986--there's lots of room for improvement to this mechanism as the syntax of SGML is improved through the SGML revision). Cheers, Eliot --

W. Eliot Kimber, Senior Consulting SGML Engineer Highland Consulting, a division of ISOGEN International Corp. 2200 N. Lamar St., Suite 230, Dallas, TX 95202. 214.953.0004 www.isogen.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Oct 17 15:43:31 1997 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:58:39 2004 Subject: Weak DTDs In-Reply-To: <199710170019.KAA21495@jawa.chilli.net.au> Message-ID: <3.0.1.16.19971017143404.21c7a1de@pop3.demon.co.uk> Thanks Rick, At 10:13 17/10/97 +1000, you wrote: > >> From: Peter Murray-Rust > >> (a) the author has to conform to a pre-defined spectrum of ideas (e.g. a >> tax-return). [This is not required for CML, and any conformance is outside >> what a DTD can deliver - e.g. value verification.] > >An SGML DTD can deliver value verification by using lexical typing. >The online version of HyTime '97 has details. For example you can >specify POSIX regular expressions for values of attributes or >simple elements. You can use the lexical type mechanism with >any lexical typing system of your own invention, not just POSIX. >This extensibility is already there. Can this be incorporated into an XML DTD? I don't immediately see how... > > >> What I'd like to have is a wildcard #ANY (this has >> already been suggested) which can be used for content models something like >> the (currently illegal) XML: >> >> > >Why not have > > > > >That only costs 1 extra level of tag, and fits into existing SGML & XML. > Thanks - this seems like a smart idea, which can be extended. Of course the 'ANY' element is as about as real as the 'press ANY key' :-) > >If the function of a DTD is to abstract out invariant information from >a class of documents. 
If your information is particularly variable and >unforseeable then you need to use a couple more tags to represent what >you intend. Yes. And I suppose I can do the same thing with attributes. P. xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Oct 17 17:33:58 1997 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:58:40 2004 Subject: Weak DTDs In-Reply-To: Message-ID: <3.0.1.16.19971017145935.1d7fe7fe@pop3.demon.co.uk> Thanks very much, At 11:04 17/10/97 +0000, akirkpatrick@ims-global.com wrote: [... useful stuff clipped ...] >If you are using XML without a DTD, things are exactly the >same except that you need to explicitly set the attribute on >the relevant elements (as I understand it). It should be trivial >to write a normaliser which would generate XML from an SGML >instance (SGMLNORM would probably do it). > >I think one of the major problems with the Web today is the >plethora of badly formed HTML pages which have been allowed >to grow and florish by browsers which don't check for validity >in any way at all. There is a danger that lack of DTDs in XML >documents will lead to even greater "tag soup". What I am proposing is a smallish number of tags (perhaps 10-20) but without fixed rules for their content models. I intend to define the tags carefully, but not necessarily their combination. So perhaps not 'soup' but 'jelly'. I also expect to interoperate with other people's tags and it looks like DC: and RDF: will have similar approaches - i.e. the tags themselves are understood, but their content model is jelly. P. 
> Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at ursus.demon.co.uk Fri Oct 17 17:41:03 1997 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:58:40 2004 Subject: Weak DTDs In-Reply-To: <3.0.32.19971017081822.009f1af8@swbell.net> Message-ID: <3.0.1.16.19971017153616.2a978288@pop3.demon.co.uk> Many thanks Eliot, At 08:18 17/10/97 -0500, W. Eliot Kimber wrote: [... another XML jewel ...] This is (as always) extremely useful. I will have to think carefully about whether there is a generic mechanism that we can use effectively. The role of the authoring tool is important - at present I am hoping that most CML files will be formed from medium size chunks, probably converted on-the-fly from legacy documents. P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From dak at sq.com Fri Oct 17 19:25:14 1997 From: dak at sq.com (dak@sq.com) Date: Mon Jun 7 16:58:40 2004 Subject: Prototype OO was Re: Inheritance/defaulting of attributes In-Reply-To: Your message of "Wed, 15 Oct 1997 14:29:23 EDT." Message-ID: [Lee mentioned (Hi, Lee!), in passing comment about the Model-View-Controller paradigm]: | SoftQuad's Sculptor SGML product used (uses, if they still sell it) an MVC | paradigm, together with an object-oriented dialect of Scheme. Today one | would probably use Java rather than Scheme because more programmers would | like it, although Scheme goes well with DSSSL. But once you'd got | over the rather steep learning curve, Sculptor is/was pretty powerful. We're still selling it, I'm happy to say. One minor point of clarification (not to distract from this very interesting conversation about OO models!): [...] | SoftQuad Sculptor doesn't give much access to the DTD. Hmm. I think I'd put that another way: Sculptor's access to the DTD is abstracted for editing, at the level of: 1) You can get access to the definitions of atomic items in the DTD (elements, attributes, general entities, parameter entities, notation entities), and 2) You can get access to some of the relations between these (e.g. for elements: element-content-type, attribute-definition-list) but 3) Access to the content model is: 3.1) by type, if you ask the document or DTD: ANY, EMPTY, CDATA, RCDATA, MIXED, ELEMENT, or CONREF, and 3.2) by the various DTD constraints on how the document manipulation primitives act, which can also be pre-tested, by (e.g.) can-insert?. ...which is when it's needed in the editing environment. 
Sometimes, it would be nice to have more, but you can generally synthesize what you need, which is a large part of the advantage of having a general- purpose, dynamic programming language in the scripting environment. [End of digression. Back to the regularly scheduled abstract discussion about OO in the markup environment.] Best, Dak xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Fri Oct 17 23:03:00 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:40 2004 Subject: Weak DTDs References: <3.0.32.19971017081822.009f1af8@swbell.net> Message-ID: <3447D309.97AB1401@technologist.com> W. Eliot Kimber wrote: > Peter has run head-on into one of the fundamental problems with DTDs as > currently defined by SGML (and XML): we want them to describe *classes* of > documents when they actually describe *individual* documents (and are > incapable of defining classes of documents except in very weak ways). I don't see how you can say that. Clearly the TEI DTD does not define an *individual document*. It defines a class of documents with certain constraints. Peter merely wants those constraints loosened a little. We can do that in part by providing the #ANY keyword he asks for (or directing him to the workaround), by pointing him to SGML's "&" operator that can loosen ordering constraints (sorry, not in XML!) and by providing subclassing mechanisms that allow people to build upon the CML DTD and re-add constraints incrementally. 
> Draconian rules are fine when your use scenario requires draconian > policies, such as when creating military documents or documents that drive > well-defined and specific processes. However, not all uses of SGML require > draconian policies (i.e., the TEI). XML, in particular, is expressly > designed for situations that *probably don't* require draconian policies > (as evidenced by the potential lack of any DTD declarations). In other > words, there is a continuum of possible constraint policies, from no > variation allowed to anything is allowed. Unfortunately, SGML only really > supports the 'no variation' end of that spectrum and XML only really > supports the 'anything is allowed' or the 'no variation' ends, with no > obvious support for the middle ground, where you want some constraints but > not necessarily full constraint. XML allows you to specify certain element types and leave others unspecified. Many of us have argued that we should explicitly define the result of piece-wise validation. It only makes sense that if declarations are provided there should be an option to validate against them. This will give you and Peter the middle ground you need. > Thus the frustration that Peter describes is unavoidable with DTDs alone: > he has clearly defined a general document type, the CML, that needs to > allow a range of specialization options. However, if the CML is defined as > a set of declarations to be used directly in documents as their DTD > declarations, it cannot do that, as the declarations define the *complete > set* of constraints on those documents. The CML must either impose > arbitrary constraints that are necessarily appropriate for all CML > documents or it must be so loose as to define no constraints beyond type > names. > > In short: DTDs don't define document classes. That just isn't true. DTDs have *always* defined document classes.
SGML Handbook page 124 "Document type: A class of documents having similar characteristics; for example journal, article, technical manual, or memo" Yes, the facilities for defining those classes are a) a little too strict and b) not well designed for incremental extension. But we can attack both of those problems *directly* without introducing another "level" of processing in the way that architectural forms do. In any type system, from Aristotle's classification of animals to Simula's simulation of real-world type systems, the mechanism for making more flexible classification rules is subclassing and we can add this directly to SGML with straight-forward semantics. > This is why something like architectures is required for the productive and > large-scale use of SGML and XML: you must have a way to define true > document classes with clear, machine-processible and validatable > specialization constraints that don't, at the same time, impose unnecessary > constraints on individual documents. True, something is needed. But I do not see why it must be a new level of processing, an "architecture" when the DTD needs only to be made more flexible. > An architecture is defined by the *combination* of a set of DTD > declarations and accompanying documentation that together define the rules > for a class of documents (the documentation is vitally important because > there will always be rules and constraints that cannot be expressed through > syntax, regardless of what syntax you are using to formally express > constraints). As part of these rules, the range of allowed variation among > documents that conform to the class can be defined, both formally in the > syntax and completely in the documentation. The DTD declarations form a > "meta-DTD", that is a DTD that defines the syntactic rules for the class, > not for instances. Instances will have their individual DTDs (explicit or > implicit) that define their individual syntax rules.
But this is the definition of "Document Type Definition". See page 126. You've just paraphrased it. A DTD defines the allowed occurrence of elements and attributes for a class of instances. A "meta-DTD" (I prefer the term "architectural DTD") defines the allowed occurrence of architectural elements and architectural attributes for a class of instances. It's basically the same thing, except one uses the straightforward SGML syntax and the other uses the architectural syntax. Both work on classes of documents. They are not inherently "more flexible" than DTDs at all. One way that they are flexible, I'll admit, is that they allow piece-wise validation which SGML has not had in the past. But there is no reason that we cannot add this directly to SGML and XML. I wouldn't be surprised if the SGML WebTC adds this already, but I'm not sure. > Architectures can themselves be derived from other architectures, allowing > you to form a hierarchy of document classes. By the same token, any > architecture can be used as the base for a more specialized architecture. > In addition, a single document or architecture can be derived from many > different architectures (for example, the CML might be derived in part from > some RDF architecture in order to standardize the way the CML structures > metadata). The phrase "derived from" can be very misleading in this context. Basically, you include a few processing instructions and notation declarations. You do not inherit any declarations. There are no constraints placed on the DTD. This DTD: could be "derived from" RDF or CML with a couple of extra notation statements. But instances conforming to this DTD are not constrained at the SGML level to be valid RDF or CML instances at all. As you can see, this particular DTD is actually much more flexible than RDF.
> ...Thus, the CML is an SGML architecture because the CML DTD can be used as > an architectural meta-DTD (with the possible addition of a few small > changes to better express its specialization constraints). To use this > architecture with documents, you need to define a mapping between the > elements, attributes, and data of the document with the elements and > attributes in the architectural meta-DTD. The AFDR mechanism does this > with attributes and provides a natural automatic mapping mechanism so that > documents that are very similar to their meta-DTDs need provide mappings > only for those things that differ from the meta-DTDs (that is, those things > that are specialized beyond what the architectures define). Right. In the AFDR, the mapping from elements to architectural element types is done through attributes. In DTDs, the mapping is done with GIs and attribute names. I don't think it follows from that that architectures are inherently more flexible than DTDs. They seem to have almost the same flexibility modulo piece-wise validation. > The idea of a "wildcard" for content models is expressed in the AFDR by the > notion of "bridging" element forms, "bridging" in the sense that they > bridge between the architecture and non-architectural stuff. In the > meta-DTD, a bridging form simply says "anything can go here". Thus, rather > than saying the following in the document's declarations: > > > > You would say this in the meta-DTD: > > - - (#PCDATA | ANY)* > > This is essentially the same as what Rick suggested, except that we're > doing it in the meta-DTD, rather than the document's DTD (the document may > not have a DTD). Right. So we haven't really bought any document-level flexibility (which is what I interpreted Peter's request as). We've just a) moved it up a level and b) Provided the option for specializing the element in a "derived" DTD. The former is a bad thing, in that it adds up to more work. 
We could provide the latter just as well by making element type subclassing a first-class feature of DTDs. > Note that if you have an existing SGML document with an explicit DTD, you > can make that SGML document into a DTD-less XML document simply by using > the existing DTD as an architectural meta-DTD. This removes the necessity > of parsing the declarations with the document any time you want to parse it > without removing the connection between the document and its syntactic and > semantic constraints (thus allowing validation on demand). This is > particularly useful when the DTD you use is large (e.g., Docbook, full TEI, > etc.). XML already removes the necessity of parsing declarations without removing the connection between documents and their syntactic and semantic constraints. The reason we have an RMD is to allow this. So once again we haven't bought anything by making our DTD into an architecture. > But wouldn't it be cool if XML editors *were* architecture aware such that > you could say "I want to create documents that conform to architecture X" > and the editor would determine and enforce the specialization rules, > letting you define new element types (or modify existing ones) and either > warn you when you were doing something outside the architecture or prevent > you from doing something outside the architecture (depending on what your > local specialization policies are)? I think so. In fact, I think this is > the only way you can have a useful XML editor at all. I agree with your direction, but feel that AFDR architectures are poorly suited to this in the long run. By definition, they express constraints on *elements* and not *element types*. That means that you can define an element that behaves according to the architecture 100 times, but on the 101st time you will get a cryptic error message about architectural non-compliance. 
Worse, that error message could be for a base architecture of a base architecture of a base architecture of the architecture you are familiar with. I don't think that that is what we want. Every *element type* in the "derived" DTD should subclass from a particular *element type* in the base DTD. And the *DTD* should be constrained such that it in turn constrains documents to conformance with the meta-class DTD. Before you make a single instance you should know that there is nothing you can do in the instance that could invalidate any of your base classes. I agree that we need a) more flexible DTDs (#ANY etc.) and b) a mechanism for extending and constraining these flexible DTDs. I do NOT agree that we need a concept of "architectures" to do so. Extending TEI (for example) should be as simple as: %TEI; And the result should a) be a single document type, not a document type and an architecture and b) be guaranteed to constrain documents to TEI conformance. In other words, we need element type subclassing, but we don't have to bring the whole HyTime architecture mechanism to do so. Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From msample at opentext.com Fri Oct 17 23:31:24 1997 From: msample at opentext.com (Mike Sample) Date: Mon Jun 7 16:58:40 2004 Subject: Binary Data in XML? Message-ID: <3447D8DB.2BB05CF7@opentext.com> Is there a standard way to include binary data directly within an XML/RDF document? Or must one use indirection or ad-hoc encoding such as: or A103EF92B7 Thanks, Mike Sample xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sat Oct 18 00:02:57 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:40 2004 Subject: Binary Data in XML? References: <3447D8DB.2BB05CF7@opentext.com> Message-ID: <3447E11E.70E36849@technologist.com> Mike Sample wrote: > > Is there a standard way to include binary data directly within an > XML/RDF document? No. >Or must one use indirection or ad-hoc encoding Yes. I think that it is probably a good thing that you can always know that an XML document is completely textual. Paul Prescod From tallen at sonic.net Sat Oct 18 00:05:28 1997 From: tallen at sonic.net (Terry Allen) Date: Mon Jun 7 16:58:40 2004 Subject: Re Binary data in XML Message-ID: <199710172205.PAA13187@bolt.sonic.net> Mike Sample writes: | Is there a standard way to include binary data directly within an | XML/RDF document? Or must one use indirection or ad-hoc encoding such | as: | | | | or | A103EF92B7 Not a standard, but you might consider ftp://ds.internic.net/internet-drafts/draft-masinter-url-data-03.txt which expires tomorrow. I don't know whether the author expects to update it. It defines a data URL like data:[<mediatype>][;base64],<data> in which the payload is encoded in base64 and the mediatype is a MIME type. It's SGML/XML-safe.
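The data URL idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the general shape data:<mediatype>;base64,<payload>, and the element name "image", the attribute name "src" and both helper function names are hypothetical, not taken from the draft or from any DTD discussed here.

```python
import base64
import xml.etree.ElementTree as ET


def to_data_url(payload: bytes, mediatype: str = "application/octet-stream") -> str:
    """Encode raw bytes as a base64 data URL; the result is plain text."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mediatype};base64,{encoded}"


def from_data_url(url: str) -> bytes:
    """Recover the raw bytes from a base64 data URL built by to_data_url."""
    header, _, encoded = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    return base64.b64decode(encoded)


# The five sample bytes from the question, written there as hex digits.
raw = bytes.fromhex("A103EF92B7")

# Hypothetical markup: an <image> element carrying the payload in a src attribute.
doc = ET.Element("doc")
ET.SubElement(doc, "image", src=to_data_url(raw))
xml_text = ET.tostring(doc, encoding="unicode")

# Round trip: parse the document again and decode the attribute value.
recovered = from_data_url(ET.fromstring(xml_text).find("image").get("src"))
```

Because the base64 alphabet contains no markup characters, the document stays purely textual, which is the property Paul Prescod argues for above.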
Regards, Terry Allen Electronic Publishing Consultant tallen[at]sonic.net http://www.sonic.net/~tallen/ Davenport and DocBook: http://www.ora.com/davenport/index.html at CNgroup: terry.allen[at]cngroup.com From tbray at textuality.com Sat Oct 18 02:25:16 1997 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:58:40 2004 Subject: Binary Data in XML? Message-ID: <3.0.32.19971017172205.00905880@pop.intergate.bc.ca> At 02:30 PM 17/10/97 -0700, Mike Sample wrote: >Is there a standard way to include binary data directly within an >XML/RDF document? Or must one use indirection or ad-hoc encoding such No and yes. But you're not the first person to ask for this. -T. From peter at ursus.demon.co.uk Sat Oct 18 10:26:04 1997 From: peter at ursus.demon.co.uk (Peter Murray-Rust) Date: Mon Jun 7 16:58:40 2004 Subject: Weak DTDs In-Reply-To: <3447D309.97AB1401@technologist.com> References: <3.0.32.19971017081822.009f1af8@swbell.net> Message-ID: <3.0.1.16.19971018090850.3fbfa37a@pop3.demon.co.uk> At 17:05 17/10/97 -0400, Paul Prescod wrote: > > >%TEI; > > This is the sort of construct that I started with (but using HTML2.0).
CML was designed to allow other frequently-used DTDs to be incorporated into a single conventional DTD that would validate any 'CML' document. The namespace syntax had not been addressed then, and gave me some headaches, but even when that is neglected I found difficulties. In essence they could be summarised by: - wishing to insert CML elements within HTML sections (since HTML has very weak support for typed data). - wishing to insert HTML sections within CML elements (e.g. the descriptive hypertext for, say, a molecule). In the end the complex rules I devised became so unworkable, even for me (the author), that I abandoned them. I therefore gave up formal DTD validation. Yesterday evening I converted a typical chemical manuscript into CML including RDF and DC metadata, images, spectra, molecules, bibliography, XML-LINKs to several related XML and non-XML documents, and so on. I found the freedom of NOT having a 'conventional' DTD was very liberating. I believe that (with the latest JUMBO) it displays quite attractively and meaningfully to human readers. So what is the formal value of the document to *non-human* readers? I can see at least the following: - TEI 'searches' of the document (especially with STRING) are very powerful. [BTW, the fact that TEI defines substrings in PCDATA but not in attribute values means that I now favour using subelements rather than attributes. To that extent I think the XML-specs tilt the balance.] I should like to 'extend' the TEI approach to search for more complex fragments (early drafts suggested a FOREIGN keyword, which means that any algorithm can be tacked on). I'd like to keep in step with others here - is there any consensus on a formalised search language for XML documents? - many 'readers' will not need to access all the data in the document, and can reasonably extract small fragments, e.g. DESCENDANT(ALL,PERSON)CHILD(1,VAR,BUILTIN,EMAIL) will locate all the people who have e-mail addresses.
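The fragment extraction described in the last item can be approximated with an ordinary tree search. The sketch below is in Python and rests on an assumption about the markup that the message does not spell out: that each PERSON element holds VAR children and that the e-mail address is the VAR whose BUILTIN attribute is EMAIL. The sample document, its contents and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical CML-like fragment; only the PERSON/VAR/BUILTIN names come from
# the search expression quoted above, everything else is made up.
SAMPLE = """
<CML>
  <PERSON><VAR BUILTIN="NAME">A. Chemist</VAR><VAR BUILTIN="EMAIL">a@example.org</VAR></PERSON>
  <PERSON><VAR BUILTIN="NAME">B. Author</VAR></PERSON>
</CML>
"""


def email_addresses(root):
    """For every PERSON descendant, take the first VAR child with BUILTIN="EMAIL"."""
    found = []
    for person in root.iter("PERSON"):               # DESCENDANT(ALL,PERSON)
        var = person.find("VAR[@BUILTIN='EMAIL']")   # CHILD(1,VAR,BUILTIN,EMAIL)
        if var is not None:
            found.append(var.text)
    return found


emails = email_addresses(ET.fromstring(SAMPLE))
```

No DTD is involved: the search depends only on element and attribute names, so it works on a document that is merely well-formed, which is the situation described in this message.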
- XML-STYLE looks like being extremely valuable for many document transformations. [In the early days of JUMBO I wrote a lot of horrible code to process and display specific elements, and I now realise this should be done in XML-STYLE. Is anyone else hacking a Java version of XML-STYLE or do I have to do it myself?]. The most common operations on a generic CML document look like being: - display this attractively to a human - search document(s) for particular chunks of information and I then see a role for more specific DTDs for those people who need their documents to conform to specific formats (e.g. regulatory submissions, safety sheets, pharmacopeias, etc.) Hopefully they will pick features out of CML so that the semantics of elements is consistent throughout the community. I am an idealist :-) BTW I am particularly interested in actual implementations of things discussed on this list, or people who are interested in developing them collaboratively. Although XML has come a long way, we are nowhere near having enough examples of tools to convince the rest of the world :-) P. Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic net connection VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary http://www.venus.co.uk/vhg From digitome at iol.ie Sat Oct 18 11:17:06 1997 From: digitome at iol.ie (Sean Mc Grath) Date: Mon Jun 7 16:58:40 2004 Subject: Weak DTDs Message-ID: <199710180916.KAA13695@GPO.iol.ie> >W.
Eliot Kimber wrote: >> Peter has run head-on into one of the fundamental problems with DTDs as >> currently defined by SGML (and XML): we want them to describe *classes* of >> documents when they actually describe *individual* documents (and are >> incapable of defining classes of documents except in very weak ways). > SGML establishes a one-to-one relationship between markup and DTD. The instance - directly or indirectly - *names* its DTD. However, a given instance can be gainfully parsed w.r.t. many DTDs to achieve different effects - weakening constraints is one of them. To do this involves specifying at parse time - rather than at authoring time - what DTD to use for the parse. HyTime allows parsing w.r.t. a meta-DTD via HyTime aware parsers. However, I think there are many occasions when there is nothing "meta" involved. Just a desire to parse w.r.t. an alternative schema. Not a meta-schema - just a different schema. Sean Mc Grath sean@digitome.com Digitome Electronic Publishing http://www.digitome.com From javellan at ccls.edu Sat Oct 18 14:19:55 1997 From: javellan at ccls.edu (Juan Andres Avellan) Date: Mon Jun 7 16:58:40 2004 Subject: Electronic Product Catalogue Providers Message-ID: <2.2.32.19971018121757.006d0b8c@pipsqueak.ccls.edu> Hello, I'm doing legal research at PhD level concerning intermediaries in electronic commerce and at this moment am working on electronic product catalogues (EPC). Are there any EPCs out there that are working under closed user groups and/or networks that I could contact?
Thanks in advance, Juan _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ Juan Andres Avellan _/ Queen Mary and Westfield College _/ PhD Researcher _/ University of London _/ Centre for Commercial Law Studies _/ Email: tl6345@qmw.ac.uk _/ _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sat Oct 18 15:40:16 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:40 2004 Subject: Weak DTDs References: <199710180916.KAA13695@GPO.iol.ie> Message-ID: <3448BCC0.6D7CB02B@technologist.com> Sean Mc Grath wrote: > > HyTime allows parsing w.r.t. a meta-DTD via HyTime aware parsers. However, > I think there are many occasions when there is nothing "meta" involved. Just > a desire to parse w.r.t to an alternative schema. Not a meta-schema - just > a different schema. > That's absolutely true. And ARCHFORMS absolutely allow this. That is why calling them "meta-DTDs" is very misleading. The archform feature allows you to specify a mapping from elements to element types using attributes, rather than GIs. This is very similar to CONREF except that you can use minimization features like #FIXED, default values, #CURRENT, LINK and also some archform specific minimization features to make the mapping without putting an attribute on every element the way you must with GIs. In fact, sometimes you can make the mapping without putting an attribute on any elements at all. Then you can seem to set up a mapping between element types in one DTD and element types in another DTD. 
Thus you have the illusion of a DTD to DTD relationship rather than an instance to DTD relationship. That's when archforms seem like a "meta-DTD". But in the general case where you ignore minimization, archforms are not really "meta" anything. They are alternate DTDs validated according to attribute values, not GIs. Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From peter at techno.com Sat Oct 18 15:51:03 1997 From: peter at techno.com (Peter Newcomb) Date: Mon Jun 7 16:58:40 2004 Subject: Weak DTDs In-Reply-To: <199710180916.KAA13695@GPO.iol.ie> (digitome@iol.ie) Message-ID: <199710181343.JAA21462@exocomp.techno.com> W. Eliot Kimber wrote: > Peter has run head-on into one of the fundamental problems with DTDs as > currently defined by SGML (and XML): we want them to describe *classes* of > documents when they actually describe *individual* documents (and are > incapable of defining classes of documents except in very weak ways). The argument that Eliot is making is that as SGML (and XML) are defined today, given an external declaration subset (the entity identified by the external identifier of doctype declarations), there is no (easy) way to guarantee that documents that reference it actually conform to it, unless those documents' doctype declarations do not include an internal subset. 
This is because entities, notations, elements, and attributes declared in a document's internal subset can radically alter the document's type: general entities may be redefined, completely unknown notations, element types, and attributes may be added, and parameter entities can be redefined such that notations, element types, and attributes declared in the external subset have completely different definitions. All of these modifications can be made completely without constraint. The only defenses DTD designers have against this all require the DTD to be even more rigid, as any opportunity for flexibility also opens up an opportunity for abuse. Moreover, even these defenses may not be enough. Disallowing the internal subset is not the answer, because it is still needed in order to describe document-level (as opposed to document type-level) characteristics, at least things like document-specific general entities, and configuration control parameter entities (that configure the DTD in predefined ways, through the use of marked sections). Architectures, IMO, are a step in the right direction, since they are immune to the kinds of haphazard modifications that make it difficult to recognize and process a class of documents, while still allowing the document-level flexibility needed by document authors. [Sean Mc Grath on Sat, 18 Oct 1997 09:48:39 +0100] > > HyTime allows parsing w.r.t. a meta-DTD via HyTime aware parsers. However, > I think there are many occasions when there is nothing "meta" involved. Just > a desire to parse w.r.t to an alternative schema. Not a meta-schema - just > a different schema. > It is true that there is nothing "meta" about meta-DTDs. They should be called architectural DTDs instead, where "architectural" means "used via the SGML architecture mechanism defined in Annex A.3 of ISO/IEC 10744:1997", or "designed to be used architecturally", as in the case of the HyTime architecture's DTD. 
Architectural DTDs are just DTDs being used in a different way. And yes, architectural processing _is_ tantamount to parsing with respect to an alternative schema, only the architectural schema is better protected from the individual needs of documents, and individual documents are better protected from the generalized needs of the architectural schema. -peter -- Peter Newcomb TechnoTeacher, Inc. peter@petes-house.rochester.ny.us peter@techno.com http://www.petes-house.rochester.ny.us http://www.techno.com xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Sat Oct 18 15:51:53 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:40 2004 Subject: Weak DTDs References: <3.0.32.19971017081822.009f1af8@swbell.net> <3.0.1.16.19971018090850.3fbfa37a@pop3.demon.co.uk> Message-ID: <3448BF6F.280E38A8@technologist.com> Peter Murray-Rust wrote: > Yesterday evening I converted a typical chemical manuscript into CML > including RDF and DC metadata, images, spectra, molecules, bibliography, > XML-LINKs to several related XML and non-XML documents, and so on. I found > the freedom of NOT having a 'conventional' DTD was very liberating. I > believe that (with the latest JUMBO) it displays quite attractively and > meaningfully to human readers. Being without DTDs *is* very liberating for individuals. So is being without written laws. But I don't want to live in a community without laws. In your case, how can you guarantee that the RDF and DC metadata conforms to those systems? How do you make sure that your images specify their required attributes? 
How do you make sure that your molecules nest in the right places? Are you really willing to make software (even just simple stylesheets) that properly format RDF data in the middle of a molecule, a molecule in the middle of an image, a bibliography entry in the middle of your spectra? Or are you going to fill your code (and stylesheets) with hundreds of ASSERT statements that raise error messages when these elements are found out of their expected context? Admittedly, depending on your exact needs, SGML DTDs may or may not be able to express them, (and XML DTDs are less likely to be able to express them), but in most cases they are the most efficient way of expressing these constraints. But their goal is to protect your code from bogus documents. You can dump them, but now you must do all of the checking yourself. Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Sun Oct 19 03:52:06 1997 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 16:58:41 2004 Subject: Scripting and XML Message-ID: XML at this point seems to be exceptionally well written for a model in which the data is passive and gets processed by an outside application - the parser/application combination. It doesn't seem like it will work very well, however, with a model that is rapidly growing more popular in the HTML world: scripts included in the same document as the data. While this blending of data and processor is admittedly a little unusual, it is becoming standard practice more and more often. 
The W3C's Document Object Model proposals explicitly include XML, leading me to plot out the development of programs that take advantage of this powerful new tool. (Or, at least it will be a powerful new tool when they figure out what it should look like and someone implements it.) The rather gigantic problem I'm having is that scripting languages, including ECMAScript/JavaScript, use all kinds of markup characters. In their context, a < character just means "less than". I suppose I can use inside the SCRIPT tags and hope that the vendors implement this properly, but it would be a heck of a lot easier to be able to declare . I never thought I'd complain about the SGML goodies that got dropped to make XML intelligible to ordinary humans and parsers, but it's happened. This seems like an easy thing to fix, and something that would bring XML more in line with other W3C projects. Any thoughts? Simon St.Laurent Dynamic HTML: A Primer From tbray at textuality.com Sun Oct 19 05:35:13 1997 From: tbray at textuality.com (Tim Bray) Date: Mon Jun 7 16:58:41 2004 Subject: Scripting and XML Message-ID: <3.0.32.19971018203128.008d59f0@pop.intergate.bc.ca> At 12:37 AM 19/10/97 UT, Simon St.Laurent wrote: >but it would be a heck of a lot easier to be able to >declare Well, it would if the SGML CDATA element idea actually worked. But it doesn't, because the contents of a CDATA element are terminated by any <, or is it - anybody who can't implement this properly is a bozo. -T. xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rasmus at lerdorf.on.ca Sun Oct 19 05:51:14 1997 From: rasmus at lerdorf.on.ca (Rasmus Lerdorf) Date: Mon Jun 7 16:58:41 2004 Subject: Scripting and XML In-Reply-To: Message-ID: > XML at this point seems to be exceptionally well written for a model in which > the data is passive and gets processed by an outside application - the > parser/application combination. It doesn't seem like it will work very well, > however, with a model that is rapidly growing more popular in the HTML world: > scripts included in the same document as the data. While this blending of > data and processor is admittedly a little unusual, it is becoming standard > practice more and more often. The W3C's Document Object Model proposals > explicitly include XML, leading me to plot out the development of programs > that take advantage of this powerful new tool. (Or, at least it will be a > powerful new tools when they figure out what it should look like and someone > implements it.) > > The rather gigantic problem I'm having is that scripting languages, including > ECMAScript/JavaScript, use all kinds of markup characters. In their context, > a < character just means "less than". I suppose I can use > inside the SCRIPT tags and hope that the vendors > implement this properly, but it would be a heck of a lot easier to be able to > declare . I never thought I'd complain about the SGML > goodies that got dropped to make XML intelligible to ordinary humans and > parsers, but it's happened. This seems like an easy thing to fix, and > something that would bring XML more in line with other W3C projects. > > Any thoughts? 
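
The CDATA-section workaround mentioned in the quoted message can be tried out with a small sketch. It assumes Python's standard xml.dom.minidom (an expat-based parser the thread does not mention); the page, the script text and the helper name is_well_formed are all made up for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(text):
    """Return True if expat (via minidom) accepts the text as well-formed XML."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


# Hypothetical page: the script code is wrapped in a CDATA section.
wrapped = """<page><SCRIPT><![CDATA[
  if (a < b && b > c) { alert("ok"); }
]]></SCRIPT></page>"""

# The same code with no wrapper: the bare < and && are read as markup.
bare = """<page><SCRIPT>
  if (a < b && b > c) { alert("ok"); }
</SCRIPT></page>"""

# Recover the script text from the CDATA section, untouched by the parser.
script = minidom.parseString(wrapped).getElementsByTagName("SCRIPT")[0]
code = "".join(node.data for node in script.childNodes).strip()

print(is_well_formed(wrapped))
print(is_well_formed(bare))
print(code)
```

The wrapped version parses and hands the script text back unchanged, while the bare version is rejected. The sketch says nothing about how any particular browser vendor handles the same input.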

As an author of just such a language, this has been a concern of mine ever since I first read the XML proposal. I posted to this list last week about this as well.

The language I wrote is called PHP/FI (see http://php.iquest.net/). It is currently undergoing a complete rewrite, and making sure that I don't clash with XML is a priority.

My solution, naive as it might be, is to hide my language inside a PI tag. The XML syntax definition for this tag is:

    PI ::= '<?' Name S (Char* - (Char* '?>' Char*)) '?>'

This, to me, says that I don't have to worry about a single '>' nor a '<' inside the tag. It is only a '?>' that could cause me some problems. So, a typical bit of code would look like:

    <?php
      $result=mysql("db","select passwd from users where id='$cookie'");
      if(mysql_result($result,"passwd")==crypt($input)) {
        echo "Welcome $id<BR>\n";
      }
    ?>

Now, my language is a server-parsed language, much like Microsoft's ASP and Netscape's LiveWire or server-side JavaScript. That means that the actual browsers out there will never see these tags. However, living at peace with XML is still important because of XML authoring tools. I would like people to be able to create XML with an XML authoring tool that includes my PHP script tags.

This is obviously a hack. Just thought it might be informative for you to hear the sort of nasty mutilations that you are going to encounter when XML gets into the hands of ordinary developers who know next to nothing about SGML/XML.

If the XML spec could address this issue of embedding scripting languages directly, and provide some guidelines for the developers of these scripting languages, then it would certainly make life easier on everyone. Like it or not, these scripting languages are here, and they are not going to go away. If it is clearly laid out how such a scripting language should co-exist with XML, the amount of future incompatibilities and confusion might be reduced.

By the way, a quick look at Microsoft's ASP will reveal that they use <% ... %> tags. How that is going to survive an XML parser, I have no idea.

-Rasmus

From tbray at textuality.com Sun Oct 19 06:09:23 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
Message-ID: <3.0.32.19971018210457.008c8560@pop.intergate.bc.ca>

At 11:50 PM 18/10/97 -0400, Rasmus Lerdorf wrote:
>My solution, naive as it might be, is to hide my language inside a
>PI tag. The XML syntax definition for this tag is:
...
>So, a typical bit of code would look like:
> $result=mysql("db","select passwd from users where id='$cookie'");
...
> ?>

The question is, do you regard your script as part of the document or not? If so, you should use

Message-ID:

> At 11:50 PM 18/10/97 -0400, Rasmus Lerdorf wrote:
> >My solution, naive as it might be, is to hide my language inside a
> >PI tag. The XML syntax definition for this tag is:
> ...
> >So, a typical bit of code would look like:
> >
> > $result=mysql("db","select passwd from users where id='$cookie'");
> ...
> > ?>
>
> The question is, do you regard your script as part of the document
> or not? If so, you should use appropriate and being used exactly as designed. -Tim

I suppose that depends on what you mean by "being part of the document". I do not regard the script as being part of the information the document is trying to convey. The script can, however, generate output that will be part of the final document. But an XML parser that understood PHP processing instructions would be needed to grok that.

My only real goal is to make sure that an XML parser will not spew thousands of syntax errors at me when I run an XML file full of PHP script tags through it.
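
That goal can be checked with a minimal sketch, assuming Python's standard xml.dom.minidom as a stand-in for "an XML parser". The page and the PI target name php are illustrative assumptions, not taken from the thread.

```python
from xml.dom import minidom, Node

# Hypothetical page: script code hidden in a processing instruction.
# The bare <, > and && inside the PI are not treated as markup; only
# the two-character sequence ?> would end the instruction.
page = """<page>
  <?php if ($a < $b && $b > $c) { echo "<b>hello</b>"; } ?>
  <p>Static markup.</p>
</page>"""

doc = minidom.parseString(page)  # raises ExpatError if not well-formed

# Collect the processing instructions that are children of the root element.
pis = [n for n in doc.documentElement.childNodes
       if n.nodeType == Node.PROCESSING_INSTRUCTION_NODE]

for pi in pis:
    print(pi.target, "->", pi.data.strip())
```

The parser reports the instruction as one node, with a target and an opaque data string, and otherwise leaves it alone. What an authoring tool does with such a node is a separate question.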
And, if it can be parsed without errors, then I expect that an XML authoring system would happily let people add these tags in some sort of escaped or raw, insert-your-own-tag sort of mode.

-Rasmus

From ricko at allette.com.au Sun Oct 19 09:27:21 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:41 2004
Subject: Weak DTDs
Message-ID: <199710190734.RAA06224@jawa.chilli.net.au>

----------
> From: Paul Prescod
> Peter Murray-Rust wrote(?):
> > Yesterday evening I converted a typical chemical manuscript into CML
> > including RDF and DC metadata, images, spectra, molecules, bibliography,
> > XML-LINKs to several related XML and non-XML documents, and so on. I found
> > the freedom of NOT having a 'conventional' DTD was very liberating. I
> > believe that (with the latest JUMBO) it displays quite attractively and
> > meaningfully to human readers.
>
> Being without DTDs *is* very liberating for individuals. So is being
> without written laws. But I don't want to live in a community without
> laws.

Actually, I think it is different from that. I think Peter is, to some extent, just using ad hoc combinations of DTD fragments. The important laws are written. By the time you have decided on XML, XLL, CML, and RDF, with HTML element types for general text, the DTD has almost written itself. That it is not explicit is of no great interest, since the fragments are explicitly defined and available. The absence of an explicit DTD for them suggests that there are no additional constraints to be imposed on them, and that the fragments can be combined as freely as possible.

I think the future of XML DTD production will largely just be in this recombining of existing DTD fragments ("element type sets" or "microdocument DTDs" or "little languages" or even "architectures") in various ad hoc ways. A cookbook or pattern approach.

-ricko

Rick Jelliffe
The SGML Cookbook: Document Patterns for SGML and XML

From ricko at allette.com.au Sun Oct 19 09:28:18 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
Message-ID: <199710190735.RAA06256@jawa.chilli.net.au>

> From: Rasmus Lerdorf
> My solution, naive as it might be, is to hide my language inside a
> PI tag.
...
> This is obviously a hack.

No, it is neither. XML has explicit delimiters to clearly mark up processing instructions. You are using them for what they are intended.

A different take on the issue is this. The SGML/XML model is elements/entities (e.g. resources)/processing instructions. XML piggybacks everything on top of element structure. This is distinct from, e.g., PDF, which piggybacks element structure (such as it has) on top of processing.

-ricko

From digitome at iol.ie Sun Oct 19 11:46:40 1997
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
Message-ID: <199710190946.KAA14512@GPO.iol.ie>

>[Simon St.Laurent]
>>but it would be a heck of a lot easier to be able to
>>declare
>
[Tim Bray]
>Well, it would if the SGML CDATA element idea actually worked. But it
>doesn't, because the contents of a CDATA element are terminated by
>any <, or is it in practical terms, use CDATA elements for anything useful. If you
>could, they'd be in XML.
>
>For now, stick with <![CDATA[ ]]> - anybody who can't implement
>this properly is a bozo. -T.
>

Steady on! Marked sections have their own magic terminator "]]>". There is no foolproof mechanism involving a magic terminator string: comments, PIs, CDATA marked sections and CDATA declared content all suffer to varying degrees.

The Unix HEREIS document concept addresses similar problems. It allows the terminating string to be specified on a case-by-case basis. If SGML allowed such a specification at the start of a CDATA marked section, then it would be possible to avoid the premature termination issues without resorting to entity-based evisceration of the magic terminating string.

Sean Mc Grath
sean@digitome.com
Digitome Electronic Publishing
http://www.digitome.com

From ak117 at freenet.carleton.ca Sun Oct 19 13:11:15 1997
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
In-Reply-To:
References:
Message-ID: <199710191110.HAA00239@unready.microstar.com>

Simon St.Laurent writes:
> XML at this point seems to be exceptionally well written for a
> model in which the data is passive and gets processed by an outside
> application - the parser/application combination. It doesn't seem
> like it will work very well, however, with a model that is rapidly
> growing more popular in the HTML world: scripts included in the
> same document as the data. While this blending of data and
> processor is admittedly a little unusual, it is becoming standard
> practice more and more often.

[Remainder omitted]

One reason that I've never tried JavaScript (other than the security holes) is that -- as far as I can tell -- there's no way to put the code in a separate file from the HTML page. There are enormous advantages to maintaining processing code separately from markup:

1) you can change processing strategies (or even the code language) without having to rewrite your documents;
2) the code is easier to maintain;
3) you can reuse the same code for dozens (or even thousands) of different documents; and
4) you can use your documents for more than one purpose.

With CSS, for example, it makes much more sense to write a single, separate stylesheet for your whole web site than it does to embed a separate stylesheet in each document -- in programming terms, it's the equivalent of using a subroutine instead of writing identical code dozens of times.

While it is entirely legitimate to use PIs to embed code, it is nice to keep the code as abstract as possible; for example, instead of I'd prefer something like this in the document: and then something like this in a separate source-code file:

    declare query dpquery {
      select * from people where name='david'
    }

That way, I can find bugs more quickly, and I can change query strategies (say, by switching from SQL to an OODB query language) without modifying my documents.

Normally, I recommend including code within the document itself only when the code is logically part of the document -- in literate programming, or when the source code is included as an example.

All the best,

David

--
David Megginson ak117@freenet.carleton.ca
Microstar Software Ltd. dmeggins@microstar.com
http://home.sprynet.com/sprynet/dmeggins/

From SimonStL at classic.msn.com Sun Oct 19 15:18:41 1997
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
Message-ID:

This responds to a lot of the excellent points brought up by various authors in the last twelve hours.

>The question is, do you regard your script as part of the document
>or not? If so, you should use appropriate and being used exactly as designed. -Tim

Coming from the HTML world, I do regard the script as an integral part of the document. Many web developers are now creating documents that require scripts to be meaningful. Unfortunately, is not a sequence that should appear in too many ECMAScript documents.
PIs feel like calls to external processing for me - while it is possible to describe the scripting engine as an external processor, this doesn't recognize the growing importance of documents that carry internal code. The word 'document' has received some heavy redefinition on the web in the last year, with more changes coming every day. 'Live' content is becoming more and more common - inside the HTML document as well as in the process generating it. At present, most SCRIPT tags in use on the Internet already have to hide their code inside comments to avoid spilling their contents across the screens of older browsers. This just seems to add an extra layer of detritus to scripts. While it's workable, it doesn't seem like an elegant way to interoperate with another key set of web standards. As scripting and markup grow more and more intertwined (i.e., the document object model becomes a reality), I suspect this is going to be at least an eyesore if not a roadblock. Scripting is creeping out beyond the browser, so a wide variety of documents are in for a fun time. >Well, it would if the SGML CDATA element idea actually worked. But it >doesn't, because the contents of a CDATA element are terminated by >any <, or is it in practical terms, use CDATA elements for anything useful. If you >could, they'd be in XML. (Tim) What were they thinking? Too bad, because I do enjoy running my XML documents through SGML parsers. It seems like XML's requirement for end tags should have gotten us out of that mire, making a CDATA element run until it hits the actual end tag, without any premature termination. Legacy standards... >XML piggybacks everything on top of element structure. (Rick) So did HTML, creating this problem. The HTML 3.2 DTD declares the SCRIPT element as PCDATA, which I guess means the browser developers aren't playing by real (SGML) parsing rules. >By the way, a quick look at Microsoft's ASP will reveal that they use ><% ... %> tags. 
How that is going to survive an XML parser, I have no
>idea. (Rasmus)

Can't say I ever liked <%...%>. Netscape's SERVER element always seemed like a better idea, though it now has the same problems as SCRIPT.

>One reason that I've never tried JavaScript (other than the security
>holes) is that -- as far as I can tell -- there's no way to put the
>code in a separate file from the HTML page. (David)

This is actually quite easy, and I do it all the time. . Making this valid XML isn't very difficult, fortunately. In practice, however, this is normally used to bring in library files, which are then applied with code more specific to the contents of the page. An abstract DOM that makes it easier to address page content (and pages marked up more meaningfully) may make this a stronger solution.

Back to the trenches.

Simon St.Laurent
Dynamic HTML: A Primer

From Ingo.Macherius at TU-Clausthal.de Sun Oct 19 17:29:51 1997
From: Ingo.Macherius at TU-Clausthal.de (Ingo Macherius)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
In-Reply-To:
References:
Message-ID: <199710191529.RAA24302@sinfonix.rz.tu-clausthal.de>

> Date: Sat, 18 Oct 1997 23:50:08 -0400 ()
> From: Rasmus Lerdorf
> Subject: Re: Scripting and XML

> So, a typical bit of code would look like:
>
> $result=mysql("db","select passwd from users where id='$cookie'");
> if(mysql_result($result,"passwd")==crypt($input)) {
> echo "Welcome $id<BR>\n";
> }
> ?>

What makes me nervous about the above code is that it is a kind of self-modifying code. The script replaces itself with its own output, which may insert tags ("<BR>"). So the document's structure may change after processing the PI, and it may no longer satisfy the DTD after that.

> I would
> like people to be able to create XML with an XML authoring tool that
> includes my PHP script tags.

That is OK for the authoring tools, but says nothing about the validity of XML documents that contained PHP/FI PIs after delivery.

And what about XLinks? Pointing at a document containing e.g. a table of data created by querying a DB with PHP/FI is impossible. You would only be able to point at a created element if your server is aware that it was the output of PIs. This collides with the "just in time" translation of PHP/FI PIs, as the DB may have changed in between.

So PHP/FI could parse correctly before, but probably not after delivery. A possible solution is to batch-process your documents and validate after that, but before you put them online.

++im
--
Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
mailto:Ingo.Macherius@tu-clausthal.de http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)

From rasmus at lerdorf.on.ca Sun Oct 19 18:38:06 1997
From: rasmus at lerdorf.on.ca (Rasmus Lerdorf)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
In-Reply-To: <199710191110.HAA00239@unready.microstar.com>
Message-ID:

> Normally, I recommend including code within the document itself only
> when the code is logically part of the document -- in literate
> programming, or when the source code is included as an example.

Well, in the PHP scripting language you can keep it separate if you want.
A tag such as: does the trick.

-Rasmus

From rasmus at lerdorf.on.ca Sun Oct 19 18:46:07 1997
From: rasmus at lerdorf.on.ca (Rasmus Lerdorf)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
In-Reply-To: <199710191529.RAA24302@sinfonix.rz.tu-clausthal.de>
Message-ID:

> So PHP/FI could parse correctly before, but probably not
> after delivery. A possible solution is to batch-process your
> documents and validate after that, but before you put them online.

I recognize that there is really no way to ensure that the final output will be valid XML. But, batch-processing is just not going to happen. The whole point of a server-parsed html-embedded language is that it is dynamic. In the little example I posted, the output depends on the identity of the person viewing the page. There is no way to batch-process every possible variation of a dynamic page's output.

-Rasmus

From ricko at allette.com.au Sun Oct 19 19:40:34 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
Message-ID: <199710191747.DAA20315@jawa.chilli.net.au>

> From: Simon St.Laurent
> >Well, it would if the SGML CDATA element idea actually worked.
But it
> >doesn't, because the contents of a CDATA element are terminated by
> >any <, or is it >in practical terms, use CDATA elements for anything useful. If you
> >could, they'd be in XML. (Tim)
>
> What were they thinking?

The CDATA idea does work for what it is intended to be used for: text with no subelements, entity references or other markup. In default SGML, a CDATA element's data is terminated by "

>XML piggybacks everything on top of element structure. (Rick)
>
> So did HTML, creating this problem.

SGML piggybacks PIs on top of element structure because

1) It is not a programming language but a markup language.
2) Previous systems that did embed data inside programming constructs failed to scale or to be readily adaptable to different uses and media.

Rick Jelliffe

From Ingo.Macherius at TU-Clausthal.de Sun Oct 19 19:49:13 1997
From: Ingo.Macherius at TU-Clausthal.de (Ingo Macherius)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
In-Reply-To:
References: <199710191529.RAA24302@sinfonix.rz.tu-clausthal.de>
Message-ID: <199710191748.TAA26954@sinfonix.rz.tu-clausthal.de>

> Date: Sun, 19 Oct 1997 12:45:04 -0400 ()
> From: Rasmus Lerdorf
> Subject: Re: Scripting and XML

> > So PHP/FI could parse correctly before, but probably not
> > after delivery.

> I recognize that there is really no way to ensure that the final output
> will be valid XML.

Yes there is: make PHP/FI a conforming XML application. By now it is a Turing-complete, server-side language that just allows one to omit a "print" or "echo" statement before HTML markup if outside a PI.
If PHP/FI itself would parse documents rather than replacing string A with string B, there would be full control over output. I understand that doing so would break backward compatibility.

> The whole point of a server-parsed html-embedded language is that it is
> dynamic.

The Roxen Server, which basically allows server-side processing similar to PHP/FI, uses specialized tags. This is dynamic, but one could give a "PHP/FI DTD". If it's modular, it can be mixed with any XML DTD. Whatever becomes of the namespaces/architectural forms discussion, it would work with any DTD.

What can be done with PHP/FI that can't be done with XML is interacting with the httpd and external databases. Inserting text or doing simple computations will become the domain of client-side processing languages. And don't forget there are entities with XML, so mere string replacement is no longer a challenge. And because entities could be served by URLs that are CGI (or whatever), they can become quite dynamic.

PHP/FI used to solve many problems that existed due to shortcomings in HTML. Now with XML, many of those are gone. But XML uses different ways than PHP/FI did. IMHO there are two possible solutions: redesign PHP/FI to become XML compliant (which is not backward compatible), or just see it as a special Apache scripting language (where I like mod_perl more).

++im
--
Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
mailto:Ingo.Macherius@tu-clausthal.de http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)

From schampeo at hesketh.com Sun Oct 19 20:28:57 1997
From: schampeo at hesketh.com (Steven Champeon)
Date: Mon Jun 7 16:58:41 2004
Subject: Scripting and XML
In-Reply-To: <199710191110.HAA00239@unready.microstar.com>
References:
Message-ID: <3.0.3.32.19971019142829.00745084@mail.imvi.com>

At 07:10 AM 10/19/97 -0400, David Megginson graced us with:
> One reason that I've never tried JavaScript (other than the security
> holes) is that -- as far as I can tell -- there's no way to put the
> code in a separate file from the HTML page. There are enormous
> advantages to maintaining processing code separately from markup:

I know that SGML purists cringe at the sight of this, but Server Side Includes are a useful mechanism for many such instances: title

What bothers me, though, is that this only covers the function definitions or onLoad stuff. To associate these functions with the markup means you have to embed event handlers in the appropriate elements, which sucks. So, major code reuse akin to headers in C/C++ is possible, if ugly. But minor recycling of code is impossible for now.

MSIE4 introduces a new form of script object handling, "scriptlets", which addresses this problem to some extent, but any useful script component would need to be extended to fit any given situation - the same problem which drives OOA&D/OOP - so it has yet to be seen how useful this scriptlet idea will be.

Steve

--
Steven Champeon                  | It is very dark. You are
http://www.hesketh.com/schampeo/ | likely to be eaten by a grue.
http://www.jaundicedeye.com      | - Zork

From ak117 at freenet.carleton.ca Sun Oct 19 21:23:08 1997
From: ak117 at freenet.carleton.ca (David Megginson)
Date: Mon Jun 7 16:58:42 2004
Subject: Scripting and XML
In-Reply-To:
References:
Message-ID: <199710191922.PAA00200@unready.microstar.com>

Simon St.Laurent writes:
> The word 'document' has received some heavy redefinition on the web
> in the last year, with more changes coming every day. 'Live'
> content is becoming more and more common - inside the HTML document
> as well as in the process generating it.

This is a very good point; it is important to note, however, that in XML, as in full SGML, a "document" does not necessarily consist of a single file. To my knowledge, no one has ever seriously tried to include GIF or JPEG images inside an HTML file; we all accept that they exist in separate files/entities, even though they form part of the same document in the user's browser. Why should non-XML text-based data, like scripts, not be treated the same way as non-XML binary data?

In compound documents, XML or full SGML provides two things:

1) a high-level schema to show how the compound document fits together (the entity structure); and
2) a method for presenting structural information (the element structure).

It certainly makes sense to include scripts in compound documents, but I cannot see the advantage of mixing them in with the XML markup itself -- they are a lot cleaner and easier to maintain when they are in a separate file.

By the way, thanks for the tip about using JavaScript in separate files -- I will take the time to give it a proper look.
All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rasmus at lerdorf.on.ca Sun Oct 19 21:43:54 1997 From: rasmus at lerdorf.on.ca (Rasmus Lerdorf) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML In-Reply-To: <199710191748.TAA26954@sinfonix.rz.tu-clausthal.de> Message-ID: > PHP/FI used to solve many problems that existed due to shortcomings > in HTML. Now with XML, many of those are gone. But XML uses > different ways than PHP/FI did. IMHO there are two possible > solutions: Redesign PHP/FI to become XML compliant (which is not > backward compatible), or just see it as a special Apache > scripting language (where I like mod_perl more). It is not Apache-specific like mod_perl though. PHP3 runs natively under Windows as well now, and has both NSAPI and ISAPI support. There will even be Unix-based ISAPI support to go along with the ISAPI-capable Zeus server. mod_perl is a very handy. I use it myself quite a bit. But there are a lot of shortcomings, not the least of which being the Perl language itself which can be extremely daunting to a non-programmer. PHP is evolving. PHP3 is not completely backward compatible with PHP2 and earlier. If I am going to make the jump to a full XML parser, then now is probably a good time to do it. However, given the target audience of PHP, forcing people to think in XML terms might be quite a stretch. -Rasmus xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From rasmus at lerdorf.on.ca Sun Oct 19 21:59:26 1997 From: rasmus at lerdorf.on.ca (Rasmus Lerdorf) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML In-Reply-To: <199710191922.PAA00200@unready.microstar.com> Message-ID: > It certainly makes sense to include scripts in compound documents, but > I cannot see the advantage of mixing them in with the XML markup > itself -- they are a lot cleaner and easier to maintain when they are > in a separate file. The advantage is when the script is extremely simple, having to put a little script snippet in another file is a hassle. ie. > This $value might be coming from an SQL engine, LDAP, a socket connection or just about anything imaginable, but all I need to do at this point in my HTML is display the $value variable. This might be a long HTML file with only that one single tag. In traditional web development you would write the entire thing in Perl and have perl print out lines and lines of straight HTML just so you could display that one dynamic variable in the right place. This means that when changes to the HTML needs to be made, you need to have your HTML editing person understand Perl as opposed to just have the person use an HTML authoring tool. There is obviously a delicate balance between the two methods. If the page is mostly scripting, then you would be better off writing it directly in the scripting language and have the language output the required HTML. But when it is mostly HTML with the odd bit of scripting needs, then it makes most sense to a lot of people to just toss the magical scripting tags directly into the HTML file. 
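[Editorial sketch, not part of the original message.] The inline approach described above, a mostly static page carrying one dynamic value, can be illustrated roughly as follows. This is only an analogy in Python using the standard library's string.Template, not PHP syntax; the page text and the lookup_value() helper are hypothetical stand-ins for whatever SQL, LDAP, or socket source supplies the value.

```python
from string import Template

# Hypothetical page: static HTML with a single $value placeholder.
# An HTML author can edit everything around the placeholder freely.
PAGE = Template("<html><body><p>Current balance: $value</p></body></html>")


def lookup_value() -> str:
    # Stand-in for an SQL query, LDAP lookup, or socket read (assumption).
    return "42.00"


def render() -> str:
    # Only the placeholder is replaced; the rest of the page is untouched.
    return PAGE.substitute(value=lookup_value())


print(render())
```

The point of the sketch is the division of labour: the markup stays in one editable piece, and the program touches only the one spot that is dynamic.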
-Rasmus

From papresco at technologist.com Mon Oct 20 00:18:30 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:42 2004
Subject: Scripting and XML
References:
Message-ID: <344A87AF.6E534A13@technologist.com>

Simon St.Laurent wrote:
>
> At present, most SCRIPT tags in use on the Internet already have to hide their
> code inside comments to avoid spilling their contents across the screens of
> older browsers. This just seems to add an extra layer of detritus to scripts.
> While it's workable, it doesn't seem like an elegant way to interoperate with
> another key set of web standards. As scripting and markup grow more and more
> intertwined (i.e., the document object model becomes a reality), I suspect
> this is going to be at least an eyesore if not a roadblock.

Actually, I think that the situation is the opposite. The document object model finally allows you to refer to document elements from *outside* your document so that you have *less need* to directly mix scripts and code. Using DOM I can create a client-side program that takes an XML instance as input and returns XML as output. I can't do that with JavaScript as it exists today. The JavaScript "model" is textual replacement (which must, by definition, be "inline"). The DOM model is structural processing (which can be done "remotely").

I expect that a few years from now this convention of putting markup and scripting cheek to cheek will have died out. It is just another face of the logical markup vs. presentational markup war. The trend is from inline to external, just as with presentation.
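[Editorial sketch, not part of the original message.] The contrast drawn above between textual replacement and structural processing might look like this in practice. The example uses Python's xml.etree.ElementTree as a stand-in for a DOM; the sample document, the "id" attribute convention, and both function names are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document (an assumption for illustration).
SOURCE = '<page><title id="t1">Draft</title><body>Hello</body></page>'


def by_text_replacement(markup: str, new_title: str) -> str:
    # "Inline" model: splice new characters into the serialized markup.
    # This works only because we know exactly what the old text looks like.
    return markup.replace(">Draft<", ">" + new_title + "<")


def by_structure(markup: str, element_id: str, new_text: str) -> str:
    # "Remote" model: parse, locate the element by its id, change it,
    # and serialize the tree again. XML in, XML out.
    root = ET.fromstring(markup)
    for element in root.iter():
        if element.get("id") == element_id:
            element.text = new_text
    return ET.tostring(root, encoding="unicode")


print(by_structure(SOURCE, "t1", "Final"))
```

The structural version never needs to know what the old title said; it only needs a way to address the element, which is the property the argument above relies on.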
Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simeons at allaire.com Mon Oct 20 00:58:05 1997 From: simeons at allaire.com (Simeon Simeonov) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML Message-ID: <01bcdce3$07fefe90$4a15b5cd@sim.allaire.com> Disclaimer: I'm new to SGML and XML... >So, major code reuse akin to headers in C/C++ is possible if ugly. But >minor recycing of code is impossible for now. MSIE4 introduces a new form >of script object handling, "scriptlets", which addresses this problem to >some extent, but any useful script component would need to be extended >to fit any given situation - the same problem which drives OOA&D/OOP - >so it has yet to be seen how useful this scriptlet idea will be. I think there is a method for code reuse that is (a) more flexible than includes, (b) as powerful as object-based reuse (scriptlets), and (c) a lot closer to the syntax of generalized markup. Essentially, it involves reuse through flexible tag structures. I'd like to start with an example from my domain of expertise--web applications. (Credit goes to Len Bullard because the example emerged from an offline discussion about XML Scripting...) Consider the following piece of code (ignore actual tag syntax, etc.): What do you do if you decide that this piece of code is useful and should be reused? Includes won't work well, because the result of processing depends on the value of argument. Scripts will force you to modify the code before reuse, e.g., Then you'll put the body of the SCRIPT tag in a procedure/function... That's somewhat unpleasant. Code reuse shouldn't mean code re-write! 
What is really needed is a mechanism which allows (a) automatic reuse of tags *and* scripts with few code changes, and (b) safe argument passing.

The language that I am working on (CFML - the Cold Fusion Markup Language from Allaire Corp.) uses a flexible tag-based mechanism for this. The example code needs to be modified only slightly to indicate which variables are inputs and then needs to be placed in a separate file.

File DoSomething.cfm:

It can then be invoked in one of three ways:

(Please, ignore the exact details of the naming scheme.)

I think that CFML's approach is (a) significantly better than includes, (b) as powerful as scriptlets, and (c) a lot more markup friendly. (In general, CFML is quite tag-friendly; the language uses tags for flow-control, expression evaluation, etc.) Also, code reuse involves less code modification than other approaches that I know of.

Don't get me wrong. This is not the exact syntax I am recommending for XML! I just wanted to point out one case where scripting and code reuse have been successfully addressed with markup. There are tens of thousands of CFML developers that strongly prefer this approach to a stew of markup and script. In a markup language shouldn't the unit of code reuse be a tag???

I hope I have not bored you with the rather long message... If you'd like to look at CFML and the Cold Fusion Application Server, check out http://www.allaire.com.

Regards,

Simeon Simeonov
Language Architect
Allaire Corp.
simeons@allaire.com

xml-dev: A list for W3C XML Developers.
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Mon Oct 20 01:00:13 1997 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML Message-ID: >I expect that a few years from now this convention of putting markup and >scripting cheek to cheek will have died out. It is just another face of >the logical markup vs. presentational markup war. The trend is from >inline to external, just as with presentation. I agree with you on presentation, but I'll argue quite heartily that scripting is becoming an integral part of content. >The document >object model finally allows you to refer to document elements from >*outside* your document so that you have *less need* to directly mix >scripts and code. Using DOM I can create a client-side program that >takes an XML instance as input and returns XML as output. I can't do >that with JavaScript as it exists today. The JavaScript "model" is >textual replacement (which must, by definition, be "inline"). The DOM >model is structural processing (which can be done "remotely"). Which _can_ be done remotely - but that isn't to say that remote control is always the best solution. OOP ticked off a lot of people when it first appeared for suggesting that data and code might work better as a unit than as separate parts, and I suspect JavaScript (though it's hardly OO) is going to take knocks for a similar offense. I'm not completely sure where you're coming from declaring the "JavaScript 'model' is textual replacement" - while textual replacement is one part of the JavaScript toolset, it's hardly the only piece. 
As much as I hate to use them as an example, Microsoft's scriptlets are a strong step in the opposite direction of what you would like to see. Scriptlets combine a small amount of code and some markup to create an interface component that can be added easily to a page. If anything, the trend (in my feeble opinion) is toward further mixing of code and markup, not less. Scriptlets can be quick hacks, or they can be elaborate interface components. As I said before, we'll see what people actually do with the stuff soon enough. Simon St.Laurent Dynamic HTML: A Primer xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From simeons at allaire.com Mon Oct 20 01:12:54 1997 From: simeons at allaire.com (Simeon Simeonov) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML Message-ID: <01bcdce5$187280b0$4a15b5cd@sim.allaire.com> >Actually, I think that the situation is the opposite. The document >object model finally allows you to refer to document elements from >*outside* your document so that you have *less need* to directly mix >scripts and code. Using DOM I can create a client-side program that >takes an XML instance as input and returns XML as output. I can't do >that with JavaScript as it exists today. The JavaScript "model" is >textual replacement (which must, by definition, be "inline"). The DOM >model is structural processing (which can be done "remotely"). > >I expect that a few years from now this convention of putting markup and >scripting cheek to cheek will have died out. It is just another face of >the logical markup vs. presentational markup war The trend is from >inline to external, just as with presentation. 
> This is a good direction. However, I see a potential inconvenience for scripts that directly modify the document. Embedded scripts implicitly identify the part of the document they operate on with their position. External scripts will have to explicitly specify the part they operate on. Regards, Simeon Simeonov xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From john at datachannel.com Mon Oct 20 01:28:16 1997 From: john at datachannel.com (John Tigue) Date: Mon Jun 7 16:58:42 2004 Subject: SLIDES: XML and WebComputing: XML Enabled Mechanisms for Distributed Computing on the Web Message-ID: <344A986C.2AE54196@datachannel.com> I want to thank Robin Tomlin of SGML Open for the opportunity to speak. http://xml.datachannel.com/public/presentation/DocumationEast contains the "slides" from a presentation I gave at Documation East. It is a single 25 K HTML file. The presentation is also available at http://www.capv.com/doc97epresos/DE97M.html. -- John Tigue Sr. Software Architect DataChannel http://www.datachannel.com jtigue@datachannel.com 206-462-1999 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: vcard.vcf Type: text/x-vcard Size: 263 bytes Desc: Card for John Tigue Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19971020/0b4f594b/vcard.vcf From ak117 at freenet.carleton.ca Mon Oct 20 03:10:30 1997 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML In-Reply-To: <01bcdce5$187280b0$4a15b5cd@sim.allaire.com> References: <01bcdce5$187280b0$4a15b5cd@sim.allaire.com> Message-ID: <199710200109.VAA00195@unready.microstar.com> Simeon Simeonov writes: [...] > This is a good direction. However, I see a potential inconvenience > for scripts that directly modify the document. Embedded scripts > implicitly identify the part of the document they operate on with > their position. External scripts will have to explicitly specify > the part they operate on. Not necessarily -- the linking can go either way. The document itself may contain markup selecting the code (as in my earlier example), or the code may select part of the document to work on (say, by using an ID or a treeloc). All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From SimonStL at classic.msn.com Mon Oct 20 06:01:19 1997 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML Message-ID: It seems like the complexities of SGML that XML stripped away are still haunting XML. >The CDATA idea does work for what it is intended to be used for. >Text with no subelements, entity references or other markup. 
>In default SGML, a CDATA element's data is terminated by "</"
>followed by any valid name start character (or the end of the
>entity). In XML, there is no name-start checking and every
>start-tag must have a corresponding end-tag.

Excellent. Now we know what the SGML developers were thinking - now we just need to figure out why this is relevant to XML. Why is it so difficult to create CDATA elements - which have to be marked clearly in XML by start and end tags?

There is no need in XML to stop CDATA at just any </ [...] > as &gt;, and & as &amp;. Should make for some very readable code.

Simon St.Laurent
Dynamic HTML: A Primer

From papresco at technologist.com Mon Oct 20 06:59:43 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:42 2004
Subject: Scripting and XML
References: <01bcdce5$187280b0$4a15b5cd@sim.allaire.com>
Message-ID: <344AE5BF.13B40DC5@technologist.com>

Simeon Simeonov wrote:
> This is a good direction. However, I see a potential inconvenience for
> scripts that directly modify the document. Embedded scripts implicitly
> identify the part of the document they operate on with their position.
> External scripts will have to explicitly specify the part they operate on.

More likely, I expect the opposite. The parts of the scripts that need to be "operated upon" will identify themselves just as they do for presentation. Let's say you have a form field that validates sin numbers (in Canada, we report sins to the government at least once a year).
You shouldn't have to say this: nor even this: But rather this: or even this: To me this seems blindingly analogous to the move from this: to this: to this: to this: . Paul Prescod xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From papresco at technologist.com Mon Oct 20 07:21:58 1997 From: papresco at technologist.com (Paul Prescod) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML References: Message-ID: <344AEADE.914C1CE1@technologist.com> Simon St.Laurent wrote: > > >I expect that a few years from now this convention of putting markup and > >scripting cheek to cheek will have died out. It is just another face of > >the logical markup vs. presentational markup war. The trend is from > >inline to external, just as with presentation. > > I agree with you on presentation, but I'll argue quite heartily that scripting > is becoming an integral part of content. I agree with you. Two years ago formatting was becoming an "integral part of content." Today we are extricating it. Hopefully it won't take two years to do the same with scripting, but it might. > Which _can_ be done remotely - but that isn't to say that remote control is > always the best solution. OOP ticked off a lot of people when it first > appeared for suggesting that data and code might work better as a unit than as > separate parts, and I suspect JavaScript (though it's hardly OO) is going to > take knocks for a similar offense. Luckily, the situations are not analogous. When is the last time you tried to full-text index a datastructure in your program? When is the last time you tried to edit one in a WYSIWYG editor? 
The reason you want to move processing (whether it is formatting, JavaScript, or whatever) out of documents is because the more processing you have in your document, the less useful it is to tools other than the one that the processing is intended for. I can write a document like this:

  document.write( "" )
  document.write( "My document" )
  document.write( "" )

But good luck indexing it properly! Have you ever done an AltaVista search and got a hit like this:

  16. JavaScript Games - MineSweeper
  0) && (y1>=0) && (x1. =0) && (leftValue>=0)) markSquare (bottomValue, leftValue) if ((topValue>=0) && (rightValue>=0)) openSquare(topValue, rightValue) }..
  http://www.geocities.com/Wellesley/2159/minesweeper.html - size 13K - 13-Aug-97 - English

This isn't just a bug. It's a natural consequence of putting processing information in a document. The multipurpose usefulness of a document is generally inversely proportional to the amount of processing hacks in it.

> I'm not completely sure where you're
> coming from declaring the "JavaScript 'model' is textual replacement" - while
> textual replacement is one part of the JavaScript toolset, it's hardly the
> only piece.

How can I, in JavaScript/Netscape 4.0, change the value of a text element in my document by ID or element type, rather than by "textual replacement" of JavaScript code by HTML code? The JavaScript model for building HTML documents is textual replacement, not structural assembly.

> As much as I hate to use them as an example, Microsoft's scriptlets are a
> strong step in the opposite direction of what you would like to see.
> Scriptlets combine a small amount of code and some markup to create an
> interface component that can be added easily to a page.

Scriptlets allow the componentization of scripting code so that you can put less of it in your actual document and more of it in external scripts. That seems to me to be a step in the right direction.
A scriptlet is a document in name only -- really it is a JavaScript applet. I have no problem with that. They aren't meant to be indexed, edited in WYSIWYG editors, formatted for print, added to website table of contents or any of the other things that we expect to do with real documents.

> As I said before, we'll see what people actually do with the stuff soon
> enough.

Sure. And we'll see again what they decide to do after they have had some more experience with the headaches of intermingling the two.

Paul Prescod

From richard at light.demon.co.uk Mon Oct 20 12:19:54 1997
From: richard at light.demon.co.uk (Richard Light)
Date: Mon Jun 7 16:58:42 2004
Subject: AFs and linking (was Weak DTDs)
In-Reply-To: <199710181343.JAA21462@exocomp.techno.com>
Message-ID:

In message <199710181343.JAA21462@exocomp.techno.com>, Peter Newcomb writes
>It is true that there is nothing "meta" about meta-DTDs. They should
>be called architectural DTDs instead, where "architectural" means
>"used via the SGML architecture mechanism defined in Annex A.3 of
>ISO/IEC 10744:1997", or "designed to be used architecturally", as in
>the case of the HyTime architecture's DTD. Architectural DTDs are
>just DTDs being used in a different way.
>
>And yes, architectural processing _is_ tantamount to parsing with
>respect to an alternative schema, only the architectural schema is
>better protected from the individual needs of documents, and
>individual documents are better protected from the generalized needs
>of the architectural schema.
Does this mean that architectural processing can be used to deal with an issue that was raised some time ago, echoes of which I can see in the current discussion? That is the ability to pull in 'foreign' documents (or parts of documents) using XLL, and to then treat them, temporarily, as though they were part of the current document. This would be using the architectural DTD like a relational View - "a virtual table that does not really exist ... but looks to the user as if it did". To take an example from my own area of interest (museum information), I want to store information about objects, people, places, events, etc. in separate documents to avoid redundancy. Instead of repeating a few (randomly-)selected details about a person each time they are mentioned in an object record, I will just provide an XLL link to their full biographical record: Made by between 1925 and 1927 ... However, this means that I then need a mechanism to traverse the link and pull in some information from the bio record - the person's name would be a good start! What I want is to be able to treat the PRODUCTION element (on this occasion) as though it contained a PERSON element populated with information from the bio: Made by Ernest Jones1896< /birth> between 1925 and 1927 ... On other occasions I might want more (or different) information from the linked biographical document. This could be achieved by employing a different architectural DTD. I'm sure (?) that this selection of elements from linked documents can be done with XSL, but having support for 'views' would: - separate out structural issues ('transformation') from presentation issues ('style'); - simplify the resultant XSL specs; - offer the client more flexibility over what they did with the resulting virtual document, since it would still be an XML document rather than an XSL output. Any thoughts, anyone? Richard Light. 
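[Editorial sketch, not part of the original message.] The "view" idea described above, traversing a link and treating selected parts of the target as though they were part of the current document, could be prototyped along these lines. This is a plain Python illustration using xml.etree.ElementTree, not XLL, XSL, or architectural forms; the href attribute, the BIOS lookup table, the lowercase element names, and the note field are all hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical store of biographical records, keyed by link target.
# The "note" element is invented here only to show selective inclusion.
BIOS = {
    "bio/jones.xml": (
        "<person><name>Ernest Jones</name><birth>1896</birth>"
        "<note>Full biography held elsewhere</note></person>"
    ),
}

# Hypothetical object record: the person is given only as a link.
OBJECT = (
    '<production>Made by <person href="bio/jones.xml"/>'
    " between 1925 and 1927</production>"
)


def expand_view(record_xml: str, bios: dict, wanted: set) -> str:
    """Return a virtual document with selected bio elements pulled in."""
    root = ET.fromstring(record_xml)
    # Snapshot the matches first, since the tree is modified in the loop.
    for person in list(root.iter("person")):
        href = person.get("href")
        if href is None or href not in bios:
            continue  # unresolved links are left as they are
        bio = ET.fromstring(bios[href])
        for child in bio:
            if child.tag in wanted:
                person.append(copy.deepcopy(child))
    return ET.tostring(root, encoding="unicode")


print(expand_view(OBJECT, BIOS, {"name", "birth"}))
```

Passing a different set of wanted element names plays the role of choosing a different view, and the result is still an XML document that later processing can treat like any other.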
Richard Light SGML/XML and Museum Information Consultancy richard@light.demon.co.uk xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From ak117 at freenet.carleton.ca Mon Oct 20 12:44:25 1997 From: ak117 at freenet.carleton.ca (David Megginson) Date: Mon Jun 7 16:58:42 2004 Subject: AFs and linking (was Weak DTDs) In-Reply-To: References: <199710181343.JAA21462@exocomp.techno.com> Message-ID: <199710201042.GAA00227@unready.microstar.com> Richard Light writes: > Does this mean that architectural processing can be used to deal > with an issue that was raised some time ago, echoes of which I can > see in the current discussion? That is the ability to pull in > 'foreign' documents (or parts of documents) using XLL, and to then > treat them, temporarily, as though they were part of the current > document. I suppose that AFs could be used for that, but it doesn't really seem necessary -- the second document can be parsed against its own DTD in XML, just as SUBDOCs are in full SGML, and then the processor can simply merge the two groves on output as specified by the XLL markup in the master document. > This would be using the architectural DTD like a relational View - > "a virtual table that does not really exist ... but looks to the > user as if it did". Exactly. As in SQL, it is a processing problem. All the best, David -- David Megginson ak117@freenet.carleton.ca Microstar Software Ltd. dmeggins@microstar.com http://home.sprynet.com/sprynet/dmeggins/ xml-dev: A list for W3C XML Developers. 
To post, mailto:xml-dev@ic.ac.uk Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe xml-dev To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message; subscribe xml-dev-digest List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk) From tms at ansa.co.uk Mon Oct 20 13:44:57 1997 From: tms at ansa.co.uk (Toby Speight) Date: Mon Jun 7 16:58:42 2004 Subject: Scripting and XML In-Reply-To: "Simon St.Laurent"'s message of Sun, 19 Oct 97 18:01:49 UT References: Message-ID: A non-text attachment was scrubbed... Name: not available Type: text/plain (pgp signed) Size: 2367 bytes Desc: not available Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19971020/df9740de/attachment.bin From SimonStL at classic.msn.com Mon Oct 20 14:51:14 1997 From: SimonStL at classic.msn.com (Simon St.Laurent) Date: Mon Jun 7 16:58:43 2004 Subject: Scripting and XML Message-ID: >Scriptlets allow the componentization of scripting code so that you can >put less of it in your actual document and more of it in external >scripts. That seems to me to be a step in the right direction. A >scriptlet is a document in name only -- really it is a JavaScript >applet. I have no problem with that. They aren't meant to be indexed, >edited in WYSIWYG editors, formatted for print, added to website table >of contents or any of the other things that we expect to do with real >documents. Scripts have been 'componentizable' for several years now through the SRC attribute of the SCRIPT element. Scriptlets are not about componentizing JavaScript - they are about componentizing rich combinations of markup and script. While you could write a scriptlet in 100% script, you'd probably be wasting your time. Scriptlets may well have to be formatted for print and added to TOCs in certain situations. They aren't 'real documents' (however you want to define that), but they are considerably more than scripts. 
----------------------------------

>How can I, in Javascript/Netscape 4.0 change the value of a text element
>in my document by ID or element type, rather than by "textual
>replacement" of JavaScript code by HTML code? The JavaScript model for
>building HTML documents is textual replacement, not structural assembly.

This is changing drastically. It's a piece of cake in JavaScript/Internet Explorer 4.0, at least for the ID.

The contents of this material are subject to change.

This is not a real application, obviously, and I haven't checked the code, but you get the idea of some of the things that are possible. Addressing elements by ID is a piece of cake - and much cleaner than document.write(). While the examples above are pretty useless hacks, in combination they can produce industrial-strength interfaces. (Microsoft of course did Asteroids, but it has considerable _practical_ use for creating client-side applications.) XML offers even more advantages to live pages than it does to static, providing a far more useful structure on which to build than HTML. I'm hoping that the W3C figures out the element type end of this and makes it more useful for XML. JavaScript may have begun as a hack, but it still has considerable strength.

---------------------------

>But good luck indexing it properly! Have you ever done an AltaVista
>search and got a hit like this:
>16. JavaScript Games - MineSweeper
>0) && (y1>=0) && (x1. =0) && (leftValue>=0)) markSquare (bottomValue,
>leftValue) if ((topValue>=0) && (rightValue>=0)) openSquare(topValue,
>rightValue) }..
> http://www.geocities.com/Wellesley/2159/minesweeper.html - size 13K -
>13-Aug-97 - English

XML should take care of this indexing issue, if the search engine developers simply tell their machines to ignore the content of SCRIPT tags. They could have done this with HTML, but chose to ignore markup instead. When they update to XML, they'll be paying more attention to markup.
Structured indexing, not full-text indexing, should be the indexing model for XML.

-----------------------------------------

>> As I said before, we'll see what people actually do with the stuff soon
>> enough.
>Sure. And we'll see again what they decide to do after they have had
>some more experience with the headaches of intermingling the two.

It sounds like one of us has some severe static in his crystal ball. We can sort out which one of us it is in a few years. In the meantime, CDATA types for elements (or the lack thereof) are a more pressing issue.

Simon St.Laurent
Dynamic HTML: A Primer

From ricko at allette.com.au Mon Oct 20 14:58:05 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:43 2004
Subject: Scripting and XML
Message-ID: <199710201305.XAA30974@jawa.chilli.net.au>

> From: Simon St.Laurent

> It seems like the complexities of SGML that XML stripped away are still
> haunting XML.

Of course. But I think we should be aware that SGML came at the end of perhaps a fifteen-year development project involving thousands of documents at IBM and other places. So XML is the result of 25 years of continuous development.

I think humility should make those of us with substantially less experience (time-wise and scale-wise) be careful not to label as irrelevant or excessive any SGML feature that we have not personally seen the need for! Which is not to say we cannot fruitfully bitch and clamour for what we need for our own tasks, of course:-) That we all could have done it ever so much better goes without saying.
If you are interested in improving SGML (which can flow through into XML) then contact ANSI or your local standards organisation and become part of the process. If you are interested in improving XML (which can flow through into SGML) then I guess join W3C or be vocal on this group! However, I think the XML 1.0 design is pretty much stabilized now.

> Excellent. Now we know what the SGML developers were thinking - now we just
> need to figure out why this is relevant to XML. Why is it so difficult to
> create CDATA elements - which have to be marked clearly in XML by start and
> end tags?

It may be helpful to clarify what "Language" in SGML and XML means: it is not "something with a grammar" but "something directly readable by humans and editable with plain text editors". In other words, there can be no "binary" markup in SGML documents. So any solution to embed binary indexes to ends of binary sections is not SGML, because it is not human readable on a simple text editor. (It is just a fancy data storage format. I think the HyTime sBento provides a high-level interface to data storage of this kind, so you can use these from within SGML/XML and still be ISO standard, by the way.)

> There is no need in XML to stop CDATA at just any the this would probably break compatibility with all my favorite SGML parsers, at
> least if I wrote scripts that used help end

From SimonStL at classic.msn.com Mon Oct 20 15:41:11 1997
From: SimonStL at classic.msn.com (Simon St.Laurent)
Date: Mon Jun 7 16:58:43 2004
Subject: Scripting and XML
Message-ID:

>All the contributors to this thread so far seem to have concentrated on
>the difficulties posed by recognising the _end_ of a CDATA content -
>while checking for a matching GI adds complexity, it is not impossible.
>But one thing that *is* impossible is for a non-validating parser to
>know that '<' followed by a name token is to be read as data, rather
>than as a start-tag. You need to read the DTD to know the content is
>CDATA. (Okay, you could use the rmd parameter of the XML declaration,
>but encouraging authors to do this would remove some of XML's advantages.
>One in particular being the ability to parse much of the document while
>fetching the external subset)

Thanks - that's an answer I can accept easily. I might enjoy forcing everyone to write valid XML documents, but I'll try to restrain myself.

Simon St.Laurent
Dynamic HTML: A Primer

From papresco at technologist.com Mon Oct 20 16:09:13 1997
From: papresco at technologist.com (Paul Prescod)
Date: Mon Jun 7 16:58:43 2004
Subject: Scripting and XML
References:
Message-ID: <344B6689.7475F89F@technologist.com>

Simon St.Laurent wrote:

> Scripts have been 'componentizable' for several years now through the SRC
> attribute of the SCRIPT element.

Not true. Componentization requires a separate namespace, setter and getter functions, display area management, function name conventions and so forth. Javascript has never had this before. Scriptlets are Javascript applets. They add the conventions of java.applet to scripts.

> Scriptlets are not about componentizing
> JavaScript - they are about componentizing rich combinations of markup and
> script. While you could write a scriptlet in 100% script, you'd probably be
> wasting your time.

Not true. Java applets are "100% script scriptlets" written in Java. Luckily we can approach this question empirically. Microsoft has four sample scriptlets on their website. The ratio of HTML to Javascript in each of them is extremely low. The HTML that is provided is typically boilerplate: They typically have a mostly vacuous header: " Label Scriptlet